From: "Ondřej Bílka" <neleai@seznam.cz>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, gitster@pobox.com, mhagger@alum.mit.edu,
David Turner <dturner@twitter.com>
Subject: Re: [PATCH v7 1/1] refs.c: SSE4.2 optimizations for check_refname_component
Date: Sat, 14 Jun 2014 17:22:09 +0200 [thread overview]
Message-ID: <20140614152209.GA14125@domone.podge> (raw)
In-Reply-To: <1402012575-16546-2-git-send-email-dturner@twitter.com>
On Thu, Jun 05, 2014 at 07:56:15PM -0400, David Turner wrote:
> Optimize check_refname_component using SSE4.2, where available.
>
> git rev-parse HEAD is a good test-case for this, since it does almost
> nothing except parse refs. For one particular repo with about 60k
> refs, almost all packed, the timings are:
>
> Look up table: 29 ms
> SSE4.2: 25 ms
>
> This is about a 15% improvement.
>
> The configure.ac changes include code from the GNU C Library written
> by Joseph S. Myers <joseph at codesourcery dot com>.
>
> Only supports GCC and Clang at present, because C interfaces to the
> cpuid instruction are not well-standardized.
>
Still a SSE4.2 is not that useful, in most cases SSE2 is faster. Here I
think that difference will not be that big when correctly implemented.
That will avoid a runtime checks.
For parallelisation you need to take extra step and paralelize whole
check than going component-by-component.
For detecting sequences a faster way is construct bitmasks with SSE2 so
you could combine these. It avoids needing special casing on 16-byte
boundaries.
Below is untested implementation where you could add a bad character
check with SSE4.2 which would speed it up. Are refs mostly
alphanumerical? If so we could speed this up by paralelized alnum check
and handling other characters in slower path.
#include <stdint.h>
#include <emmintrin.h>
char bad_table[256]; // TODO
int bad_characters(unsigned char *x)
{
while (*x)
if (bad_table[*x++])
return -1;
return 0;
}
int check_refname_skeleton(char *x)
{
if (bad_characters(x))
return -1;
__m128i slash = _mm_set1_epi8 ('/');
__m128i dot = _mm_set1_epi8 ('.');
__m128i char_l = _mm_set1_epi8 ('l');
__m128i at = _mm_set1_epi8 ('@');
__m128i brace = _mm_set1_epi8 ('{');
while (1) {
if (((uint64_t) x) & 4095 < 4096 - 17)
{
if (bytewise_check(x) != -2);
return bytewise_check(x);
x += 16;
}
__m128i v0 = _mm_loadu_si128 ((__m128i *)(x));
__m128i v1 = _mm_loadu_si128 ((__m128i *)(x + 1));
__m128i result;
// terminating 0
result = _mm_cmpeq_epi8(v0, _mm_set1_epi8('\000'));
// sequence ..
result = _mm_or_si128(result, _mm_and_si128 (_mm_cmpeq_epi8(v0, dot),
_mm_cmpeq_epi8(v1, dot)));
// sequence /.
result = _mm_or_si128(result, _mm_and_si128 (_mm_cmpeq_epi8(v0, slash),
_mm_cmpeq_epi8(v1, dot)));
// sequence //
result = _mm_or_si128(result, _mm_and_si128 (_mm_cmpeq_epi8(v0, slash),
_mm_cmpeq_epi8(v1, slash)));
// sequence .l
result = _mm_or_si128(result, _mm_and_si128 (_mm_cmpeq_epi8(v0, dot),
_mm_cmpeq_epi8(v1, char_l)));
// sequence @{
result = _mm_or_si128(result, _mm_and_si128 (_mm_cmpeq_epi8(v0, at),
_mm_cmpeq_epi8(v1, brace)));
uint64_t mask = _mm_movemask_epi8(result);
if (mask) {
char *p = x + __builtin_ctzl(mask);
if (!*p)
return 0;
else if (p[0] == '.' && p[1] == 'l')
if (bytewise_check(x) != -2)
return bytewise_check(x);
else
return -1;
}
x += 16;
}
}
next prev parent reply other threads:[~2014-06-14 15:22 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-05 23:56 [PATCH v7 0/1] refs.c: SSE4.2 optimizations for check_refname_component David Turner
2014-06-05 23:56 ` [PATCH v7 1/1] " David Turner
2014-06-14 15:22 ` Ondřej Bílka [this message]
2014-06-15 5:53 ` David Turner
2014-06-09 22:16 ` [PATCH v7 0/1] " Junio C Hamano
2014-06-09 22:39 ` David Turner
2014-06-09 23:05 ` Junio C Hamano
2014-06-10 6:04 ` Johannes Sixt
2014-06-10 6:55 ` Junio C Hamano
2014-06-13 1:18 ` David Turner
2014-06-13 4:11 ` Torsten Bögershausen
2014-06-14 10:24 ` Philip Oakley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140614152209.GA14125@domone.podge \
--to=neleai@seznam.cz \
--cc=dturner@twitter.com \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mhagger@alum.mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).