From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Thomas Rast <tr@thomasrast.ch>
Subject: XDL_FAST_HASH can be very slow
Date: Sun, 21 Dec 2014 23:19:45 -0500
Message-ID: <20141222041944.GA441@peff.net>
I ran across an interesting case that diffs very slowly with modern git.
And it's even public. You can clone:

  git://github.com/outpunk/evil-icons

and try:

  git show fc4efe426d5b4e6aa8d5a4dc14babeada7c5f899

(which is also the tip of master as of this writing).
The interesting file there is a 10MB Illustrator file, "assets/ei.ai".
Git treats it as text, as the early part doesn't have any NULs, but it
is mostly non-human-readable. It has a large number of lines, and some
of the lines themselves are quite large.
On my machine, "git show" takes ~77 seconds using v2.2.1. But if I build
the same version with "make XDL_FAST_HASH=", it completes in about 0.4s.
Both produce the same output.
I'm not really sure what's going on. A few points of interest:

  - You can replicate this with the very first commit that added
    XDL_FAST_HASH, 6942efc (xdiff: load full words in the inner loop of
    xdl_hash_record, 2012-04-06). So it was always bad on this case, and
    it's not part of any more recent changes.

  - We actually _don't_ spend most of our time in xdl_hash_record, the
    function modified by 6942efc. Instead, it all goes to
    xdl_classify_record, which is looping over the set of hash records.
    It's not clear to me if more or different hash records is part of the
    design of XDL_FAST_HASH, or if this is actually a bug.
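
As a rough illustration of why a loop over hash records can dominate
the runtime, here is a toy chained hash table. It is a hypothetical
sketch and not xdiff's code: the names classify, bad_hash, ok_hash and
NBUCKETS are made up for this example, and whether collisions are
actually what happens in the case above is not established here. It
only shows that when many distinct lines share one bucket, the number
of chain entries visited grows quadratically with the number of lines.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024	/* hypothetical table size for this sketch */

struct rec {
	struct rec *next;
	unsigned long hash;
	const char *line;
};

/* Deliberately poor hash: every line gets the same value. */
static unsigned long bad_hash(const char *s)
{
	(void)s;
	return 42;
}

/* Simple multiplicative string hash (djb2 style), used as a contrast. */
static unsigned long ok_hash(const char *s)
{
	unsigned long h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/*
 * Insert n lines into a chained hash table, merging duplicates.
 * Returns the number of chain entries visited while searching.
 */
static unsigned long classify(char **lines, size_t n,
			      unsigned long (*hash)(const char *))
{
	struct rec *buckets[NBUCKETS] = { 0 };
	struct rec *pool = calloc(n, sizeof(*pool));
	unsigned long visited = 0;
	size_t i;

	if (!pool)
		return 0;
	for (i = 0; i < n; i++) {
		unsigned long h = hash(lines[i]);
		struct rec **head = &buckets[h % NBUCKETS];
		struct rec *r;

		/* Walk the chain looking for an identical line. */
		for (r = *head; r; r = r->next) {
			visited++;
			if (r->hash == h && !strcmp(r->line, lines[i]))
				break;
		}
		if (!r) {
			/* Not seen before: push a new record onto the chain. */
			r = &pool[i];
			r->hash = h;
			r->line = lines[i];
			r->next = *head;
			*head = r;
		}
	}
	free(pool);
	return visited;
}
```

Calling classify(lines, 200, bad_hash) on 200 distinct lines visits
200 * 199 / 2 = 19900 chain entries, since the i-th insertion walks all
i earlier records. The same input with ok_hash visits far fewer,
because the lines are spread over many buckets.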
I haven't dug much further than that.
-Peff