From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Thomas Rast <tr@thomasrast.ch>
Subject: XDL_FAST_HASH can be very slow
Date: Sun, 21 Dec 2014 23:19:45 -0500
Message-ID: <20141222041944.GA441@peff.net>

I ran across an interesting case that diffs very slowly with modern git.
And it's even public. You can clone:

  git://github.com/outpunk/evil-icons

and try:

  git show fc4efe426d5b4e6aa8d5a4dc14babeada7c5f899

(which is also the tip of master as of this writing).

The interesting file there is a 10MB Illustrator file, "assets/ei.ai".
Git treats it as text, as the early part doesn't have any NULs, but it
is mostly non-human-readable. It has a large number of lines, and some
of the lines themselves are quite large.

On my machine, "git show" takes ~77 seconds using v2.2.1. But if I build
the same version with "make XDL_FAST_HASH=", it completes in about 0.4s.
Both produce the same output.

I'm not really sure what's going on.  A few points of interest:

 - You can replicate this with the very first commit that added
   XDL_FAST_HASH, 6942efc (xdiff: load full words in the inner loop of
   xdl_hash_record, 2012-04-06). So it was always bad on this case, and
   it's not part of any more recent changes.

 - We actually _don't_ spend most of our time in xdl_hash_record, the
   function modified by 6942efc. Instead, it all goes to
   xdl_classify_record, which is looping over the set of hash records.
   It's not clear to me whether producing more or different hash records
   is part of the design of XDL_FAST_HASH, or whether this is actually a bug.
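
As an illustration only (not a diagnosis of this case), here is a minimal C
sketch of one way a loop over hash records can come to dominate: if the hash
sends many distinct lines to the same bucket, every new line has to walk a
long chain before it can be classified. All names below (good_hash,
weak_hash, classify_steps, the 1024-bucket table) are hypothetical and are
not taken from xdiff; the real xdl_classify_record and xdl_hash_record are
not reproduced here, and weak_hash is deliberately bad rather than a model of
XDL_FAST_HASH.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is a toy model, not xdiff's code. */
typedef unsigned long (*hash_fn)(const char *s, size_t len);

/* A reasonable byte-at-a-time string hash (djb2-style). */
unsigned long good_hash(const char *s, size_t len)
{
	unsigned long h = 5381;
	for (size_t i = 0; i < len; i++)
		h = (h << 5) + h + (unsigned char)s[i];
	return h;
}

/* A deliberately weak hash: all lines of equal length collide. */
unsigned long weak_hash(const char *s, size_t len)
{
	(void)s;
	return (unsigned long)len;
}

struct node {
	unsigned long ha;
	const char *line;
	size_t len;
	struct node *next;
};

/*
 * Classify each line by walking its bucket's chain until an identical
 * line is found; if none is found, append a new entry to the chain.
 * Returns the total number of chain nodes visited (0 if the table
 * itself cannot be allocated).
 */
unsigned long classify_steps(hash_fn hash, const char **lines, size_t n)
{
	enum { NBUCKETS = 1024 };
	struct node **table = calloc(NBUCKETS, sizeof(*table));
	unsigned long steps = 0;

	if (!table)
		return 0;

	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(lines[i]);
		unsigned long ha = hash(lines[i], len);
		struct node **p = &table[ha % NBUCKETS];

		for (; *p; p = &(*p)->next) {
			steps++;
			if ((*p)->ha == ha && (*p)->len == len &&
			    !memcmp((*p)->line, lines[i], len))
				break; /* already classified */
		}
		if (!*p) {
			struct node *nn = malloc(sizeof(*nn));
			if (!nn)
				break; /* out of memory: stop early */
			nn->ha = ha;
			nn->line = lines[i];
			nn->len = len;
			nn->next = NULL;
			*p = nn;
		}
	}

	/* Release every chain, then the table. */
	for (size_t b = 0; b < NBUCKETS; b++) {
		struct node *c = table[b];
		while (c) {
			struct node *next = c->next;
			free(c);
			c = next;
		}
	}
	free(table);
	return steps;
}
```

With n distinct lines of the same length, weak_hash puts everything in one
chain and the step count grows as n*(n-1)/2, while good_hash keeps chains
short. Whether anything like this is what happens with XDL_FAST_HASH on
"assets/ei.ai" is an assumption that would need to be checked against the
actual bucket distribution.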

I haven't dug much further than that.

-Peff
