git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Shawn Pearce <spearce@spearce.org>
To: Jeff King <peff@peff.net>
Cc: steve.norman@thomsonreuters.com,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	git <git@vger.kernel.org>
Subject: Re: [PATCH] index-pack: avoid excessive re-reading of pack directory
Date: Tue, 9 Jun 2015 20:46:24 -0700	[thread overview]
Message-ID: <CAJo=hJv6O66yFC_O_4aeL2JxBOtk2f+k1wGt3VW7dk71Q1ov3A@mail.gmail.com> (raw)
In-Reply-To: <20150605122921.GA7586@peff.net>

On Fri, Jun 5, 2015 at 5:29 AM, Jeff King <peff@peff.net> wrote:
>
> However, some code paths make a large number of
> has_sha1_file checks which are _not_ expected to return 1.
> The collision test in index-pack.c is such a case. On a
> local system, this can cause a performance slowdown of
> around 5%. But on a system with high-latency system calls
> (like NFS), it can be much worse.
>
> This patch introduces a "quick" flag to has_sha1_file which
> callers can use when they would prefer high performance at
> the cost of false negatives during repacks. There may be
> other code paths that can use this, but the index-pack one
> is the most obviously critical, so we'll start with
> switching that one.

Hilarious. We did this in JGit back in ... uhm before 2009. :)

But its Java. So of course we had to do optimizations.


> @@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
>                 return 1;
>         if (has_loose_object(sha1))
>                 return 1;
> +       if (flags & HAS_SHA1_QUICK)
> +               return 0;
>         reprepare_packed_git();
>         return find_pack_entry(sha1, &e);

Something else we do is readdir() over the loose objects and store
them in a map in memory. That way we avoid stat() calls during that
has_loose_object() path. This is apparently a win enough of the time
that we always do that when receiving a pack over the wire (client or
server).

  parent reply	other threads:[~2015-06-10  3:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-21 13:13 Troubleshoot clone issue to NFS steve.norman
2015-05-21 14:00 ` Christian Couder
2015-05-21 14:31 ` Duy Nguyen
2015-05-21 14:38   ` Duy Nguyen
2015-05-21 15:53     ` steve.norman
2015-05-22  0:16       ` Duy Nguyen
2015-05-22  7:12         ` Jeff King
2015-05-22  8:35           ` steve.norman
2015-05-22 10:05             ` Duy Nguyen
2015-05-22 14:37               ` Junio C Hamano
2015-05-22 15:02               ` steve.norman
2015-05-22 23:51                 ` [PATCH/RFC 0/3] using stat() to avoid re-scanning pack dir Jeff King
2015-05-22 23:51                   ` [PATCH 1/3] stat_validity: handle non-regular files Jeff King
2015-05-23 11:00                     ` Michael Haggerty
2015-05-24  8:29                       ` Jeff King
2015-05-22 23:52                   ` [PATCH 2/3] cache.h: move stat_validity definition up Jeff King
2015-05-22 23:54                   ` [PATCH 3/3] prepare_packed_git: use stat_validity to avoid re-reading packs Jeff King
2015-05-23  1:19                   ` [PATCH/RFC 0/3] using stat() to avoid re-scanning pack dir Duy Nguyen
2015-05-23  1:21                     ` Duy Nguyen
2015-05-24  8:20                     ` Jeff King
2015-05-24  9:00           ` Troubleshoot clone issue to NFS Duy Nguyen
2015-06-05 12:01             ` steve.norman
2015-06-05 12:18               ` Jeff King
2015-06-05 12:29                 ` [PATCH] index-pack: avoid excessive re-reading of pack directory Jeff King
2015-06-09 17:24                   ` Jeff King
2015-06-09 17:41                     ` Jeff King
2015-06-10  3:46                   ` Shawn Pearce [this message]
2015-06-10 14:00                     ` Jeff King
2015-06-10 14:36                       ` Duy Nguyen
2015-06-10 21:34                       ` Shawn Pearce
2015-06-05 14:20                 ` Troubleshoot clone issue to NFS steve.norman
2015-06-16 20:50                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJo=hJv6O66yFC_O_4aeL2JxBOtk2f+k1wGt3VW7dk71Q1ov3A@mail.gmail.com' \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=steve.norman@thomsonreuters.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).