From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 0/7] tweaking the delta base cache
Date: Mon, 22 Aug 2016 17:57:26 -0400
Message-ID: <20160822215725.qdikfcaz3smhulau@sigill.intra.peff.net>

After the experiments I did with --depth=50 recently, I noticed there
seemed to be a lot of room for improvement in the delta-base-cache (and
in particular, there seemed to be a lack of actual numbers).

So I tried a series of experiments, and these are the tweaks I came up
with. There are a lot of numbers and analysis in the commit messages
themselves. The most dramatic effect: before this series, bumping
core.deltaBaseCacheLimit for the kernel repository gets you basically
nothing, whereas with the series applied, I get:

  core.deltaBaseCacheLimit    time to run git log -Sfoo --raw
  ------------------------    -------------------------------
                      128m    4m56.486s
                      256m    4m33.769s
                      512m    4m12.968s
                     1024m    3m32.623s
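
For anyone wanting to run a similar comparison, here is a minimal sketch.
It assumes the limit is set per invocation with "git -c", and that
wall-clock time is what matters; the message above does not say how the
timings were taken, and log_command() and time_log() are hypothetical
helper names, not part of git.

```python
import subprocess
import time


def log_command(limit):
    """Build the pickaxe command from the table with a per-run cache limit."""
    return ["git", "-c", f"core.deltaBaseCacheLimit={limit}",
            "log", "-Sfoo", "--raw"]


def time_log(limit, repo="."):
    """Run the command in `repo` and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(log_command(limit), cwd=repo,
                   stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


# Example (needs a real repository, so it is left commented out):
# for limit in ("128m", "256m", "512m", "1024m"):
#     print(limit, round(time_log(limit, repo="/path/to/linux"), 1))
```

A single run per setting is noisy; repeating each measurement and
keeping the best time is probably a safer basis for comparison.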

Note that I don't actually propose bumping the memory limit in this
series. That's a bit more contentious, as it's really using more
resources to do a space/time tradeoff, and people may not want to spend
the RAM. This series, by contrast, just adjusts the data structures to
make more efficient use of the RAM we've already been allocated.

The interesting changes are really patches 5 and 6, which adjust the LRU
management and the underlying hash structure.
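
As a rough illustration of the shape those two patches aim for, here is
a toy model of a byte-budgeted LRU cache backed by a hash lookup. It is
only a sketch: the real code is C and uses list.h and hashmap.h, whereas
this uses Python's OrderedDict as a stand-in for both, and the class
name, method names, and (pack, offset) key are assumptions made for the
example.

```python
from collections import OrderedDict


class DeltaBaseCache:
    """Toy byte-budgeted LRU cache; not git's actual implementation."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes   # plays the role of core.deltaBaseCacheLimit
        self.used_bytes = 0
        self.entries = OrderedDict()     # least recently used first

    def get(self, pack, offset):
        key = (pack, offset)
        data = self.entries.get(key)
        if data is not None:
            self.entries.move_to_end(key)   # mark as most recently used
        return data

    def add(self, pack, offset, data):
        key = (pack, offset)
        old = self.entries.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self.entries[key] = data
        self.used_bytes += len(data)
        # Evict by recency only; every object type is treated alike.
        while self.used_bytes > self.limit_bytes and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = DeltaBaseCache(limit_bytes=10)
cache.add("pack-a", 12, b"aaaa")
cache.add("pack-a", 34, b"bbbb")
cache.get("pack-a", 12)             # touch the older entry
cache.add("pack-a", 56, b"cccc")    # 12 bytes exceeds 10, so offset 34 is evicted
```

Lookup, touch, and eviction are all constant time here, which is the
property a linked list plus a hash table gives you.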

There are a few ideas I thought of or saw in past threads but didn't
explore. I don't plan on digging further into them right now, so if
anybody else wants to do so, be my guest:

  - limiting the size of items entering the cache (e.g., to avoid a
    single giant blob blowing out all of the other entries)

  - something more clever than LRU, like weighting by a mix of size and
    recency

  - I didn't look at the criteria for adding entries to the cache at all

  - we seem to drop cache entries as we use them in unpack_entry(); I'm
    not sure if we would do better to retain them and let them leave via
    LRU expiration
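
To make the first of these ideas concrete, here is a minimal sketch of
an admission cap on top of a plain LRU. Nothing like this is in the
series; SizeCappedCache and max_entry_bytes are hypothetical names, and
choosing a sensible cap is exactly the part that would need
measurement.

```python
from collections import OrderedDict


class SizeCappedCache:
    """Toy LRU cache that refuses entries above a per-entry size cap."""

    def __init__(self, limit_bytes, max_entry_bytes):
        self.limit_bytes = limit_bytes
        self.max_entry_bytes = max_entry_bytes   # hypothetical admission cap
        self.used_bytes = 0
        self.entries = OrderedDict()             # least recently used first

    def add(self, key, data):
        # Refuse oversized items so one giant blob cannot push out
        # many small, frequently reused bases.
        if len(data) > self.max_entry_bytes:
            return False
        old = self.entries.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self.entries[key] = data
        self.used_bytes += len(data)
        while self.used_bytes > self.limit_bytes:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        return True


capped = SizeCappedCache(limit_bytes=100, max_entry_bytes=50)
added_small_1 = capped.add("small-1", b"x" * 10)
added_small_2 = capped.add("small-2", b"y" * 10)
added_giant = capped.add("giant", b"z" * 80)     # over the cap, so not cached
```

Without the cap, adding the 80-byte item would not fit alongside the
two small entries in a 100-byte budget, so both of them would be
evicted to make room.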

So there may be more work, but I think these improvements stand on their
own.

  [1/7]: cache_or_unpack_entry: drop keep_cache parameter
  [2/7]: clear_delta_base_cache_entry: use a more descriptive name
  [3/7]: release_delta_base_cache: reuse existing detach function
  [4/7]: delta_base_cache: use list.h for LRU
  [5/7]: delta_base_cache: drop special treatment of blobs
  [6/7]: delta_base_cache: use hashmap.h
  [7/7]: t/perf: add basic perf tests for delta base cache

-Peff

Thread overview: 17+ messages
2016-08-22 21:57 Jeff King [this message]
2016-08-22 21:57 ` [PATCH 1/7] cache_or_unpack_entry: drop keep_cache parameter Jeff King
2016-08-23 21:45   ` Junio C Hamano
2016-08-22 21:57 ` [PATCH 2/7] clear_delta_base_cache_entry: use a more descriptive name Jeff King
2016-08-23 21:47   ` Junio C Hamano
2016-08-22 21:57 ` [PATCH 3/7] release_delta_base_cache: reuse existing detach function Jeff King
2016-08-23 21:49   ` Junio C Hamano
2016-08-24 17:41     ` Jeff King
2016-08-24 17:59       ` Junio C Hamano
2016-08-22 21:59 ` [PATCH 4/7] delta_base_cache: use list.h for LRU Jeff King
2016-08-22 23:18   ` Eric Wong
2016-08-23  0:48     ` Jeff King
2016-08-22 21:59 ` [PATCH 5/7] delta_base_cache: drop special treatment of blobs Jeff King
2016-08-23 22:18   ` Junio C Hamano
2016-08-22 22:00 ` [PATCH 6/7] delta_base_cache: use hashmap.h Jeff King
2016-08-23 22:26   ` Junio C Hamano
2016-08-22 22:01 ` [PATCH 7/7] t/perf: add basic perf tests for delta base cache Jeff King
