git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Jeff King <peff@peff.net>, Ben Peart <Ben.Peart@microsoft.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	"gitster@pobox.com" <gitster@pobox.com>
Subject: Re: [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc)
Date: Mon, 23 Jul 2018 11:48:18 -0400	[thread overview]
Message-ID: <a2ad0044-f317-69f7-f2bb-488111c626fb@gmail.com> (raw)
In-Reply-To: <20180718213420.GA17291@sigill.intra.peff.net>



On 7/18/2018 5:34 PM, Jeff King wrote:
> On Wed, Jul 18, 2018 at 08:45:14PM +0000, Ben Peart wrote:
> 
>> When working directories get big, checkout times start to suffer.  Even with
>> GVFS virtualization (which limits git to only having to update those files
>> that have been changed locally) we�re seeing P50 times for checkout of 31
>> seconds and the P80 time is 43 seconds.
> 
> Funny aside: all of your apostrophes look like the unicode question
> mark. Looking at raw bytes of your mail, they're actually u+fffd
> (unicode "replacement character"). Your headers correctly claim to be
> utf8. So presumably they got munged by whatever converted to unicode and
> didn't have the original character in its translation table. I wonder if
> this was send-email (so really perl's encode module), or if your smtp
> server tried to do an on-the-fly conversion (I know many servers will
> switch the content-transfer-encoding, but I haven't seen a charset
> conversion before).
> 

This was my bad.  I wrote the email in Word so I could get spell 
checking and it has this 'feature' where it converts all straight quotes 
to "smart quotes."  I just forgot to search/replace them back to 
straight quotes before sending the mail.

> Anyway, on to the actual discussion:
> 
>> Here is a checkout command with tracing turned on to demonstrate where the
>> time is spent.  Note, this is somewhat of a �best case� as I�m simply
>> checking out the current commit:
>>
>> benpeart@gvfs-perf MINGW64 /f/os/src (official/rs_es_debug_dev)
>> $ /usr/src/git/git.exe checkout
>> 12:31:50.419016 read-cache.c:2006       performance: 1.180966800 s: read cache .git/index
>> 12:31:51.184636 name-hash.c:605         performance: 0.664575200 s: initialize name hash
>> 12:31:51.200280 preload-index.c:111     performance: 0.019811600 s: preload index
>> 12:31:51.294012 read-cache.c:1543       performance: 0.094515600 s: refresh index
>> 12:32:29.731344 unpack-trees.c:1358     performance: 33.889840200 s: traverse_trees
>> 12:32:37.512555 read-cache.c:2541       performance: 1.564438300 s: write index, changed mask = 28
>> 12:32:44.918730 unpack-trees.c:1358     performance: 7.243155600 s: traverse_trees
>> 12:32:44.965611 diff-lib.c:527          performance: 7.374729200 s: diff-index
>> Waiting for GVFS to parse index and update placeholder files...Succeeded
>> 12:32:46.824986 trace.c:420             performance: 57.715656000 s: git command: 'C:\git-sdk-64\usr\src\git\git.exe' checkout
> 
> What's the current state of the index before this checkout? 

This was after running "git checkout" multiple times so there was really 
nothing for git to do.

> I don't
> recall offhand how aggressively we prune the tree walk based on the diff
> between the index and the tree we're loading. If we're starting from > scratch, then obviously we do have to walk the whole thing. But in most
> cases we should be able to avoid walking into sub-trees where the index
> has a matching cache_tree record.
> 
> If we're not doing that, it seems like that's going to be the big
> obvious win, because it reduces the number of trees we have to consider
> in the first place.
> 

I agree this could be a big win.  Especially in large trees, the 
percentage of the tree that changes between two commits is often quite 
small.  Saving 100% of that is a much bigger win than actually doing all 
that work even in parallel. Today, we aren't aggressive at all and do no 
pruning.

This brings up a concern I have with this approach altogether. In an 
earlier patch series, I tried to optimize the "git checkout -b" code 
path to not update every file in the working directory but only to 
create the new branch and switch to it.  The feedback to that patch was 
that people rely on the current behavior of rewriting every file so the 
patch was rejected.  This earlier attempt/failure to optimize checkout 
makes me worried that _any_ effort to prune the tree will be rejected 
for the same reason.

I'd be interested in how we can prune the tree and only do the work 
required without breaking the implied behavior of the current 
implementation. Would it be acceptable to have two code paths 1) the old 
one for back compat that updates every file whether there are changes or 
not and 2) a new/optimized one that only does the minimum work required? 
  Then we could put which code path executes by default behind by a new 
config setting that allows people to opt-in to the new/faster behavior.

Any other ideas or suggestions that don't require coming up with new git 
commands (ie "git fast-checkout") and retraining existing git users?

>> ODB cache
>> =========
>> Since traverse_trees() hits the ODB for each tree object (of which there are
>> over 500K in this repo) I wrote and tested having an in-memory ODB cache
>> that cached all tree objects.  This resulted in a > 50% hit ratio (largely
>> due to the fact we traverse the tree twice during checkout) but resulted in
>> only a minimal savings (1.3 seconds).
> 
> In my experience, one major cost of object access is decompression, both
> delta and zlib. Trees in particular tend to delta very well across
> versions. We have a cache to try to reuse intermediate delta results,
> but the default size is probably woefully undersized for your repository
> (I know from past tests it's undersized a bit even for the linux
> kernel).
> 
> Try bumping core.deltaBaseCacheLimit to see if that has any impact. It's
> 96MB by default.
> 
> There may also be some possible work in making it more aggressive about
> storing the intermediate results. I seem to recall from past
> explorations that it doesn't keep everything, and I don't know if its
> heuristics have ever been proven sane.
> 
> For zlib compression, I don't have numbers handy, but previous
> experiments showed that trees don't actually benefit all that much from
> zlib (presumably because they're mostly random-looking hashes). So one
> option would be to try repacking _just_ the trees with
> "pack.compression" set to 0, and see how the result behaves. I suspect
> that will be pretty painful with your giant multi-pack repo.
> 
> It might be slightly easier if we had an option to set the compression
> level on a per-type basis (both to experiment, and then of course if it
> works to actually tune your repo).
> 
> The numbers above aren't specific enough to know how much time was spent
> doing zlib stuff, though. And even with more specific probes, it's
> generally still hard to tell the difference between what's specific to
> the compression level, and what's a result of the fact that zlib is
> essentially copying all the bytes from the filesystem into memory.
> Still, my timings with zstd[1] showed something like 10-20% improvement
> on object access, so we should be able to get something at least as good
> by moving to no compression.
> 
> [1] https://public-inbox.org/git/20161023080552.lma2v6zxmyaiiqz5@sigill.intra.peff.net/
> 

Thanks, these are good ideas to pursue.  I've added them to my list of 
things to look into but believe pruning the tree or traversing it in 
parallel has more performance saving potential so I'll be looking there 
first.

<snip>


>> Multi-threading unpack_trees()
>> ==============================
>> The current model of unpack_trees() is that a single thread recursively
>> traverses each tree object as it comes across it.  One thought I had was to
>> multi-thread the traversal so that each tree object could be processed in
>> parallel.  To test this idea out, I wrote an unbounded
>> Multi-Product-Multi-Consumer queue and then wrote a
>> traverse_trees_parallel() function that would add any new tree objects into
>> the queue where they can be processed by a pool of worker threads.  Each
>> thread will wake up when there is work in the queue, remove a tree object,
>> process it adding any additional tree objects it finds.
> 
> I'm generally terrified of multi-threading anything in the core parts of
> Git. There are so many latent bits of non-reentrant or racy code.
> 
> I think your queue suggestion may be the sanest approach, though,
> because it makes it keeps the responsibilities of the worker threads
> pretty clear.
> 

I agree the thought of multi-threading unpack_trees() is daunting!  It 
would be nice if the model of pruning the tree was sufficient to get 
reasonable performance with large repos.  I guess we'll see...

>> When I brought up this idea with some other git contributors they mentioned
>> that multi threading unpack_trees() had been discussed a few years ago on
>> the list but that the idea was discarded.  They couldn�t remember exactly
>> why it was discarded and none of us have been able to find the email threads
>> from that earlier discussion. As a result, I decided to write up this RFC
>> and see if the greater git community has ideas, suggestions, or more
>> background/history on whether this is a reasonable path to pursue or if
>> there are other/better ideas on how to speed up checkout especially on large
>> repos.
> 
> I don't remember any specific discussion, and didn't dig anything up
> after a few minutes. But I'd be willing to bet that the primary reason
> it would not be pursued is the general lack of thread safety in the
> current codebase.
> 
> -Peff
> 

  reply	other threads:[~2018-07-23 15:48 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-18 20:45 [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-18 20:45 ` [PATCH v1 1/3] add unbounded Multi-Producer-Multi-Consumer queue Ben Peart
2018-07-18 20:57   ` Stefan Beller
2018-07-19 19:11   ` Junio C Hamano
2018-07-18 20:45 ` [PATCH v1 2/3] add performance tracing around traverse_trees() in unpack_trees() Ben Peart
2018-07-18 20:45 ` [PATCH v1 3/3] Add initial parallel version of unpack_trees() Ben Peart
2018-07-18 22:56   ` Junio C Hamano
2018-07-18 21:02 ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Stefan Beller
2018-07-18 21:34 ` Jeff King
2018-07-23 15:48   ` Ben Peart [this message]
2018-07-23 17:03     ` Duy Nguyen
2018-07-23 20:51       ` Ben Peart
2018-07-24  4:20         ` Jeff King
2018-07-24 15:33           ` Duy Nguyen
2018-07-25 20:56             ` Ben Peart
2018-07-26  5:30               ` Duy Nguyen
2018-07-26 16:30                 ` Duy Nguyen
2018-07-26 19:40                   ` Junio C Hamano
2018-07-27 15:42                     ` Duy Nguyen
2018-07-27 16:22                       ` Ben Peart
2018-07-27 18:00                         ` Duy Nguyen
2018-07-27 17:14                       ` Junio C Hamano
2018-07-27 17:52                         ` Duy Nguyen
2018-07-29  6:24                           ` Duy Nguyen
2018-07-29 10:33                       ` [PATCH v2 0/4] Speed up unpack_trees() Nguyễn Thái Ngọc Duy
2018-07-29 10:33                         ` [PATCH v2 1/4] unpack-trees.c: add performance tracing Nguyễn Thái Ngọc Duy
2018-07-30 20:16                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-07-30 20:52                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-07-30 20:58                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:46                           ` Elijah Newren
2018-08-10 16:39                             ` Duy Nguyen
2018-08-10 18:39                               ` Elijah Newren
2018-08-10 19:30                                 ` Duy Nguyen
2018-08-10 19:40                                   ` Elijah Newren
2018-08-10 19:48                                     ` Duy Nguyen
2018-07-30 18:10                         ` [PATCH v2 0/4] Speed up unpack_trees() Ben Peart
2018-07-31 15:31                           ` Duy Nguyen
2018-07-31 16:50                             ` Ben Peart
2018-07-31 17:31                               ` Ben Peart
2018-08-01 16:38                                 ` Duy Nguyen
2018-08-08 20:53                                   ` Ben Peart
2018-08-09  8:16                                     ` Ben Peart
2018-08-10 16:08                                       ` Duy Nguyen
2018-08-10 15:51                                     ` Duy Nguyen
2018-07-30 21:04                         ` Ben Peart
2018-08-04  5:37                         ` [PATCH v3 " Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 1/4] unpack-trees: add performance tracing Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:23                             ` Elijah Newren
2018-08-10 16:29                               ` Duy Nguyen
2018-08-10 18:48                                 ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-08 18:30                             ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-06 15:48                           ` [PATCH v3 0/4] Speed up unpack_trees() Junio C Hamano
2018-08-06 15:59                             ` Duy Nguyen
2018-08-06 18:59                               ` Junio C Hamano
2018-08-08 17:00                                 ` Ben Peart
2018-08-08 17:46                               ` Junio C Hamano
2018-08-08 18:12                                 ` Junio C Hamano
2018-08-08 18:39                                   ` Junio C Hamano
2018-08-10 16:53                                     ` Duy Nguyen
2018-08-12  8:15                           ` [PATCH v4 0/5] " Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 1/5] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-13 18:39                               ` Ben Peart
2018-08-12  8:15                             ` [PATCH v4 2/5] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-12 10:05                               ` Thomas Adam
2018-08-13 18:50                                 ` Junio C Hamano
2018-08-13 18:44                               ` Ben Peart
2018-08-13 19:25                               ` Jeff King
2018-08-13 19:36                                 ` Stefan Beller
2018-08-13 20:11                                   ` Ben Peart
2018-08-13 19:52                                 ` Duy Nguyen
2018-08-13 21:47                                   ` Jeff King
2018-08-13 22:41                                 ` Junio C Hamano
2018-08-14 18:19                                   ` Jeff Hostetler
2018-08-14 18:32                                     ` Duy Nguyen
2018-08-14 18:44                                       ` Stefan Beller
2018-08-14 18:51                                         ` Duy Nguyen
2018-08-14 19:54                                           ` Jeff King
2018-08-14 20:52                                           ` Junio C Hamano
2018-08-15 16:32                                             ` Duy Nguyen
2018-08-15 18:28                                               ` Junio C Hamano
2018-08-14 20:14                                         ` Jeff Hostetler
2018-08-12  8:15                             ` [PATCH v4 3/5] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-13 18:58                               ` Ben Peart
2018-08-15 16:38                                 ` Duy Nguyen
2018-08-12  8:15                             ` [PATCH v4 4/5] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-13 15:48                               ` Elijah Newren
2018-08-13 15:57                                 ` Duy Nguyen
2018-08-13 16:05                                 ` Ben Peart
2018-08-13 16:25                                   ` Duy Nguyen
2018-08-13 17:15                                     ` Ben Peart
2018-08-13 19:01                             ` [PATCH v4 0/5] Speed up unpack_trees() Junio C Hamano
2018-08-14 19:19                             ` Ben Peart
2018-08-18 14:41                             ` [PATCH v5 0/7] " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 1/7] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 2/7] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 3/7] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-20 12:43                                 ` Ben Peart
2018-08-18 14:41                               ` [PATCH v5 4/7] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 5/7] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 6/7] unpack-trees: add missing cache invalidation Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite Nguyễn Thái Ngọc Duy
2018-08-18 21:45                                 ` Elijah Newren
2018-08-18 22:01                               ` [PATCH v5 0/7] Speed up unpack_trees() Elijah Newren
2018-08-19  5:09                                 ` Duy Nguyen
2018-08-25 12:18                               ` [PATCH] Document update for nd/unpack-trees-with-cache-tree Nguyễn Thái Ngọc Duy
2018-08-25 12:31                                 ` Martin Ågren
2018-08-25 13:02                                 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-07-27 15:50                     ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-26 16:35               ` Duy Nguyen
2018-07-24  5:54         ` Junio C Hamano
2018-07-24 15:13         ` Duy Nguyen
2018-07-24 21:21           ` Jeff King
2018-07-25 16:09           ` Ben Peart
2018-07-24  4:27       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a2ad0044-f317-69f7-f2bb-488111c626fb@gmail.com \
    --to=peartben@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).