From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [BAD PATCH 0/9] v4-aware tree walker API
Date: Wed, 9 Oct 2013 21:46:07 +0700
Message-ID: <1381329976-32082-1-git-send-email-pclouds@gmail.com>
I know I still have a lot of holes to plug, but this was more
interesting because we might see some encouraging numbers.
Unfortunately the result is disappointing. Maybe I did it in a stupid
way and need to start over with a totally different approach.
"rev-list --objects" takes 4s on v2, 11s on v4 with the current walker
and 16s with the new walker (the worst!). perf's top functions with v2 are:
    23,51%  git  libz.so.1.2.7  [.] inflate
    16,66%  git  git            [.] lookup_object
    11,46%  git  libz.so.1.2.7  [.] inflate_fast
     6,89%  git  libc-2.16.so   [.] __memcpy_ssse3_back
     4,19%  git  libz.so.1.2.7  [.] inflate_table
     4,15%  git  git            [.] find_pack_entry_one
     3,84%  git  git            [.] decode_tree_entry
and with the new walker:
    58,61%  git  git            [.] decode_entries
    18,66%  git  git            [.] decode_varint
     9,73%  git  git            [.] use_pack
     3,31%  git  git            [.] nth_packed_object_offset
     1,73%  git  git            [.] process_tree
     1,66%  git  git            [.] pv4_lookup_blob
     1,09%  git  git            [.] get_pathref
     1,03%  git  libc-2.16.so   [.] __memcpy_ssse3_back
     0,90%  git  libz.so.1.2.7  [.] inflate
     0,50%  git  libz.so.1.2.7  [.] inflate_table
It's no surprise that lookup_object is no longer hot. The closest
equivalent is pv4_lookup_blob. nth_packed_object_offset is getting
hotter because it's used extensively by decode_entries.
And decode_entries is getting far too hot. This function is now called
for each tree entry of every tree, and it does a get_tree_offset_cache()
lookup on every call (ironic, given that we try hard to avoid the hash
lookup in lookup_object).
The only bit I haven't done is checking whether a tree has already
been examined and, if so, skipping the copy sequences that refer to it.
That should cut down the number of decode_entries calls, but I'm not
sure by how much because there's no relation between tree traversal
order and how copy sequences are made.
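To make the skipping idea concrete, here is a toy model. It is only a sketch under my own assumptions, not code from this series: every toy_* name is hypothetical, a "tree" is just an array of literal entries and copy sequences, and a copy sequence is assumed to be skippable whenever its source tree has already been walked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TREES 8

/* One element of a toy tree: a literal entry or a copy sequence. */
struct toy_item {
	int is_copy;		/* 0: literal entry, 1: copy sequence */
	unsigned int src;	/* copy only: index of the tree copied from */
	unsigned int count;	/* copy only: number of entries copied */
};

struct toy_tree {
	const struct toy_item *items;
	size_t nr;
};

/* Hypothetical "already examined" flags, one per toy tree. */
static unsigned char toy_seen[TOY_MAX_TREES];

static void toy_reset(void)
{
	memset(toy_seen, 0, sizeof(toy_seen));
}

/*
 * Return how many entries have to be decoded to walk tree `id`,
 * then mark the tree as examined. With skip_seen set, a copy
 * sequence whose source tree was already examined costs nothing.
 */
static unsigned int toy_walk(const struct toy_tree *trees,
			     unsigned int id, int skip_seen)
{
	unsigned int decoded = 0;
	size_t i;

	assert(id < TOY_MAX_TREES);
	for (i = 0; i < trees[id].nr; i++) {
		const struct toy_item *it = &trees[id].items[i];

		if (!it->is_copy)
			decoded++;
		else if (!(skip_seen && toy_seen[it->src]))
			decoded += it->count;
	}
	toy_seen[id] = 1;
	return decoded;
}

/* Tree 0: three literal entries. */
static const struct toy_item toy_items0[] = {
	{ 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 }
};
/* Tree 1: copies three entries from tree 0, then adds one literal. */
static const struct toy_item toy_items1[] = {
	{ 1, 0, 3 }, { 0, 0, 0 }
};
static const struct toy_tree toy_trees[] = {
	{ toy_items0, 3 },
	{ toy_items1, 2 },
};
```

Walking tree 0 and then tree 1 with skip_seen set decodes 3 + 1 entries instead of 3 + 4. Walking tree 1 first saves nothing, which is the ordering problem described above.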
Maybe we could make an exception and allow the tree walker to pass
pv4_tree_cache* directly to decode_entries so it does not need to do
the first lookup every time.
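A toy model of that calling convention, for illustration only. None of this is the real packv4-parse.c code: struct toy_cache and the toy_* functions are hypothetical stand-ins, and the real get_tree_offset_cache() and decode_entries() signatures may well differ. The sketch only counts how many cache lookups each variant performs while walking one tree entry by entry.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 64

/* Hypothetical stand-in for a per-tree cache slot. */
struct toy_cache {
	unsigned long offset;		/* pack offset of the cached tree */
	unsigned int nr_entries;	/* entries decoded so far */
};

static struct toy_cache toy_slots[TOY_CACHE_SIZE];
static unsigned long toy_lookups;	/* number of hash lookups done */

/* Hash lookup by offset: the per-call cost we would like to avoid. */
static struct toy_cache *toy_lookup_cache(unsigned long offset)
{
	struct toy_cache *c = &toy_slots[offset % TOY_CACHE_SIZE];

	toy_lookups++;
	if (c->offset != offset) {
		c->offset = offset;
		c->nr_entries = 0;
	}
	return c;
}

/* Pretend to decode entry idx using a cache slot the caller already has. */
static unsigned int toy_decode_with_cache(struct toy_cache *c,
					  unsigned int idx)
{
	assert(c != NULL);
	if (idx >= c->nr_entries)
		c->nr_entries = idx + 1;
	return idx;
}

/* Current shape: every call starts with a lookup. */
static unsigned int toy_decode_by_offset(unsigned long offset,
					 unsigned int idx)
{
	return toy_decode_with_cache(toy_lookup_cache(offset), idx);
}

/* Walk n entries with one decode call per entry, looking up each time. */
static unsigned long toy_walk_by_offset(unsigned long offset, unsigned int n)
{
	unsigned int i;

	toy_lookups = 0;
	for (i = 0; i < n; i++)
		toy_decode_by_offset(offset, i);
	return toy_lookups;
}

/* Walk n entries, looking up once and passing the slot along. */
static unsigned long toy_walk_with_cache(unsigned long offset, unsigned int n)
{
	struct toy_cache *c;
	unsigned int i;

	toy_lookups = 0;
	c = toy_lookup_cache(offset);
	for (i = 0; i < n; i++)
		toy_decode_with_cache(c, i);
	return toy_lookups;
}
```

For a tree with n entries the first variant does n lookups and the second does one. This says nothing about the remaining cost inside decode_entries itself, which the profile shows is also significant.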
Suggestions?
Nguyễn Thái Ngọc Duy (9):
  sha1_file: provide real packed type in object_info_extended
  pack v4: move v2 tree entry generation code out of decode_entries
  pv4_tree_desc: introduce new struct for pack v4 tree walker
  pv4_tree_desc: use struct tree_desc from pv4_tree_desc
  pv4_tree_desc: allow decode_entries to return v4 trees, one at a time
  pv4_tree_desc: complete interface
  pv4_tree_desc: don't bother looking for v4 trees if no v4 packs are present
  pv4_tree_desc: avoid lookup_object() when possible
  list-object.c: take "advantage" of new pv4_tree_desc interface
 cache.h        |   3 +-
 list-objects.c |  38 +++++----
 packv4-parse.c | 263 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 packv4-parse.h |  48 +++++++++++
 sha1_file.c    |   9 +-
 streaming.c    |   9 +-
 6 files changed, 300 insertions(+), 70 deletions(-)
--
1.8.2.83.gc99314b