git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: Ben Peart <peartben@gmail.com>
Cc: Ben Peart <Ben.Peart@microsoft.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 0/4] Speed up unpack_trees()
Date: Fri, 10 Aug 2018 17:51:43 +0200	[thread overview]
Message-ID: <CACsJy8B9=HL3mnBVuPEmAR=ukCysxGtpAbr3KY-duL7cQ4D=CQ@mail.gmail.com> (raw)
In-Reply-To: <ccec34c9-b81a-bcb4-7d05-48dccc059cc8@gmail.com>

On Wed, Aug 8, 2018 at 10:53 PM Ben Peart <peartben@gmail.com> wrote:
>
>
>
> On 8/1/2018 12:38 PM, Duy Nguyen wrote:
> > On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:
> >>
> >>
> >> On 7/31/2018 12:50 PM, Ben Peart wrote:
> >>>
> >>>
> >>> On 7/31/2018 11:31 AM, Duy Nguyen wrote:
> >>
> >>>>
> >>>>> In the performance game of whack-a-mole, that call to repair cache-tree
> >>>>> is now looking quite expensive...
> >>>>
> >>>> Yeah and I think we can whack that mole too. I did some measurement.
> >>>> Best case possible, we just need to scan through two indexes (one with
> >>>> many good cache-tree, one with no cache-tree), compare and copy
> >>>> cache-tree over. The scanning takes like 1% time of current repair
> >>>> step and I suspect it's the hashing that takes most of the time. Of
> >>>> course real world won't have such nice numbers, but I guess we could
> >>>> maybe half cache-tree update/repair time.
> >>>>
> >>>
> >>> I have some great profiling tools available so will take a look at this
> >>> next and see exactly where the time is being spent.
> >>
> >> Good instincts.  In cache_tree_update, the heavy hitter is definitely
> >> hash_object_file followed by has_object_file.
> >>
> >> Name                                 Inc %        Inc
> >> + git!cache_tree_update               12.4      4,935
> >> |+ git!update_one                     11.8      4,706
> >> | + git!update_one                    11.8      4,706
> >> |  + git!hash_object_file              6.1      2,406
> >> |  + git!has_object_file               2.0        813
> >> |  + OTHER <<vcruntime140d!strchr>>    0.5        203
> >> |  + git!strbuf_addf                   0.4        155
> >> |  + git!strbuf_release                0.4        143
> >> |  + git!strbuf_add                    0.3        121
> >> |  + OTHER <<vcruntime140d!memcmp>>    0.2         93
> >> |  + git!strbuf_grow                   0.1         25
> >
> > Ben, if you work on this, this could be a good starting point. I will
> > not work on this because I still have some other things to catch up
> > and follow through. You can have my sign off if you reuse something
> > from this patch
> >
> > Even if it's a naive implementation, the initial numbers look pretty
> > good. Without the patch we have
> >
> > 18:31:05.970621 unpack-trees.c:1437     performance: 0.000001029 s: copy
> > 18:31:05.975729 unpack-trees.c:1444     performance: 0.005082004 s: update
> >
> > And with the patch
> >
> > 18:31:13.295655 unpack-trees.c:1437     performance: 0.000198017 s: copy
> > 18:31:13.296757 unpack-trees.c:1444     performance: 0.001075935 s: update
> >
> > Time saving is about 80% by the look of this (best possible case
> > because only the top tree needs to be hashed and written out).
> >
> > -- 8< --
> > diff --git a/cache-tree.c b/cache-tree.c
> > index 6b46711996..67a4a93100 100644
> > --- a/cache-tree.c
> > +++ b/cache-tree.c
> > @@ -440,6 +440,147 @@ int cache_tree_update(struct index_state *istate, int flags)
> >       return 0;
> >   }
> >
> > +static int same(const struct cache_entry *a, const struct cache_entry *b)
> > +{
> > +     if (ce_stage(a) || ce_stage(b))
> > +             return 0;
> > +     if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
> > +             return 0;
> > +     return a->ce_mode == b->ce_mode &&
> > +            !oidcmp(&a->oid, &b->oid);
> > +}
> > +
> > +static int cache_tree_name_pos(const struct index_state *istate,
> > +                            const struct strbuf *path)
> > +{
> > +     int pos;
> > +
> > +     if (!path->len)
> > +             return 0;
> > +
> > +     pos = index_name_pos(istate, path->buf, path->len);
> > +     if (pos >= 0)
> > +             BUG("No no no, directory path must not exist in index");
> > +     return -pos - 1;
> > +}
> > +
> > +/*
> > + * Locate the same cache-tree in two separate indexes. Check the
> > + * cache-tree is still valid for the "to" index (i.e. it contains the
> > + * same set of entries in the "from" index).
> > + */
> > +static int verify_one_cache_tree(const struct index_state *to,
> > +                              const struct index_state *from,
> > +                              const struct cache_tree *it,
> > +                              const struct strbuf *path)
> > +{
> > +     int i, spos, dpos;
> > +
> > +     spos = cache_tree_name_pos(from, path);
> > +     if (spos + it->entry_count > from->cache_nr)
> > +             return -1;
> > +
> > +     dpos = cache_tree_name_pos(to, path);
> > +     if (dpos + it->entry_count > to->cache_nr)
> > +             return -1;
> > +
> > +     /* Can we quickly check head and tail and bail out early */
> > +     if (!same(from->cache[spos], to->cache[spos]) ||
> > +         !same(from->cache[spos + it->entry_count - 1],
> > +               to->cache[spos + it->entry_count - 1]))
> > +             return -1;
> > +
> > +     for (i = 1; i < it->entry_count - 1; i++)
> > +             if (!same(from->cache[spos + i],
> > +                       to->cache[dpos + i]))
> > +                     return -1;
> > +
> > +     return 0;
> > +}
> > +
> > +static int verify_and_invalidate(struct index_state *to,
> > +                              const struct index_state *from,
> > +                              struct cache_tree *it,
> > +                              struct strbuf *path)
> > +{
> > +     /*
> > +      * Optimistically verify the current tree first. Alternatively
> > +      * we could verify all the subtrees first then do this
> > +      * last. Any invalid subtree would also invalidates its
> > +      * ancestors.
> > +      */
> > +     if (it->entry_count != -1 &&
> > +         verify_one_cache_tree(to, from, it, path))
> > +             it->entry_count = -1;
> > +
> > +     /*
> > +      * If the current tree is valid, don't bother checking
> > +      * inside. All subtrees _should_ also be valid
> > +      */
> > +     if (it->entry_count == -1) {
> > +             int i, len = path->len;
> > +
> > +             for (i = 0; i < it->subtree_nr; i++) {
> > +                     struct cache_tree_sub *down = it->down[i];
> > +
> > +                     if (!down || !down->cache_tree)
> > +                             continue;
> > +
> > +                     strbuf_setlen(path, len);
> > +                     strbuf_add(path, down->name, down->namelen);
> > +                     strbuf_addch(path, '/');
> > +                     if (verify_and_invalidate(to, from,
> > +                                               down->cache_tree, path))
> > +                             return -1;
> > +             }
> > +             strbuf_setlen(path, len);
> > +     }
> > +     return 0;
> > +}
> > +
> > +static struct cache_tree *duplicate_cache_tree(const struct cache_tree *src)
> > +{
> > +     struct cache_tree *dst;
> > +     int i;
> > +
> > +     if (!src)
> > +             return NULL;
> > +
> > +     dst = xmalloc(sizeof(*dst));
> > +     dst->entry_count = src->entry_count;
> > +     oidcpy(&dst->oid, &src->oid);
> > +     dst->subtree_nr = src->subtree_nr;
> > +     dst->subtree_alloc = dst->subtree_nr;
> > +     ALLOC_ARRAY(dst->down, dst->subtree_alloc);
> > +     for (i = 0; i < src->subtree_nr; i++) {
> > +             struct cache_tree_sub *dsrc = src->down[i];
> > +             struct cache_tree_sub *down;
> > +
> > +             FLEX_ALLOC_MEM(down, name, dsrc->name, dsrc->namelen);
> > +             down->count = dsrc->count;
> > +             down->namelen = dsrc->namelen;
> > +             down->used = dsrc->used;
> > +             down->cache_tree = duplicate_cache_tree(dsrc->cache_tree);
> > +             dst->down[i] = down;
> > +     }
> > +     return dst;
> > +}
> > +
> > +int cache_tree_copy(struct index_state *to, const struct index_state *from)
> > +{
> > +     struct cache_tree *it = duplicate_cache_tree(from->cache_tree);
> > +     struct strbuf path = STRBUF_INIT;
> > +     int ret;
> > +
> > +     if (to->cache_tree)
> > +             BUG("Sorry merging cache-tree is not supported yet");
> > +     ret = verify_and_invalidate(to, from, it, &path);
> > +     to->cache_tree = it;
> > +     to->cache_changed |= CACHE_TREE_CHANGED;
> > +     strbuf_release(&path);
> > +     return ret;
> > +}
> > +
> >   static void write_one(struct strbuf *buffer, struct cache_tree *it,
> >                         const char *path, int pathlen)
> >   {
> > diff --git a/cache-tree.h b/cache-tree.h
> > index cfd5328cc9..6981da8e0d 100644
> > --- a/cache-tree.h
> > +++ b/cache-tree.h
> > @@ -53,4 +53,6 @@ void prime_cache_tree(struct index_state *, struct tree *);
> >
> >   extern int cache_tree_matches_traversal(struct cache_tree *, struct name_entry *ent, struct traverse_info *info);
> >
> > +int cache_tree_copy(struct index_state *to, const struct index_state *from);
> > +
> >   #endif
> > diff --git a/unpack-trees.c b/unpack-trees.c
> > index cd0680f11e..cb3fdd42a6 100644
> > --- a/unpack-trees.c
> > +++ b/unpack-trees.c
> > @@ -1427,12 +1427,22 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >       ret = check_updates(o) ? (-2) : 0;
> >       if (o->dst_index) {
> >               if (!ret) {
> > -                     if (!o->result.cache_tree)
> > +                     if (!o->result.cache_tree) {
> > +                             uint64_t start = getnanotime();
> > +#if 0
> >                               o->result.cache_tree = cache_tree();
> > -                     if (!cache_tree_fully_valid(o->result.cache_tree))
> > +#else
> > +                             cache_tree_copy(&o->result, o->src_index);
> > +#endif
> > +                             trace_performance_since(start, "copy");
> > +                     }
> > +                     if (!cache_tree_fully_valid(o->result.cache_tree)) {
> > +                             uint64_t start = getnanotime();
> >                               cache_tree_update(&o->result,
> >                                                 WRITE_TREE_SILENT |
> >                                                 WRITE_TREE_REPAIR);
> > +                             trace_performance_since(start, "update");
> > +                     }
> >               }
> >               move_index_extensions(&o->result, o->src_index);
> >               discard_index(o->dst_index);
> > -- 8< --
> >
>
> I like the idea (and the perf win!) but it seems like there is an
> important piece missing.  If I'm reading this correctly, unpack_trees()
> will copy the source cache tree (instead of creating a new one) and then
> verify_and_invalidate() will walk the cache tree and for any tree that
> is dirty, it will flag its ancestors as dirty as well.

That, and the verification part. The for loop at the bottom of
verify_one_cache_tree() makes sure that the cache-tree is valid. That
is, if we recreate cache-tree from scratch in the destination index,
it should produce the same OID as the cache-tree we copy over.

But I think I'm a bit loose in that check. Suppose in the source index we have

abc
foo/abc
foo/def
xyz

the for loop tries to make sure that for cache-tree of 'foo', the
destination index must have foo/abc and foo/def (with same mode,
oid....) but it fails to catch this

abc
foo/abc
foo/def
foo/xyz
xyz

If we recreate cache-tree from scratch, the cache-tree for 'foo'
should cover three items and have different oid than one we copied
from the source index. Same problem could happen if we have something
in foo, but before foo/abc.

> What I don't understand is how any cache tree entries that became
> invalid as a result of the merge of the n-trees are marked as invalid.
> It seems like something needs to walk the cache tree and call
> cache_tree_invalidate_path() for all entries that changed as a result of
> the merge before the call to verify_and_invalidate().

I'm not sure I understand but anyway the way I understand it, when we
merge from o->src_index to o->result, we start o->result with empty
cache-tree. There's nothing in there to invalidate, even though we do
call cache_tree_invalidate_path() (from invalidate_ce_path() in
unpack-trees.c)

I don't think the merge operation is related to this at all. This
problem can be stated as "I have a set of good cache-trees that are
associated with index 'A' and a new index 'B' with no cache-tree at
all. Can I (cheaply) reuse some  cache-tree from 'A'?". My answer in
this patch is yes, for each cache-tree in A, make sure that the list
of cache-entries associated with it is present in B (which is almost
correct).
-- 
Duy

  parent reply	other threads:[~2018-08-10 15:52 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-18 20:45 [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-18 20:45 ` [PATCH v1 1/3] add unbounded Multi-Producer-Multi-Consumer queue Ben Peart
2018-07-18 20:57   ` Stefan Beller
2018-07-19 19:11   ` Junio C Hamano
2018-07-18 20:45 ` [PATCH v1 2/3] add performance tracing around traverse_trees() in unpack_trees() Ben Peart
2018-07-18 20:45 ` [PATCH v1 3/3] Add initial parallel version of unpack_trees() Ben Peart
2018-07-18 22:56   ` Junio C Hamano
2018-07-18 21:02 ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Stefan Beller
2018-07-18 21:34 ` Jeff King
2018-07-23 15:48   ` Ben Peart
2018-07-23 17:03     ` Duy Nguyen
2018-07-23 20:51       ` Ben Peart
2018-07-24  4:20         ` Jeff King
2018-07-24 15:33           ` Duy Nguyen
2018-07-25 20:56             ` Ben Peart
2018-07-26  5:30               ` Duy Nguyen
2018-07-26 16:30                 ` Duy Nguyen
2018-07-26 19:40                   ` Junio C Hamano
2018-07-27 15:42                     ` Duy Nguyen
2018-07-27 16:22                       ` Ben Peart
2018-07-27 18:00                         ` Duy Nguyen
2018-07-27 17:14                       ` Junio C Hamano
2018-07-27 17:52                         ` Duy Nguyen
2018-07-29  6:24                           ` Duy Nguyen
2018-07-29 10:33                       ` [PATCH v2 0/4] Speed up unpack_trees() Nguyễn Thái Ngọc Duy
2018-07-29 10:33                         ` [PATCH v2 1/4] unpack-trees.c: add performance tracing Nguyễn Thái Ngọc Duy
2018-07-30 20:16                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-07-30 20:52                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-07-30 20:58                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:46                           ` Elijah Newren
2018-08-10 16:39                             ` Duy Nguyen
2018-08-10 18:39                               ` Elijah Newren
2018-08-10 19:30                                 ` Duy Nguyen
2018-08-10 19:40                                   ` Elijah Newren
2018-08-10 19:48                                     ` Duy Nguyen
2018-07-30 18:10                         ` [PATCH v2 0/4] Speed up unpack_trees() Ben Peart
2018-07-31 15:31                           ` Duy Nguyen
2018-07-31 16:50                             ` Ben Peart
2018-07-31 17:31                               ` Ben Peart
2018-08-01 16:38                                 ` Duy Nguyen
2018-08-08 20:53                                   ` Ben Peart
2018-08-09  8:16                                     ` Ben Peart
2018-08-10 16:08                                       ` Duy Nguyen
2018-08-10 15:51                                     ` Duy Nguyen [this message]
2018-07-30 21:04                         ` Ben Peart
2018-08-04  5:37                         ` [PATCH v3 " Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 1/4] unpack-trees: add performance tracing Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:23                             ` Elijah Newren
2018-08-10 16:29                               ` Duy Nguyen
2018-08-10 18:48                                 ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-08 18:30                             ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-06 15:48                           ` [PATCH v3 0/4] Speed up unpack_trees() Junio C Hamano
2018-08-06 15:59                             ` Duy Nguyen
2018-08-06 18:59                               ` Junio C Hamano
2018-08-08 17:00                                 ` Ben Peart
2018-08-08 17:46                               ` Junio C Hamano
2018-08-08 18:12                                 ` Junio C Hamano
2018-08-08 18:39                                   ` Junio C Hamano
2018-08-10 16:53                                     ` Duy Nguyen
2018-08-12  8:15                           ` [PATCH v4 0/5] " Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 1/5] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-13 18:39                               ` Ben Peart
2018-08-12  8:15                             ` [PATCH v4 2/5] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-12 10:05                               ` Thomas Adam
2018-08-13 18:50                                 ` Junio C Hamano
2018-08-13 18:44                               ` Ben Peart
2018-08-13 19:25                               ` Jeff King
2018-08-13 19:36                                 ` Stefan Beller
2018-08-13 20:11                                   ` Ben Peart
2018-08-13 19:52                                 ` Duy Nguyen
2018-08-13 21:47                                   ` Jeff King
2018-08-13 22:41                                 ` Junio C Hamano
2018-08-14 18:19                                   ` Jeff Hostetler
2018-08-14 18:32                                     ` Duy Nguyen
2018-08-14 18:44                                       ` Stefan Beller
2018-08-14 18:51                                         ` Duy Nguyen
2018-08-14 19:54                                           ` Jeff King
2018-08-14 20:52                                           ` Junio C Hamano
2018-08-15 16:32                                             ` Duy Nguyen
2018-08-15 18:28                                               ` Junio C Hamano
2018-08-14 20:14                                         ` Jeff Hostetler
2018-08-12  8:15                             ` [PATCH v4 3/5] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-13 18:58                               ` Ben Peart
2018-08-15 16:38                                 ` Duy Nguyen
2018-08-12  8:15                             ` [PATCH v4 4/5] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-13 15:48                               ` Elijah Newren
2018-08-13 15:57                                 ` Duy Nguyen
2018-08-13 16:05                                 ` Ben Peart
2018-08-13 16:25                                   ` Duy Nguyen
2018-08-13 17:15                                     ` Ben Peart
2018-08-13 19:01                             ` [PATCH v4 0/5] Speed up unpack_trees() Junio C Hamano
2018-08-14 19:19                             ` Ben Peart
2018-08-18 14:41                             ` [PATCH v5 0/7] " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 1/7] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 2/7] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 3/7] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-20 12:43                                 ` Ben Peart
2018-08-18 14:41                               ` [PATCH v5 4/7] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 5/7] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 6/7] unpack-trees: add missing cache invalidation Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite Nguyễn Thái Ngọc Duy
2018-08-18 21:45                                 ` Elijah Newren
2018-08-18 22:01                               ` [PATCH v5 0/7] Speed up unpack_trees() Elijah Newren
2018-08-19  5:09                                 ` Duy Nguyen
2018-08-25 12:18                               ` [PATCH] Document update for nd/unpack-trees-with-cache-tree Nguyễn Thái Ngọc Duy
2018-08-25 12:31                                 ` Martin Ågren
2018-08-25 13:02                                 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-07-27 15:50                     ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-26 16:35               ` Duy Nguyen
2018-07-24  5:54         ` Junio C Hamano
2018-07-24 15:13         ` Duy Nguyen
2018-07-24 21:21           ` Jeff King
2018-07-25 16:09           ` Ben Peart
2018-07-24  4:27       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACsJy8B9=HL3mnBVuPEmAR=ukCysxGtpAbr3KY-duL7cQ4D=CQ@mail.gmail.com' \
    --to=pclouds@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peartben@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).