git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Josh Steadmon <steadmon@google.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, stolee@gmail.com, peff@peff.net
Subject: Re: [PATCH v2 7/7] index-pack: make quantum of work smaller
Date: Thu, 27 Feb 2020 16:04:59 -0800	[thread overview]
Message-ID: <20200228000459.GE12115@google.com> (raw)
In-Reply-To: <2f2e36d3efede79f55347ce9d80d453bb05a4e15.1571343096.git.jonathantanmy@google.com>

On 2019.10.17 13:17, Jonathan Tan wrote:
> Currently, when index-pack resolves deltas, it does not split up delta
> trees into threads: each delta base root (an object that is not a
> REF_DELTA or OFS_DELTA) can go into its own thread, but all deltas on
> that root (direct or indirect) are processed in the same thread.
> 
> This is a problem when a repository contains a large text file (thus,
> delta-able) that is modified many times - delta resolution time during
> fetching is dominated by processing the deltas corresponding to that
> text file.
> 
> This patch contains a solution to that. When cloning using
> 
>   git -c core.deltabasecachelimit=1g clone \
>     https://fuchsia.googlesource.com/third_party/vulkan-cts
> 
> on my laptop, clone time improved from 3m2s to 2m5s (using 3 threads,
> which is the default).
> 
> The solution is to have a global work stack. This stack contains delta
> bases (objects, whether appearing directly in the packfile or generated
> by delta resolution, that themselves have delta children) that need to
> be processed; whenever a thread needs work, it peeks at the top of the
> stack and processes its next unprocessed child. If a thread finds the
> stack empty, it will look for more delta base roots to push on the stack
> instead.
> 
> The main weakness of having a global work stack is that more time is
> spent in the mutex, but profiling has shown that most time is spent in
> the resolution of the deltas themselves, so this shouldn't be an issue
> in practice. In any case, experimentation (as described in the clone
> command above) shows that this patch is a net improvement.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  builtin/index-pack.c | 336 ++++++++++++++++++++++++-------------------
>  1 file changed, 190 insertions(+), 146 deletions(-)
> 
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 31607a77fc..072592a35d 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -38,15 +38,49 @@ struct base_data {
>  	struct object_entry *obj;
>  	int ref_first, ref_last;
>  	int ofs_first, ofs_last;
> +	/*
> +	 * Threads should increment retain_data if they are about to call
> +	 * patch_delta() using this struct's data as a base, and decrement this
> +	 * when they are done. While retain_data is nonzero, this struct's data
> +	 * will not be freed even if the delta base cache limit is exceeded.
> +	 */
> +	int retain_data;
> +	/*
> +	 * The number of direct children that have not been fully processed
> +	 * (entered work_head, entered done_head, left done_head). When this
> +	 * number reaches zero, this struct base_data can be freed.
> +	 */
> +	int children_remaining;
>  
>  	/* Not initialized by make_base(). */
> +	struct list_head list;
>  	void *data;
>  	unsigned long size;
>  };

As a newb, the addition of a field just called "list" struck me as
needing a more descriptive name. But this makes more sense now after
looking at list.h. And it seems that most uses in other parts of the
code are similarly just named "list".


> +/*
> + * Stack of struct base_data that have unprocessed children.
> + * threaded_second_pass() uses this as a source of work (the other being the
> + * objects array).
> + */
> +LIST_HEAD(work_head);
> +
> +/*
> + * Stack of struct base_data that have children, all of whom have been
> + * processed or are being processed, and at least one child is being processed.
> + * These struct base_data must be kept around until the last child is
> + * processed.
> + */
> +LIST_HEAD(done_head);
> +
> +/*
> + * All threads share one delta base cache.
> + */
> +size_t base_cache_used;
> +size_t base_cache_limit;
> +
>  struct thread_local {
>  	pthread_t thread;
> -	size_t base_cache_used;
>  	int pack_fd;
>  };

Seeing several new shared variables made me paranoid about whether or
not these were thread-safe. There are already five mutexes in this file,
but there are no clear descriptions (apart from hints in the lock names)
about which locks protect which variables. It would be nice to make this
a bit more explicit.

I wish we had something similar to Abseil's "GUARDED_BY" annotations [1]
so that we could ensure thread safety with static analysis, but that's
obviously out of scope for this series.

[1]: https://abseil.io/docs/cpp/guides/synchronization#thread-annotations


> @@ -920,6 +948,8 @@ static struct base_data *make_base(struct object_entry *obj,
>  				&base->ref_first, &base->ref_last);
>  	find_ofs_delta_children(obj->idx.offset,
>  				&base->ofs_first, &base->ofs_last);
> +	base->children_remaining = base->ref_last - base->ref_first +
> +		base->ofs_last - base->ofs_first + 2;
>  	return base;
>  }

The extra +2 here confused me for a bit, but it makes sense now. You're
subtracting indices to get the length, so you want +1 to make the math
work out, and you're doing it across two different lists. In the edge
case where a list is empty, "first" is 0 and "last" is -1, so the math
still works out.


> @@ -1065,33 +1011,128 @@ static int compare_ref_delta_entry(const void *a, const void *b)
[snip]
>  		work_unlock();
>  
> -		resolve_base(&objects[i]);
> +		if (parent) {
> +			child = resolve_delta(child_obj, parent);
> +			if (!child->children_remaining)
> +				FREE_AND_NULL(child->data);
> +		} else {
> +			child = make_base(child_obj, NULL);
> +			if (child->children_remaining) {
> +				/*
> +				 * Since this child has its own delta children,
> +				 * we will need this data in the future.
> +				 * Inflate now so that future iterations will
> +				 * have access to this object's data while
> +				 * outside the work mutex.
> +				 */
> +				child->data = get_data_from_pack(child_obj);
> +				child->size = child_obj->size;
> +			}
> +		}
> +
> +		work_lock();

This part made me uneasy at first, because it looks like we might be
doing work on shared data outside of the work lock. But just prior to
unlocking, we parent->retain_data, so we know it won't be freed. And the
only part we modify is child, which I don't believe is shared with any
other threads.

  reply	other threads:[~2020-02-28  0:05 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09 23:44 [RFC PATCH 0/6] Better threaded delta resolution in index-pack Jonathan Tan
2019-10-09 23:44 ` [PATCH 1/6] index-pack: unify threaded and unthreaded code Jonathan Tan
2019-10-17  6:20   ` Jeff King
2019-10-09 23:44 ` [PATCH 2/6] index-pack: remove redundant parameter Jonathan Tan
2019-10-17  6:21   ` Jeff King
2019-10-09 23:44 ` [PATCH 3/6] index-pack: remove redundant child field Jonathan Tan
2019-10-10 14:45   ` Derrick Stolee
2019-10-10 19:02     ` Jonathan Tan
2019-10-17  6:24       ` Jeff King
2019-10-17  6:26   ` Jeff King
2019-10-09 23:44 ` [PATCH 4/6] index-pack: calculate {ref,ofs}_{first,last} early Jonathan Tan
2019-10-17  6:30   ` Jeff King
2019-10-09 23:44 ` [PATCH 5/6] index-pack: make resolve_delta() assume base data Jonathan Tan
2019-10-17  6:32   ` Jeff King
2019-10-09 23:44 ` [PATCH 6/6] index-pack: make quantum of work smaller Jonathan Tan
2019-10-17  6:35   ` Jeff King
2019-10-17 20:17 ` [PATCH v2 0/7] Better threaded delta resolution in index-pack Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 1/7] Documentation: deltaBaseCacheLimit is per-thread Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 2/7] index-pack: unify threaded and unthreaded code Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 3/7] index-pack: remove redundant parameter Jonathan Tan
2020-02-28  0:04     ` Josh Steadmon
2020-03-10 21:29       ` Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 4/7] index-pack: remove redundant child field Jonathan Tan
2020-02-28  0:04     ` Josh Steadmon
2019-10-17 20:17   ` [PATCH v2 5/7] index-pack: calculate {ref,ofs}_{first,last} early Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 6/7] index-pack: make resolve_delta() assume base data Jonathan Tan
2019-10-17 20:17   ` [PATCH v2 7/7] index-pack: make quantum of work smaller Jonathan Tan
2020-02-28  0:04     ` Josh Steadmon [this message]
2020-03-10 21:42       ` Jonathan Tan
2020-02-28  0:03   ` [PATCH v2 0/7] Better threaded delta resolution in index-pack Josh Steadmon
2020-03-10 21:45     ` Jonathan Tan
  -- strict thread matches above, loose matches on Subject: below --
2020-08-24 19:16 [PATCH 0/7] Better threaded delta resolution in index-pack (another try) Jonathan Tan
2020-09-08 19:48 ` [PATCH v2 " Jonathan Tan
2020-09-08 19:48   ` [PATCH v2 7/7] index-pack: make quantum of work smaller Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200228000459.GE12115@google.com \
    --to=steadmon@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).