git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [PATCH 3/3] index-pack: adjust default threading cap
Date: Fri, 21 Aug 2020 14:41:52 -0400
Message-ID: <20200821184152.GA3263614@coredump.intra.peff.net>
In-Reply-To: <CAPig+cRQG6EN7Zq_fYMQOM7y9a6rgwWORZhN=px21-7RorWNdg@mail.gmail.com>

On Fri, Aug 21, 2020 at 02:08:55PM -0400, Eric Sunshine wrote:

> On Fri, Aug 21, 2020 at 1:58 PM Jeff King <peff@peff.net> wrote:
> > So what's a good default value? It's clear that the current cap of 3 is
> > too low; our default values are 42% and 57% slower than the best times
> > on each machine. The results on the 40-core machine imply that 20
> > threads is an actual barrier regardless of the number of cores, so we'll
> > take that as a maximum. We get the best results on these machines at
> > half of the online-cpus value. That's presumably a result of the
> > hyperthreading. That's common on multi-core Intel processors, but not
> > necessarily elsewhere. But if we take it as an assumption, we can
> > perform optimally on hyperthreaded machines and still do much better
> > than the status quo on other machines, as long as we never halve below
> > the current value of 3.
> 
> I'm not familiar with the index-pack machinery, so this response may
> be silly, but the first question which came to my mind was whether or
> not SSD vs. spinning-platter disk impacts these results, and which of
> the two you were using for the tests (which I don't think was
> mentioned in any of the commit messages). So, basically, I'm wondering
> about the implication of this change for those of us still stuck with
> old spinning-platter disks.
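
For reference, the default described in the quoted commit message (half
of the online-cpus value, never halving below the old cap of 3, and
never going above 20) could be sketched roughly as follows. This is only
an illustration in Python, not the code from the patch; the function
name and constant names are made up for the example.

```python
def default_index_pack_threads(online_cpus: int) -> int:
    """Hypothetical helper: pick a default thread count from the CPU count.

    Illustrative only; it follows the prose description of the heuristic,
    not the actual index-pack implementation.
    """
    historic_cap = 3   # the "current value of 3" from the quoted text
    hard_cap = 20      # the 20-thread barrier seen on the 40-core machine

    # With very few CPUs there is nothing to halve: use what we have.
    if online_cpus <= historic_cap:
        return max(1, online_cpus)

    # Otherwise take half of the CPUs (assuming hyperthreading), but
    # never drop below the historic cap or rise above the hard cap.
    return min(hard_cap, max(historic_cap, online_cpus // 2))


if __name__ == "__main__":
    for cpus in (1, 2, 4, 8, 16, 40, 64):
        print(cpus, default_index_pack_threads(cpus))
```

The floor matters on small machines: plain halving would give a 4-core
box only 2 threads, which is fewer than it gets today.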

They were both SSD machines, but it wouldn't matter for these tests
because they easily fit the whole pack into memory anyway.

But in the general case, I don't think disk performance would be
relevant. Delta resolution is very CPU-bound, because it's
decompressing data and then computing its SHA-1. So linux.git, for
instance, is looking at ~1.3GB on disk that expands to 87.5GB of data
to run through SHA-1.
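
To put those two figures in proportion, here is a small
back-of-the-envelope sketch in Python. The sizes are the ones quoted
above; the SHA-1 throughput is purely an assumed placeholder (not a
measurement), so only the ratio should be taken seriously.

```python
# Sizes quoted above for linux.git.
ON_DISK_GB = 1.3      # approximate packed size on disk
EXPANDED_GB = 87.5    # bytes that must be inflated and hashed

# ASSUMPTION: a made-up single-core SHA-1 rate, for illustration only.
ASSUMED_SHA1_GB_PER_SEC = 0.5


def expansion_ratio(on_disk_gb: float, expanded_gb: float) -> float:
    """How many bytes get hashed per byte read from disk."""
    return expanded_gb / on_disk_gb


def hashing_seconds(expanded_gb: float, gb_per_sec: float, threads: int = 1) -> float:
    """Idealized hashing time, assuming perfect scaling across threads."""
    return expanded_gb / (gb_per_sec * threads)


if __name__ == "__main__":
    ratio = expansion_ratio(ON_DISK_GB, EXPANDED_GB)
    print(f"bytes hashed per byte on disk: {ratio:.1f}")
    for threads in (1, 3, 20):
        secs = hashing_seconds(EXPANDED_GB, ASSUMED_SHA1_GB_PER_SEC, threads)
        print(f"{threads:2d} threads: ~{secs:.0f}s of hashing (idealized)")
```

Each byte read from disk turns into roughly 67 bytes of hashing work,
which is why the CPU rather than the disk is the likely bottleneck.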

And it would be pretty unlikely to hit the disk anyway, as the thing we
primarily index is incoming packs which we've literally just written. So
I'd expect them to be in cache.

Of course, if you can get different numbers from p5302, I'd be curious
to hear them. :)

A more plausible downside might be that memory usage would increase as
we operate on multiple deltas at once. But pack-objects is already much
more hungry here, as it runs online_cpus() delta-compression threads
simultaneously, each of which may have up to window_size entries in
memory at once.
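
A rough way to compare the two, as a Python sketch. It assumes
pack-objects uses the default window of 10 (the pack.window default) and
that index-pack holds about one delta per thread in flight; the second
point is a simplification for illustration, not something measured here,
and the function names are hypothetical.

```python
def pack_objects_entries_in_flight(online_cpus: int, window_size: int = 10) -> int:
    """Upper bound on window entries held by pack-objects at once:
    one window of up to window_size entries per delta-compression thread."""
    return online_cpus * window_size


def index_pack_deltas_in_flight(threads: int) -> int:
    """ASSUMPTION: roughly one delta being resolved per index-pack thread."""
    return threads


if __name__ == "__main__":
    cpus = 40
    threads = 20  # the proposed hard cap on a 40-CPU machine
    print("pack-objects entries:", pack_objects_entries_in_flight(cpus))
    print("index-pack deltas:   ", index_pack_deltas_in_flight(threads))
```

These are counts of objects, not bytes, so they say nothing about actual
memory use for a given repository.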

-Peff


Thread overview: 13+ messages
2020-08-21 17:51 [PATCH 0/3] index-pack threading defaults Jeff King
2020-08-21 17:53 ` [PATCH 1/3] p5302: disable thread-count parameter tests by default Jeff King
2020-08-21 17:54 ` [PATCH 2/3] p5302: count up to online-cpus for thread tests Jeff King
2020-08-21 17:58   ` Jeff King
2020-08-21 17:58 ` [PATCH 3/3] index-pack: adjust default threading cap Jeff King
2020-08-21 18:08   ` Eric Sunshine
2020-08-21 18:41     ` Jeff King [this message]
2020-08-22  1:16   ` brian m. carlson
2020-08-24 17:37     ` Jeff King
2020-08-24 17:55       ` Eric Sunshine
2020-08-21 18:44 ` [PATCH 0/3] index-pack threading defaults Jeff King
2020-08-21 18:59   ` Junio C Hamano
2020-08-21 19:14     ` Jeff King
