git@vger.kernel.org list mirror (unofficial, one of many)
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Eric Wong <e@80x24.org>, Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Vicent Marti <tanoku@gmail.com>
Subject: Re: [PATCH/RFC] gitperformance: add new documentation about git performance tuning
Date: Mon, 3 Apr 2017 22:19:21 -0400
Message-ID: <20170404021921.cz6gz3lpanr2rwqv@sigill.intra.peff.net>
In-Reply-To: <CACBZZX7R5svNJ+Ak3LFh8+kY48i6V7Yo6JDS+PSDJCkZ5vHb6w@mail.gmail.com>

On Mon, Apr 03, 2017 at 11:57:51PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> +These features can be enabled on git servers, they won't help the
> >> +performance of the servers themselves,
> >
> > Is that true for bitmaps?  I thought they reduced CPU usage on
> > the server side...
> 
> I'm not sure, JK? From my reading of the repack.writeBitmaps docs it
> seems to only help clone/fetch for the client, but maybe they do more
> than that.

Bitmaps reduce the CPU required to do the "Counting" phase of
pack-objects. When serving a fetch or clone, the server side is happy
because it uses less CPU, and the client is happy because the server
moves to the "Writing" phase more quickly.
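
To make the "Counting" point concrete, here is a toy model in Python.
It is not git's code or on-disk format; the object graph, the object
numbering, and every function name are made up for illustration. It
compares walking the graph to decide what to send against combining
precomputed per-commit reachability bitmaps with OR and AND-NOT.

```python
# Toy model: objects are numbered 0..N-1 and a "bitmap" is a Python int
# whose bit i is set when object i is reachable. All names are hypothetical.

# Hypothetical object graph: each object lists the objects it points to.
GRAPH = {
    0: [],       # blob
    1: [0],      # tree
    2: [1],      # commit A (root)
    3: [],       # blob
    4: [0, 3],   # tree
    5: [4, 2],   # commit B, whose parent is A
}


def walk(tips):
    """Plain graph walk: visit every object reachable from the tips."""
    seen = set()
    stack = list(tips)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(GRAPH[obj])
    return seen


def build_bitmap(tip):
    """Paid once, at repack time: one bit per object reachable from tip."""
    bits = 0
    for obj in walk([tip]):
        bits |= 1 << obj
    return bits


# Bitmaps stored for the two commits in this toy repository.
BITMAPS = {tip: build_bitmap(tip) for tip in (2, 5)}


def count_by_walk(wants, haves):
    """What must be sent, found by traversing the graph twice."""
    return walk(wants) - walk(haves)


def count_by_bitmap(wants, haves):
    """Same answer from stored bitmaps: OR the wants, AND-NOT the haves."""
    want_bits = 0
    for w in wants:
        want_bits |= BITMAPS[w]
    have_bits = 0
    for h in haves:
        have_bits |= BITMAPS[h]
    to_send = want_bits & ~have_bits
    return {i for i in range(len(GRAPH)) if (to_send >> i) & 1}
```

In this model a clone is `count_by_bitmap([5], [])` and a fetch by a
client that already has commit A is `count_by_bitmap([5], [2])`. Both
agree with the graph walk, but neither touches `GRAPH` at request time.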

Bitmaps also help with pushes, but this is usually less interesting. You
don't tend to push all of history over and over (whereas people _do_
tend to clone all of history over and over).

They don't speed up the counting portion of a regular repack. In theory
they could, but the resulting packs may grow less optimal over time
(e.g., we can't compute the same history-based write order, so over time
your objects would get jumbled, leading to worse cold-cache behavior).

You can also use bitmaps for other reachability computations, but we
don't do so currently. I have patches that I need to clean up to use
them for "git prune", doing ahead/behind checks, --contains, etc.
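
As a rough illustration of why bitmaps suit something like an
ahead/behind check, here is a toy Python sketch. It is not the patches
mentioned above and not git's data format; it simply assumes each tip
already has a bitmap of the commits reachable from it (as a Python int),
and the function name and example values are hypothetical.

```python
def ahead_behind(a_bits, b_bits):
    """Count commits reachable from only one side.

    Bit i set in a bitmap means commit i is reachable from that tip.
    """
    ahead = bin(a_bits & ~b_bits).count("1")   # in a, not in b
    behind = bin(b_bits & ~a_bits).count("1")  # in b, not in a
    return ahead, behind


# Hypothetical history: commits 0..2 are shared, the branch adds
# commits 3 and 4, and upstream adds commit 5.
branch = 0b011111
upstream = 0b100111
print(ahead_behind(branch, upstream))  # (2, 1)
```

The point of the sketch is only that each answer is two bitwise
operations and a popcount rather than a history traversal.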

> I also see we should mention pack.writeBitmapHashCache, which
> according to my reading of v2.0.0-rc0~13^2~8 only helps clone/fetch.

Yes, it helps the delta search heuristic, so only pack-objects would
ever benefit. This should basically be turned on all the time, as
without it fetches from partially-bitmapped repos (i.e., when you've
gotten some pushes but haven't repacked yet) do a really bad job of
finding deltas (they waste too much time and deliver sub-optimal packs).

Arguably it should be the default. The initial patches made it optional
for strict JGit compatibility (I don't know if JGit ever implemented the
extension). We've had it on at GitHub since day one, so I don't have any
operational experience with turning it off (aside from the simulated
numbers in that commit message).

> > A sidenote: I wonder if bitmaps should be the default for bare
> > repos, since bare repos are likely used on servers.

That's an interesting notion. It's a net loss if you don't serve a lot
of fetches, because bitmaps only pay off by amortizing the extra CPU
spent during the repack across faster fetches and clones. So it makes
sense for a hosting site, but less for somebody pushing to a personal
bare repo.

-Peff
