git@vger.kernel.org mailing list mirror (one of many)
From: "W. David Jarvis" <william.d.jarvis@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Git <git@vger.kernel.org>
Subject: Re: Reducing CPU load on git server
Date: Mon, 29 Aug 2016 13:57:56 -0700
Message-ID: <CAFMAO9wQD5GtGRGv-sMy=NA1q8kbu6n3FFbWuJ+W5-qnRDKW-w@mail.gmail.com>
In-Reply-To: <CACBZZX63DAmFt_ZiUHj-bs9dtwRd4MOxoLfM8r1uRi3q4Mwnkw@mail.gmail.com>

>  * Consider having that queue of yours just send the pushed payload
> instead of "pull this", see git-bundle. This can turn this entire sync
> thing into a static file distribution problem.

As far as I know, GHE doesn't support this out of the box. We've asked
them for essentially this, though. Due to the nature of our license we
may not be able to configure something like this on the server
instance ourselves.
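
For reference, the bundle-based approach suggested above could look
roughly like the sketch below. It is only an illustration, not something
GHE provides: the helper names, the bundle path, and the assumption that
replicas are bare repositories are all hypothetical. The git subcommands
used (bundle create, bundle verify, fetch from a bundle file) are
standard.

```python
import subprocess
from typing import List, Optional


def bundle_create_cmd(bundle_path: str, ref: str,
                      old_sha: Optional[str] = None) -> List[str]:
    """Command to pack the commits a push added to `ref` into a bundle.

    `old_sha` is the ref's value before the push; None means the ref is
    new, in which case the whole history of `ref` is bundled.
    """
    rev_range = ref if old_sha is None else f"{old_sha}..{ref}"
    return ["git", "bundle", "create", bundle_path, rev_range]


def bundle_apply_cmds(bundle_path: str, ref: str) -> List[List[str]]:
    """Commands a replica would run after downloading the bundle file.

    Assumes the replica is a bare repository, so updating `ref` in place
    does not collide with a checked-out branch.
    """
    return [
        ["git", "bundle", "verify", bundle_path],
        ["git", "fetch", bundle_path, f"{ref}:{ref}"],
    ]


def run_all(cmds: List[List[str]], repo_dir: str) -> None:
    """Run each command inside `repo_dir`, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

The point of the design is that the primary produces the bundle once per
push, and every replica then downloads the same static file instead of
negotiating its own fetch against the primary.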

>  * It's not clear from your post why you have to worry about all these
> branches, surely your Chef instances just need the "master" branch,
> just push that around.

We allow deployments from non-master branches, so we do need multiple
branches. We also use the replication fleet as the target for our
build system, which needs to be able to build essentially any branch
on any repository.

>  * If you do need branches consider archiving stale tags/branches
> after some time. I implemented this where I work, we just have a
> $REPO-archive.git with every tag/branch ever created for a given
> $REPO.git, and delete refs after a certain time.

This is something else that we're actively considering. Why did your
company implement this -- was it to reduce load, or just to clean up
your repositories? Did you notice any change in server load?
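
To make the idea concrete, here is a minimal sketch of how stale refs
might be picked out for archiving. It is not a description of the setup
quoted above: the function name, the age threshold, and the input format
are assumptions. The input is taken to be the output of
git for-each-ref --format='%(refname) %(committerdate:raw)'.

```python
from typing import Iterable, List


def stale_refs(lines: Iterable[str], now: int, max_age_days: int) -> List[str]:
    """Return refnames whose tip commit is older than `max_age_days`.

    Each line is expected to look like "refs/heads/foo 1472504276 -0700"
    (refname, Unix timestamp, timezone offset). Lines without a
    timestamp, such as annotated tags, which have no committer date of
    their own, are skipped rather than guessed at.
    """
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # no date available for this ref
        refname, timestamp = parts[0], int(parts[1])
        if timestamp < cutoff:
            stale.append(refname)
    return stale
```

Each selected ref could then be pushed to the archive repository and
removed from the live one (for example with git update-ref -d), in that
order, so that nothing is deleted before it has been archived.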

>  * If your problem is that you're CPU bound on the master have you
> considered maybe solving this with something like NFS, i.e. replace
> your ad-hoc replication with just a bunch of "slave" boxes that mount
> the remote filesystem.

This is definitely an interesting idea. It'd be a significant
architectural change, though, and not one I'm sure we'd be able to get
support for.

>  * Or, if you're willing to deal with occasional transitory repo
> corruption (just retry?): rsync.

I think this is a cost we're not necessarily willing to bear.

>  * There's no reason why your replication chain needs to be
> single-level if master CPU is really the issue. You could have master
> -> N slaves -> N*X slaves, or some combination thereof.

This was discussed above - if the primary driver of load is the first
fetch, then moving to a multi-tiered architecture will not solve our
problems.

>  * Does it really even matter that your "slave" machines are all
> up-to-date? We have something similar at work but it's just a minutely
> cronjob that does "git fetch" on some repos, since the downstream
> thing (e.g. the chef run) doesn't run more than once every 30m or
> whatever anyway.

It does, because we use the replication fleet for our build server.

 - V

-- 
============
venanti.us
203.918.2328
============

