git@vger.kernel.org mailing list mirror (one of many)
* Fwd: Repack memory usage
From: Bryan Turner @ 2020-04-28  5:15 UTC
  To: Git Users

I'm trying to help a user who has run into a situation where their
repository has gotten so "big" (along some dimension of "big"ness) that
they can no longer repack it on their hardware. Every time repack
runs, it consumes so much RAM that the Linux OOM killer kills it
before it can complete.

The repack command being used is:

git repack -Adln --unpack-unreachable=72.hours.ago

A count-objects -v on the repository (without -H, so these sizes are in
KiB) showed this:

count: 346911
size: 18367032
in-pack: 45117481
packs: 27
size-pack: 9462621
prune-packable: 22584
garbage: 0
size-garbage: 0

After the next repack was killed by the OOM killer, the numbers changed to this:

count: 427870
size: 22559712
in-pack: 90554922
packs: 37
size-pack: 18953693
prune-packable: 371285
garbage: 0
size-garbage: 0
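
For scale, the two outputs above can be converted from KiB into GiB and compared field by field. What follows is only a minimal Python sketch: the numbers are copied verbatim from the outputs above, while the helper names (parse_count_objects, kib_to_gib) are hypothetical and exist only for this illustration.

```python
# Values copied from the two count-objects outputs above (sizes are in KiB).
BEFORE = """\
count: 346911
size: 18367032
in-pack: 45117481
packs: 27
size-pack: 9462621
prune-packable: 22584
garbage: 0
size-garbage: 0
"""

AFTER = """\
count: 427870
size: 22559712
in-pack: 90554922
packs: 37
size-pack: 18953693
prune-packable: 371285
garbage: 0
size-garbage: 0
"""


def parse_count_objects(text):
    """Parse 'key: value' lines into a dict of integers (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            stats[key.strip()] = int(value)
    return stats


def kib_to_gib(kib):
    """Convert KiB to GiB (1 GiB = 1024 * 1024 KiB)."""
    return kib / (1024 * 1024)


before = parse_count_objects(BEFORE)
after = parse_count_objects(AFTER)

# Size fields, shown in GiB.
for key in ("size", "size-pack"):
    print(f"{key}: {kib_to_gib(before[key]):.2f} GiB -> {kib_to_gib(after[key]):.2f} GiB")

# Count fields, shown with their change between the two runs.
for key in ("count", "in-pack", "packs", "prune-packable"):
    print(f"{key}: {before[key]} -> {after[key]} ({after[key] - before[key]:+d})")
```

By this arithmetic, the packed data grew from roughly 9 GiB to roughly 18 GiB between the two measurements, and the number of packs grew by 10.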

The machine in question has 12 cores and 32GB of RAM. I've tried
setting pack.threads=1, but still hit the OOM killer. Bumping the RAM
for the machine to 64GB, paired with limiting repacking to a single
thread, finally produced a successful repack. (But it appears to be
unreliable, with some runs still failing. By the time the successful
repack completed the git process was showing over 60GB of RAM in
use--though that number is likely misleading.) No value is set for
pack.windowMemory, but my local testing (on a different
repository, I should note) didn't show much change to repack's overall
memory usage even with it set. Repacking a 500MB repository with
pack.windowMemory set to "100m" (and pack.threads set to 1) still
allocated far more than 100MB of RAM before the process completed.
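
One way to make such a test repeatable is to pass the two settings as one-off overrides with git's -c option and record the peak resident memory of the child processes afterwards. This is a hedged sketch and not a fix: build_repack_command and run_and_report_peak_rss are hypothetical names, the values 1 and "100m" are simply the ones mentioned above, and the ru_maxrss figure is reported in KiB on Linux and reflects the largest waited-for child process, which is assumed here to be pack-objects. Note that git's documentation describes pack.windowMemory as a per-thread cap on delta window memory, not a cap on the whole process.

```python
import resource
import subprocess


def build_repack_command(threads=1, window_memory="100m"):
    """Return the argv for the repack from this message, with the two
    pack.* settings applied as per-invocation overrides via `git -c`."""
    return [
        "git",
        "-c", f"pack.threads={threads}",
        "-c", f"pack.windowMemory={window_memory}",
        "repack", "-Adln",
        "--unpack-unreachable=72.hours.ago",
    ]


def run_and_report_peak_rss(argv, cwd=None):
    """Run argv in cwd and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for waited-for children: KiB on Linux,
    and the value of the largest single child, not a sum.
    """
    proc = subprocess.run(argv, cwd=cwd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak


# Print the command only; running it is left to the reader, e.g.
#   code, peak_kib = run_and_report_peak_rss(build_repack_command(), cwd="/path/to/repo")
print(" ".join(build_repack_command()))
```

The path in the usage comment is a placeholder. Comparing the reported peak across a few values of pack.windowMemory would show whether that setting has any measurable effect on a given repository.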

Unfortunately, I don't really know too much about the contents of the
repository. I _do_ know that it contains binaries, some of which may
be large, but that's all I can find out; the user is unable to share
the actual repository with me.

Does anyone have any suggestions on how I can constrain repack's
memory usage, so this user can get a successful repack without
requiring all the memory in the system to run (and bringing the system
to a complete standstill for all other processing while it’s running)?

Best regards,
Bryan Turner
