git@vger.kernel.org mailing list mirror (one of many)
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: Repacking a repository uses up all available disk space
Date: Sun, 12 Jun 2016 17:54:36 -0400
Message-ID: <20160612215436.GB4584@gmail.com>
In-Reply-To: <20160612213804.GA5428@sigill.intra.peff.net>


On Sun, Jun 12, 2016 at 05:38:04PM -0400, Jeff King wrote:
> > - When attempting to repack, creates millions of files and eventually
> >   eats up all available disk space
> 
> That means these objects fall into the unreachable category. Git will
> prune unreachable loose objects after a grace period based on the
> filesystem mtime of the objects; the default is 2 weeks.
> 
> For unreachable packed objects, their mtime is jumbled in with the rest
> of the objects in the packfile.  So Git's strategy is to "eject" such
> objects from the packfiles into individual loose objects, and let them
> "age out" of the grace period individually.
> 
> Generally this works just fine, but there are corner cases where you
> might have a very large number of such objects, and the loose storage is
> much more expensive than the packed (e.g., because each object is stored
> individually, not as a delta).
> 
> It sounds like this is the case you're running into.
> 
> The solution is to lower the grace period time, with something like:
> 
>   git gc --prune=5.minutes.ago
> 
> or even:
> 
>   git gc --prune=now
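
To see whether a repository is in the situation described above (many objects stored loose rather than packed), the output of `git count-objects -v` can be condensed. This is only a minimal sketch: the helper name `summarize_counts` is made up for illustration, and it assumes the usual `count:`, `size:`, `in-pack:` and `size-pack:` fields, with sizes reported in KiB.

```shell
# Hypothetical helper (not part of git): reads `git count-objects -v`
# output on stdin and prints loose vs. packed object totals.
summarize_counts() {
    awk -F': ' '
        $1 == "count"     { loose = $2 }       # number of loose objects
        $1 == "size"      { loose_kib = $2 }   # disk used by loose objects
        $1 == "in-pack"   { packed = $2 }      # number of packed objects
        $1 == "size-pack" { packed_kib = $2 }  # disk used by packs
        END {
            printf "loose: %d objects, %d KiB\n", loose, loose_kib
            printf "packed: %d objects, %d KiB\n", packed, packed_kib
        }'
}

# Usage, from inside a repository:
#   git count-objects -v | summarize_counts
```

Running it before and after a repack should show whether the loose count grows sharply.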

You are correct, this solves the problem. However, I'm curious. The usual
maintenance for these repositories is a regular run of:

- git fsck --full
- git repack -Adl -b --pack-kept-objects
- git pack-refs --all
- git prune

It's split into repack + prune instead of just gc because we use
alternates to save on disk space, and we try not to prune repos that
are used as alternates by other repos in order to avoid potential
corruption.
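
For reference, the routine above could be sketched as a small shell function. This is an illustration under assumptions, not the actual script in use: the function names, the `DRY_RUN` switch, and the `is-alternate` marker file are all hypothetical. Git itself does not record which repositories borrow objects from a given one, so some external bookkeeping like the marker is assumed here.

```shell
# Run one git command in the given repository, or only print it
# when DRY_RUN=1 (hypothetical switch, handy for checking the sequence).
run_git() {
    _dir=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "git -C $_dir $*"
    else
        git -C "$_dir" "$@"
    fi
}

# Maintenance steps as listed in this message, stopping at the first failure.
maintain_repo() {
    repo=$1
    run_git "$repo" fsck --full || return 1
    run_git "$repo" repack -Adl -b --pack-kept-objects || return 1
    run_git "$repo" pack-refs --all || return 1
    # ASSUMPTION: a marker file named "is-alternate" flags repositories
    # that other repositories use as an alternate; those are never pruned.
    if [ -e "$repo/is-alternate" ]; then
        echo "skipping prune: $repo is marked as an alternate" >&2
    else
        run_git "$repo" prune || return 1
    fi
}
```

With `DRY_RUN=1` set, `maintain_repo /path/to/repo.git` only prints the commands it would run, which makes it easy to confirm that prune is skipped for marked repositories.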

Is there something I'm not doing that needs to be done in order to
avoid the same problem?

Thanks for your help.

Regards,
-- 
Konstantin Ryabitsev
Linux Foundation Collab Projects
Montréal, Québec

Thread overview: 11+ messages
2016-06-12 21:25 Repacking a repository uses up all available disk space Konstantin Ryabitsev
2016-06-12 21:38 ` Jeff King
2016-06-12 21:54   ` Konstantin Ryabitsev [this message]
2016-06-12 22:13     ` Jeff King
2016-06-13  0:24       ` Duy Nguyen
2016-06-13  4:58         ` Jeff King
2016-06-13  1:43       ` Nasser Grainawi
2016-06-13  4:33         ` [PATCH 0/3] repack --keep-unreachable Jeff King
2016-06-13  4:33           ` [PATCH 1/3] repack: document --unpack-unreachable option Jeff King
2016-06-13  4:36           ` [PATCH 2/3] repack: add --keep-unreachable option Jeff King
2016-06-13  4:38           ` [PATCH 3/3] repack: extend --keep-unreachable to loose objects Jeff King
