git@vger.kernel.org list mirror (unofficial, one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Subject: How to produce a loose ref+size explosion via pruning + git-gc
Date: Thu, 08 Feb 2018 16:37:32 +0100
Message-ID: <87fu6bmr0j.fsf@evledraar.gmail.com>

I'll probably submit docs for this eventually, but the docs in my
--prune-tags series were already hard enough to review. Try running this:

    (
        rm -rf /tmp/git &&
        git clone https://github.com/git/git /tmp/git &&
        cd /tmp/git >/dev/null &&
        du -sh .git &&
        git rev-list --all origin/master.. | wc -l &&
        for clone in gitster peff avar chriscool mhagger pclouds Microsoft
        do
            git remote add $clone https://github.com/$clone/git &&
            git fetch -q $clone
        done &&
        git gc &&
        du -sh .git &&
        git rev-list --all origin/master.. | wc -l &&
        git fetch -q origin --prune 'refs/tags/*:refs/tags/*' &&
        for remote in $(git remote | grep -v origin)
        do
            git remote rm $remote
        done &&
        git gc &&
        du -sh .git &&
        git rev-list --all origin/master.. | wc -l
    )

The output is:

    108M    .git
    2222
    160M    .git
    62220
    1.9G    .git
    2222

I.e. a fresh clone of git.git is 108MB; add a few more repos in its
network that have diverged quite a bit and it's 160MB repacked.

Now remove those remotes and run "git gc" and it's 1.9GB, even though it
diverges from master by the same 2222 commits as the 108MB clone. But
after running:

    git prune --expire=now

It becomes ~108MB again.

Now this is all expected behavior: we've made a bunch of objects
unreferenced, so they all get exploded into loose objects, which takes a
lot of space.
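
The same effect can be reproduced in miniature without touching the
network. This is only a sketch under assumptions: the scratch repository,
the branch names "base" and "side" and the file names are made up for
illustration. It also passes gc.cruftPacks=false, because newer Git
versions can keep unreachable objects in a cruft pack instead of
exploding them into loose objects.

```shell
#!/bin/sh
# Sketch: make objects unreachable, then watch "git gc" explode them
# into loose objects and "git prune --expire=now" delete them.
# The scratch repo, branch names and file names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/base
git config user.name "Example"
git config user.email "example@example.com"

# Number of loose objects as reported by "git count-objects -v".
loose_count () {
    git count-objects -v | awk '/^count:/ { print $2 }'
}

echo base >base.txt
git add base.txt
git commit -q -m base

git checkout -q -b side
echo side >side.txt
git add side.txt
git commit -q -m side
git checkout -q base

# Everything is still referenced, so gc packs it all.
git -c gc.cruftPacks=false gc -q
loose_packed=$(loose_count)

# Drop the only ref (and the reflog entries) pointing at the side commit.
git branch -q -D side
git reflog expire --expire=now --all

# The side commit, tree and blob are now unreachable but too recent for
# the default gc.pruneExpire, so gc unpacks them into loose objects.
git -c gc.cruftPacks=false gc -q
loose_after_gc=$(loose_count)

git prune --expire=now
loose_after_prune=$(loose_count)

echo "loose: packed=$loose_packed after_gc=$loose_after_gc after_prune=$loose_after_prune"
```

The loose count should go up after the second gc and drop back to zero
after the prune, which is the 108MB -> 1.9GB -> 108MB cycle on a tiny
scale.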

It's an interesting caveat when setting fetch.prune=true on checkouts
that didn't previously have it and might have lots of branches to be
pruned.
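
One way to gauge this before turning fetch.prune on is to ask for a dry
run and count what would go away. A minimal sketch, assuming a remote
named "origin"; the "upstream" and "checkout" scratch repositories and
the topic-* branch names are hypothetical stand-ins so that it runs
without network access:

```shell
#!/bin/sh
# Sketch: count the remote-tracking refs a prune would delete before
# enabling fetch.prune. Repository and branch names are hypothetical.
set -e

work=$(mktemp -d)
git init -q "$work/upstream"
(
    cd "$work/upstream"
    git symbolic-ref HEAD refs/heads/base
    git config user.name "Example"
    git config user.email "example@example.com"
    echo base >base.txt
    git add base.txt
    git commit -q -m base
    git branch topic-1
    git branch topic-2
)
git clone -q "$work/upstream" "$work/checkout"

# Delete the branches upstream; the checkout still has origin/topic-*.
git -C "$work/upstream" branch -q -D topic-1 topic-2

cd "$work/checkout"
# --dry-run only reports the stale refs, it does not delete them.
would_prune=$(LC_ALL=C git remote prune --dry-run origin | grep -c 'would prune')
refs_before=$(git for-each-ref 'refs/remotes/origin/topic-*' | wc -l)

# This is what a fetch does once fetch.prune=true is set.
git fetch -q --prune origin
refs_after=$(git for-each-ref 'refs/remotes/origin/topic-*' | wc -l)

echo "would prune: $would_prune, topic refs before: $refs_before, after: $refs_after"
```

A large "would prune" count on a real checkout is a hint that the next
git-gc may leave a lot of loose objects behind.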

For reasons I won't go into I'd had that disabled for a while here at
work, and after re-enabling it we had some repos whose .git is usually
2.5G explode to 30G once git-gc ran.

The workaround is to set gc.pruneExpire low enough that all those
objects get deleted when the gc hits. I set it to 1 day (from the
default of 2 weeks).
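
As a sketch, the setting could look like the following. The exact value
string "1.day.ago" is my assumption of how to spell "1 day", and the
scratch repository is only there so the snippet is runnable on its own;
in a real checkout only the "git config" line is needed.

```shell
#!/bin/sh
# Sketch: lower gc.pruneExpire for one repository. Shown on a scratch
# repository; the value spelling "1.day.ago" is an assumption.
set -e

repo=$(mktemp -d)
git init -q "$repo"

# Unreachable objects older than a day get deleted by the next "git gc"
# instead of being kept around for the default two weeks.
git -C "$repo" config gc.pruneExpire 1.day.ago

prune_expire=$(git -C "$repo" config --get gc.pruneExpire)
echo "gc.pruneExpire is now: $prune_expire"
```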

But it doesn't help with repos where git-gc has already run and exploded
them in size, much to the confusion of users on those systems. Those
need a manual git-prune.
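
For those, something along these lines might do as a cleanup pass. It is
a sketch, not what I actually ran: the prune_if_exploded helper, the
threshold of 3 and the scratch repository are all hypothetical, and the
loose object count is only a rough heuristic since it includes loose
objects that are still referenced.

```shell
#!/bin/sh
# Sketch: manually prune a checkout whose loose object count has blown
# up. The helper name, threshold and scratch repo are hypothetical.
set -e

# prune_if_exploded <repo> <max-loose-objects>
prune_if_exploded () {
    loose=$(git -C "$1" count-objects -v | awk '/^count:/ { print $2 }')
    if [ "$loose" -gt "$2" ]
    then
        echo "$1: $loose loose objects, pruning"
        git -C "$1" prune --expire=now
    else
        echo "$1: $loose loose objects, leaving alone"
    fi
}

# Scratch repository with a few unreferenced loose blobs standing in
# for an exploded checkout.
repo=$(mktemp -d)
git init -q "$repo"
for i in 1 2 3 4 5
do
    echo "unreferenced blob $i" | git -C "$repo" hash-object -w --stdin >/dev/null
done

prune_if_exploded "$repo" 3
remaining=$(git -C "$repo" count-objects -v | awk '/^count:/ { print $2 }')
echo "loose objects left: $remaining"
```

Note that --expire=now removes every unreachable object right away, so
it is best run when nothing else is writing to the repository.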

Potential solutions to this have been discussed ad nauseam here on
list. Let's not go into that (unless someone feels like it).

I mainly wanted to send this for later reference, and have some
searchable record in case someone's confused when they turn on prune and
their repo increases to 10x the previous size.
