git@vger.kernel.org mailing list mirror (one of many)
From: Richard Hipp <drh@sqlite.org>
To: git@vger.kernel.org
Subject: git-fast-import yields huge packfile
Date: Sat, 16 Mar 2019 16:31:04 -0400
Message-ID: <CALwJ=MzrqPUNw=jc0NRtaJaJG+ErXNb577JNSN66GiGY4UFtRw@mail.gmail.com>

I'm trying to transform a repository from another VCS into a Git
repository using "git fast-import".  It appears to work, but the
resulting Git repository is huge relative to the original - 18 times
larger. Most of the space seems to be taken up by a single large
packfile.  That packfile is about 967 MB, which is about one quarter of
the total uncompressed size of all 41785 distinct blobs in the original
repository.  By comparison, the source VCS is able to compress this
down to 52 MB.
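
The figures above can be cross-checked with a few lines of arithmetic.
This is only an illustrative sketch: the variable names are
hypothetical, the sizes are the rounded values quoted in this message,
and the "one quarter" ratio is treated as exact even though it is only
approximate.

```python
# Rounded figures quoted in the message (MB = megabytes).
packfile_mb = 967        # size of the single large packfile
source_repo_mb = 52      # size of the repository in the source VCS
blob_count = 41785       # distinct blobs in the original repository

# The packfile is described as about a quarter of the uncompressed total.
# Assumption: the ratio is exactly 1/4, so this is only an estimate.
estimated_uncompressed_mb = packfile_mb * 4

# Packfile size relative to the source repository.
pack_to_source_ratio = packfile_mb / source_repo_mb

# Average uncompressed blob size in kilobytes (taking 1 MB = 1000 KB).
average_blob_kb = estimated_uncompressed_mb * 1000 / blob_count

print(f"estimated uncompressed total: {estimated_uncompressed_mb} MB")
print(f"packfile / source repository: {pack_to_source_ratio:.1f}x")
print(f"average blob size: {average_blob_kb:.0f} KB")
```

Under these assumptions the packfile alone is a little over 18 times
the size of the source repository, which matches the overall ratio
reported above.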

Maybe I'm doing something wrong with the fast-import stream that is
defeating Git's attempts at delta compression....

Are there any utility programs available for analyzing packfiles, so
that I can figure out where the inefficiencies are cropping up and
try to address them?
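
One possible starting point is the verbose output of "git verify-pack
-v", which lists every object in a pack along with its size in the
pack and, for deltified objects, the delta depth and base object.  The
following is a rough sketch, not a definitive tool: the helper names
summarize_verify_pack and run_verify_pack are hypothetical, and the
parsing assumes the documented line layout (five fields for an object
stored in full, seven fields for a deltified object).

```python
import subprocess
from collections import Counter

OBJECT_TYPES = ("commit", "tree", "blob", "tag")


def summarize_verify_pack(lines):
    """Count full vs. deltified objects in `git verify-pack -v` output."""
    stats = Counter()
    for line in lines:
        fields = line.split()
        # Object lines have 5 fields (stored in full) or 7 fields
        # (deltified: depth and base object name are appended).
        # Trailing summary lines do not match and are skipped.
        if len(fields) not in (5, 7) or fields[1] not in OBJECT_TYPES:
            continue
        kind = "delta" if len(fields) == 7 else "full"
        stats[f"{kind}_count"] += 1
        stats[f"{kind}_bytes"] += int(fields[3])  # size in the packfile
    return stats


def run_verify_pack(idx_path):
    """Run git on a pack index (.idx) file and return its output lines."""
    result = subprocess.run(
        ["git", "verify-pack", "-v", idx_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

It could be applied to an index file under .git/objects/pack, for
example summarize_verify_pack(run_verify_pack(path_to_idx)).  A pack in
which nearly all blobs are counted as "full" would be consistent with
delta compression not taking effect, although it would not by itself
explain why.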

Anybody have any suggestions on what I should be looking for?

If anyone would care to see this oversized packfile and perhaps offer
suggestions on how I can make it more space-efficient, it can be
cloned from https://github.com/drhsqlite/fossil-mirror.git - at least
for now - surely I will delete that repo and regenerate it once I
figure out this problem.

-- 
D. Richard Hipp
drh@sqlite.org

Thread overview: 6+ messages
2019-03-16 20:31 Richard Hipp [this message]
2019-03-16 21:04 ` git-fast-import yields huge packfile Linus Torvalds
2019-03-16 22:12   ` Mike Hommey
2019-03-16 23:22   ` Richard Hipp
2019-03-21 14:09 ` Johannes Schindelin
2019-03-21 14:23   ` Ævar Arnfjörð Bjarmason
