git@vger.kernel.org list mirror (unofficial, one of many)
From: Christian Walther <cwalther@gmx.ch>
To: Junio C Hamano <gitster@pobox.com>
Cc: Christian Walther via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] doc: mention bigFileThreshold for packing
Date: Wed, 10 Feb 2021 22:43:14 +0100
Message-ID: <F63929A8-7BC9-43A7-9E7B-118433F62588@gmx.ch>
In-Reply-To: <xmqqim70zzaf.fsf@gitster.c.googlers.com>

Junio C Hamano wrote:

> I doubt that the description of --window/--depth command line
> options, for both repack and pack-objects, is the best place to add
> this "Note".  Even if we were to add it as an appendix to these
> places, please do not break the flow of explanation by inserting it
> before the description of the default values of these options.

OK. That was where I would have looked for it, because it explains why --window wasn't effective in my attempts to get better compression, but I don't insist on it - any place would have worked, as I read both manpages back and forth several times.

In git-repack.txt, there is a "Configuration" section at the bottom; I guess the note would fit there? There is no such section in git-pack-objects.txt, but I could add one. What do you think?


>>    I recently spent a lot of time trying to figure out why git repack would
>>    create huge packs on some clones of my repository and small ones on
>>    others
> 
> Not related to the contents of the patch, but I am somewhat curious
> to know what configuration resulted in the "huge" ones and "small"
> ones.  Documentation/config/core.txt::core.bigFileThreshold may be
> helped by addition of a success story, and the configuration for the
> "small" ones may be a good place to start.

The "huge" repository had bigFileThreshold = 1m. That was set by SubGit when converting from Subversion, for reasons unknown to me (see some discussion at https://support.tmatesoft.com/t/reduce-repository-size/2551 and https://issues.tmatesoft.com/issue/SGT-604). The result is a pack file of about 3 GB.

The "small" repository has it unset, so the default 512m applies, resulting in a pack file of about 50 MB.

What causes the huge difference is that the repository contains a "changelog" file that changes in almost every commit and has grown to 2.4 MB over 10000 commits. So it exists in about that many different versions, of which about 6000 are larger than 1 MB, but they only differ from each other by successive addition of small pieces.
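
To illustrate the effect, here is a rough Python sketch. It is not git's delta algorithm or pack format, only a toy model under stated assumptions: a file grows by one small entry per commit, and we compare compressing every full version on its own (roughly what happens to blobs above bigFileThreshold, which are deflated without delta compression) against compressing only the newly added entry (a crude stand-in for a delta against the previous version). The word list, entry size, and function names are made up for the example.

```python
import random
import zlib

# Hypothetical vocabulary for synthetic changelog entries.
WORDS = ["fix", "add", "update", "remove", "parser",
         "build", "docs", "test", "crash", "option"]


def changelog_versions(n_versions, words_per_entry=20, seed=0):
    """Yield (full_content, new_entry) for a file that grows by one entry per commit."""
    rng = random.Random(seed)
    content = b""
    for _ in range(n_versions):
        line = " ".join(rng.choice(WORDS) for _ in range(words_per_entry))
        entry = (line + "\n").encode()
        content += entry
        yield content, entry


def compare_storage(n_versions=200):
    """Return (whole_bytes, delta_bytes) summed over all versions.

    whole_bytes: every version deflated independently (no deltas).
    delta_bytes: only the appended entry deflated (simplified delta model).
    """
    whole = 0
    delta = 0
    for content, entry in changelog_versions(n_versions):
        whole += len(zlib.compress(content))
        delta += len(zlib.compress(entry))
    return whole, delta


if __name__ == "__main__":
    whole, delta = compare_storage()
    print(f"each version deflated on its own: {whole} bytes")
    print(f"only appended entries deflated:   {delta} bytes")
```

The first total grows roughly with the square of the number of versions, the second only linearly, which is the same shape as the difference between the two packs described above.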

I'm not sure if that makes for a good success story. 1m seems a rather extreme value to me. If you think so, I can try to come up with something.

Thanks

 Christian

