From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Janos Farkas <chexum@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>, Eric Wong <e@80x24.org>
Subject: Re: 2.22.0 repack -a duplicating pack contents
Date: Sun, 23 Jun 2019 16:54:50 +0200
Message-ID: <875zow8i85.fsf@evledraar.gmail.com>
In-Reply-To: <CAEXt3sqno7RAtuwQ_OpD3aLkEORLaf6aNeNKGQL0BKezD+wWTw@mail.gmail.com>

On Sun, Jun 23 2019, Janos Farkas wrote:
> I'm using .keep files to... well.. keep packs to avoid some CPU time
> spent on repacking huge packs and make the process somewhat more
> incremental.
>
> Something changed with 2.22.0. Now .bitmap files are also created,
> and now simple repacks re-create the pack data in a completely new
> file, wasting quite some storage:
>
> 02d03::master> find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> 0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> 02d03::master > git repack -af
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (18952/18952), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14837), reused 4162 (delta 0)
> 02d03::master > find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.bitmap
> 524K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.idx
> 4.6M objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.pack <= ????
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> 0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> The ccbd455 pack and its metadata seem quite pointless: judging by
> the size, they apparently contain all the data again.
>
> If I use -ad, a new pack is still created, which, judging by the size,
> is essentially everything again (but at least the extra packs are
> removed):
>
> 02d03::master> git repack -ad
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (4114/4114), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14838), reused 19001 (delta 14838)
> 02d03::master 9060> find objects/pack/pack* -type f|xargs ls -sht
> 116K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.bitmap
> 524K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.idx
> 4.6M objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.pack <= ????
> 0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> Previously, the kept pack would be kept, and no additional packs would
> be created if no new objects were born in the repo.
>
> With the .keep placeholder removed, the duplication does not happen,
> but the whole repo is rewritten into a new pack, which does not look
> correct. Am I doing something unexpected?

I haven't looked at this for more than a couple of minutes (and don't
have more time now), but this is almost certainly due to 36eba0323d
("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
confirm by re-running with repack.writeBitmaps=false in the config?

I.e. something in the "yes I want bitmaps" code seems to change the
"*.keep" semantics from "keep" to "replace", which is obvious in
retrospect, since we can only have one *.bitmap per repo.
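
For anyone who wants to try that check without touching a real repository, here is a minimal sketch. The scratch paths, the file name and the commit identity are made up for illustration. It passes repack.writeBitmaps=false with "git -c" for single invocations rather than writing it to the config. It only prints how many packs exist afterwards; it does not claim what the count should be on any particular git version.

```shell
#!/bin/sh
# Sketch only: builds a throwaway bare repository, marks its single pack
# as kept, and repacks again with bitmaps disabled.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical scratch repository, just to have some objects to pack.
git init -q "$tmp/work"
echo hello > "$tmp/work/file.txt"
git -C "$tmp/work" add file.txt
git -C "$tmp/work" -c user.name=test -c user.email=test@example.com \
    commit -q -m "initial"
git clone -q --bare "$tmp/work" "$tmp/bare.git"
cd "$tmp/bare.git"

# Pack everything once, then create an empty .keep file next to the pack.
git -c repack.writeBitmaps=false repack -a -d -q
for p in objects/pack/pack-*.pack; do
    : > "${p%.pack}.keep"
done

# The run under test: bitmaps disabled for this invocation only.
git -c repack.writeBitmaps=false repack -a -d -q

pack_count=$(ls objects/pack/pack-*.pack | wc -l)
echo "packs after repack: $pack_count"
```

In the affected repository itself, the equivalent would be "git config repack.writeBitmaps false" followed by the same "git repack -ad" as in the report, then comparing the pack listing before and after.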
Thread overview: 20+ messages
2019-06-23 12:15 2.22.0 repack -a duplicating pack contents Janos Farkas
2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason [this message]
2019-06-23 15:38 ` Janos Farkas
2019-06-23 18:02 ` Jeff King
2019-06-23 18:08 ` Eric Wong
2019-06-23 22:42 ` Jeff King
2019-06-24 9:30 ` Ævar Arnfjörð Bjarmason
2019-07-03 17:40 ` Jeff King
2019-06-28 7:02 ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
2019-06-28 7:21 ` Ævar Arnfjörð Bjarmason
2019-06-29 19:16 ` [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files Eric Wong
2019-07-01 18:15 ` Junio C Hamano
2019-07-03 17:38 ` Jeff King
2019-07-03 18:10 ` Junio C Hamano
2019-07-03 18:37 ` Junio C Hamano
2019-07-03 21:24 ` Jeff King
2019-07-03 21:23 ` Jeff King
2019-07-08 17:40 ` Junio C Hamano
2019-06-29 8:03 ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
2019-06-29 19:13 ` [PATCH v2] " Eric Wong