git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Janos Farkas <chexum@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>, Eric Wong <e@80x24.org>
Subject: Re: 2.22.0 repack -a duplicating pack contents
Date: Sun, 23 Jun 2019 16:54:50 +0200
Message-ID: <875zow8i85.fsf@evledraar.gmail.com>
In-Reply-To: <CAEXt3sqno7RAtuwQ_OpD3aLkEORLaf6aNeNKGQL0BKezD+wWTw@mail.gmail.com>


On Sun, Jun 23 2019, Janos Farkas wrote:

> I'm using .keep files to... well.. keep packs to avoid some CPU time
> spent on repacking huge packs and make the process somewhat more
> incremental.
>
> Something changed with 2.22.0.  Now .bitmap files are also created,
> and now simple repacks re-create the pack data in a completely new
> file, wasting quite some storage:
>
> 02d03::master> find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> 02d03::master > git repack -af
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (18952/18952), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14837), reused 4162 (delta 0)
> 02d03::master > find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.bitmap
> 524K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.idx
> 4.6M objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.pack   <= ????
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> The ccbd455 pack and its metadata seem quite pointless: judging by
> the size, the pack apparently contains all the data again.
>
> If I use -ad, a new pack is still created, which, judging by the size,
> is essentially everything again (but at least the extra packs are
> removed):
>
> 02d03::master> git repack -ad
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (4114/4114), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14838), reused 19001 (delta 14838)
> 02d03::master 9060> find objects/pack/pack* -type f|xargs ls -sht
> 116K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.bitmap
> 524K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.idx
> 4.6M objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.pack   <= ????
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> Previously, the kept pack would be kept, and no additional packs would
> be created if no new objects were born in the repo.
>
> With the .keep placeholder removed, the duplication does not happen,
> but the whole repo is rewritten into a new pack, which does not look
> correct.  Am I doing something unexpected?

I haven't looked at this for more than a couple of minutes (and don't
have more time now), but this is almost certainly due to 36eba0323d
("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
confirm by re-running with repack.writeBitmaps=false in the config?
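
A minimal sketch of that re-run, assuming the repository is bare and its path is passed in as an argument. The helper names repack_without_bitmaps and list_packs are hypothetical and only used for illustration; the git options themselves (-C, -c, repack -ad) and the repack.writeBitmaps setting are the ones named in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): repack one repository with
# bitmap writing switched off for this single invocation only.
repack_without_bitmaps() {
    repo=$1   # path to the bare repository (its git directory)
    git -C "$repo" -c repack.writeBitmaps=false repack -ad
}

# Hypothetical helper: list the pack files with sizes, newest first,
# for comparison with the listings quoted above.
list_packs() {
    find "$1/objects/pack" -type f -name 'pack-*' -exec ls -sht {} +
}
```

Calling `repack_without_bitmaps . && list_packs .` from inside the repository should show whether the extra pack still appears. To make the setting persistent instead of per-invocation, `git config repack.writeBitmaps false` would be the usual route.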

I.e. something in the "yes I want bitmaps" code implies "*.keep"
semantics changing from "keep" to "replace", which is obvious in
retrospect, since we can only have one *.bitmap per-repo.
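
To check whether a given repository is in the situation described here (kept packs sitting next to bitmap files), something like the following sketch may help. It only inspects file names under objects/pack; the function name has_keep_and_bitmap is hypothetical, and the argument is assumed to be the git directory (the repository root for a bare repo).

```shell
#!/bin/sh
# Hypothetical helper (not part of git): succeeds when the pack
# directory holds at least one pack-*.keep file and at least one
# pack-*.bitmap file, and fails otherwise.
has_keep_and_bitmap() {
    pack_dir=$1/objects/pack
    keep=$(find "$pack_dir" -type f -name 'pack-*.keep' | head -n 1)
    bitmap=$(find "$pack_dir" -type f -name 'pack-*.bitmap' | head -n 1)
    [ -n "$keep" ] && [ -n "$bitmap" ]
}
```

For example, `has_keep_and_bitmap . && echo "kept packs and bitmaps both present"` run from inside a bare repository. This says nothing about which pack a bitmap belongs to; it is only a quick way to spot the combination discussed above.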


Thread overview: 20+ messages
2019-06-23 12:15 2.22.0 repack -a duplicating pack contents Janos Farkas
2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason [this message]
2019-06-23 15:38   ` Janos Farkas
2019-06-23 18:02   ` Jeff King
2019-06-23 18:08     ` Eric Wong
2019-06-23 22:42       ` Jeff King
2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
2019-07-03 17:40           ` Jeff King
2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
2019-06-29 19:16             ` [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files Eric Wong
2019-07-01 18:15               ` Junio C Hamano
2019-07-03 17:38                 ` Jeff King
2019-07-03 18:10                   ` Junio C Hamano
2019-07-03 18:37                     ` Junio C Hamano
2019-07-03 21:24                       ` Jeff King
2019-07-03 21:23                     ` Jeff King
2019-07-08 17:40                       ` Junio C Hamano
2019-06-29  8:03           ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
2019-06-29 19:13             ` [PATCH v2] " Eric Wong
