git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* 2.22.0 repack -a duplicating pack contents
@ 2019-06-23 12:15 Janos Farkas
  2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 20+ messages in thread
From: Janos Farkas @ 2019-06-23 12:15 UTC (permalink / raw)
  To: git

I'm using .keep files to... well.. keep packs to avoid some CPU time
spent on repacking huge packs and make the process somewhat more
incremental.

Something changed with 22.2.0.  Now .bitmap files are also created,
and no simple repacks re-create the pack data in a completely new
file, wasting quite some storage:

02d03::master> find objects/pack/pack* -type f|xargs ls -sht
108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
   0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
02d03::master > git repack -af
Enumerating objects: 19001, done.
Counting objects: 100% (19001/19001), done.
Delta compression using up to 2 threads
Compressing objects: 100% (18952/18952), done.
Writing objects: 100% (19001/19001), done.
warning: ignoring extra bitmap file:
./objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
warning: ignoring extra bitmap file:
./objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
warning: ignoring extra bitmap file:
./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
Reusing bitmaps: 104, done.
Selecting bitmap commits: 2550, done.
Building bitmaps: 100% (130/130), done.
Total 19001 (delta 14837), reused 4162 (delta 0)
02d03::master > find objects/pack/pack* -type f|xargs ls -sht
108K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.bitmap
524K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.idx
4.6M objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.pack   <= ????
108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
   0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack

The ccbd455 pack and its metadata seem quite pointless to be
containing apparently all the data based on the size.

If I use -ad, a new pack is still created,which, judging by the size,
is essentially everything again, (but at least the extra packs are
removed)

02d03::master> git repack -ad
Enumerating objects: 19001, done.
Counting objects: 100% (19001/19001), done.
Delta compression using up to 2 threads
Compressing objects: 100% (4114/4114), done.
Writing objects: 100% (19001/19001), done.
warning: ignoring extra bitmap file:
./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
Reusing bitmaps: 104, done.
Selecting bitmap commits: 2550, done.
Building bitmaps: 100% (130/130), done.
Total 19001 (delta 14838), reused 19001 (delta 14838)
02d03::master 9060> find objects/pack/pack* -type f|xargs ls -sht
116K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.bitmap
524K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.idx
4.6M objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.pack   <= ????
   0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack

Previously, the kept pack would be kept, and no additional packs would
be created if no new objects were born in the repro.

With the .keep placeholder removed, the duplication does not happen,
but all the repro is rewritten into a new pack, which does not look
correct.  Am I doing something unexpected?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 12:15 2.22.0 repack -a duplicating pack contents Janos Farkas
@ 2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason
  2019-06-23 15:38   ` Janos Farkas
  2019-06-23 18:02   ` Jeff King
  0 siblings, 2 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-06-23 14:54 UTC (permalink / raw)
  To: Janos Farkas; +Cc: git, Jeff King, Eric Wong


On Sun, Jun 23 2019, Janos Farkas wrote:

> I'm using .keep files to... well.. keep packs to avoid some CPU time
> spent on repacking huge packs and make the process somewhat more
> incremental.
>
> Something changed with 22.2.0.  Now .bitmap files are also created,
> and no simple repacks re-create the pack data in a completely new
> file, wasting quite some storage:
>
> 02d03::master> find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> 02d03::master > git repack -af
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (18952/18952), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14837), reused 4162 (delta 0)
> 02d03::master > find objects/pack/pack* -type f|xargs ls -sht
> 108K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.bitmap
> 524K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.idx
> 4.6M objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.pack   <= ????
> 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> The ccbd455 pack and its metadata seem quite pointless to be
> containing apparently all the data based on the size.
>
> If I use -ad, a new pack is still created,which, judging by the size,
> is essentially everything again, (but at least the extra packs are
> removed)
>
> 02d03::master> git repack -ad
> Enumerating objects: 19001, done.
> Counting objects: 100% (19001/19001), done.
> Delta compression using up to 2 threads
> Compressing objects: 100% (4114/4114), done.
> Writing objects: 100% (19001/19001), done.
> warning: ignoring extra bitmap file:
> ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> Reusing bitmaps: 104, done.
> Selecting bitmap commits: 2550, done.
> Building bitmaps: 100% (130/130), done.
> Total 19001 (delta 14838), reused 19001 (delta 14838)
> 02d03::master 9060> find objects/pack/pack* -type f|xargs ls -sht
> 116K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.bitmap
> 524K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.idx
> 4.6M objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.pack   <= ????
>    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
>
> Previously, the kept pack would be kept, and no additional packs would
> be created if no new objects were born in the repro.
>
> With the .keep placeholder removed, the duplication does not happen,
> but all the repro is rewritten into a new pack, which does not look
> correct.  Am I doing something unexpected?

I haven't looked at this for more than a couple of minutes (and don't
have more time now), but this is almost certainly due to 36eba0323d
("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
confirm when you re-run with repack.writeBitmaps=false in the config?

I.e. something in the "yes I want bitmaps" code implies "*.keep"
semantics changing from "keep" to "replace", which is obvious in
retrospect, since we can only have one *.bitmap per-repo.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason
@ 2019-06-23 15:38   ` Janos Farkas
  2019-06-23 18:02   ` Jeff King
  1 sibling, 0 replies; 20+ messages in thread
From: Janos Farkas @ 2019-06-23 15:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King, Eric Wong

Thanks for looking at this, yes, it seems it's purely the .bitmap
writing.  With disabled repack.writeBitmaps, it all behaves as before
(and yes, these are bare repros for keeping an intermediate backup).

With the easy workaround in place, I'm back at 2.22 for now, thanks!

On Sun, Jun 23, 2019 at 3:54 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sun, Jun 23 2019, Janos Farkas wrote:
>
> > I'm using .keep files to... well.. keep packs to avoid some CPU time
> > spent on repacking huge packs and make the process somewhat more
> > incremental.
> >
> > Something changed with 22.2.0.  Now .bitmap files are also created,
> > and no simple repacks re-create the pack data in a completely new
> > file, wasting quite some storage:
> >
> > 02d03::master> find objects/pack/pack* -type f|xargs ls -sht
> > 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> > 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> > 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> > 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> > 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> > 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> > 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> > 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> > 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> >    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> > 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> > 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> > 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> > 02d03::master > git repack -af
> > Enumerating objects: 19001, done.
> > Counting objects: 100% (19001/19001), done.
> > Delta compression using up to 2 threads
> > Compressing objects: 100% (18952/18952), done.
> > Writing objects: 100% (19001/19001), done.
> > warning: ignoring extra bitmap file:
> > ./objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> > warning: ignoring extra bitmap file:
> > ./objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> > warning: ignoring extra bitmap file:
> > ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> > Reusing bitmaps: 104, done.
> > Selecting bitmap commits: 2550, done.
> > Building bitmaps: 100% (130/130), done.
> > Total 19001 (delta 14837), reused 4162 (delta 0)
> > 02d03::master > find objects/pack/pack* -type f|xargs ls -sht
> > 108K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.bitmap
> > 524K objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.idx
> > 4.6M objects/pack/pack-8702a2550b7e29940af8bc62bc6fca011ccbd455.pack   <= ????
> > 108K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.bitmap
> > 524K objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.idx
> > 4.7M objects/pack/pack-879f2c28d15e57d353eb8e0ddbcb540655c844c9.pack
> > 108K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.bitmap
> > 524K objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.idx
> > 4.6M objects/pack/pack-e7a7aebfc6dc6b1431f6f56bb8b2f7e730cc4a0c.pack
> > 116K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.bitmap
> > 524K objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.idx
> > 4.6M objects/pack/pack-994c76cb1999e3b29552677d05e6364e6be2ae5e.pack
> >    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> > 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> > 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> > 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> >
> > The ccbd455 pack and its metadata seem quite pointless to be
> > containing apparently all the data based on the size.
> >
> > If I use -ad, a new pack is still created,which, judging by the size,
> > is essentially everything again, (but at least the extra packs are
> > removed)
> >
> > 02d03::master> git repack -ad
> > Enumerating objects: 19001, done.
> > Counting objects: 100% (19001/19001), done.
> > Delta compression using up to 2 threads
> > Compressing objects: 100% (4114/4114), done.
> > Writing objects: 100% (19001/19001), done.
> > warning: ignoring extra bitmap file:
> > ./objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> > Reusing bitmaps: 104, done.
> > Selecting bitmap commits: 2550, done.
> > Building bitmaps: 100% (130/130), done.
> > Total 19001 (delta 14838), reused 19001 (delta 14838)
> > 02d03::master 9060> find objects/pack/pack* -type f|xargs ls -sht
> > 116K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.bitmap
> > 524K objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.idx
> > 4.6M objects/pack/pack-46ab64716d4220aac8d53b380d90a264d5293d3d.pack   <= ????
> >    0 objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.keep
> > 108K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.bitmap
> > 524K objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.idx
> > 4.6M objects/pack/pack-e5b8848e7c1096274dba2430323ccaf5320c6846.pack
> >
> > Previously, the kept pack would be kept, and no additional packs would
> > be created if no new objects were born in the repro.
> >
> > With the .keep placeholder removed, the duplication does not happen,
> > but all the repro is rewritten into a new pack, which does not look
> > correct.  Am I doing something unexpected?
>
> I haven't looked at this for more than a couple of minutes (and don't
> have more time now), but this is almost certainly due to 36eba0323d
> ("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
> confirm when you re-run with repack.writeBitmaps=false in the config?
>
> I.e. something in the "yes I want bitmaps" code implies "*.keep"
> semantics changing from "keep" to "replace", which is obvious in
> retrospect, since we can only have one *.bitmap per-repo.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason
  2019-06-23 15:38   ` Janos Farkas
@ 2019-06-23 18:02   ` Jeff King
  2019-06-23 18:08     ` Eric Wong
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff King @ 2019-06-23 18:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Janos Farkas, git, Eric Wong

On Sun, Jun 23, 2019 at 04:54:50PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I haven't looked at this for more than a couple of minutes (and don't
> have more time now), but this is almost certainly due to 36eba0323d
> ("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
> confirm when you re-run with repack.writeBitmaps=false in the config?
> 
> I.e. something in the "yes I want bitmaps" code implies "*.keep"
> semantics changing from "keep" to "replace", which is obvious in
> retrospect, since we can only have one *.bitmap per-repo.

Yeah, the .keep behavior with bitmaps is intended, though it means a
funny implication for the bitmap-by-default strategy.

Basically, you never want to have .keep files if you have bitmaps turned
on, and so the default for "respect .keeps" is based on whether bitmaps
are in use. See ee34a2bead (repack: add `repack.packKeptObjects` config
var, 2014-03-03).

I'm not sure of the right solution. For maximal backwards-compatibility,
the default for bitmaps could become "if not bare and if there are no
.keep files". But that would mean bitmaps sometimes not getting
generated because of the problems that ee34a2bead was trying to solve.

That's probably OK, though; you can always flip the bitmap config to
"true" yourself if you _must_ have bitmaps.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 18:02   ` Jeff King
@ 2019-06-23 18:08     ` Eric Wong
  2019-06-23 22:42       ` Jeff King
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Wong @ 2019-06-23 18:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Janos Farkas, git

Jeff King <peff@peff.net> wrote:
> On Sun, Jun 23, 2019 at 04:54:50PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> > I haven't looked at this for more than a couple of minutes (and don't
> > have more time now), but this is almost certainly due to 36eba0323d
> > ("repack: enable bitmaps by default on bare repos", 2019-03-14). Can you
> > confirm when you re-run with repack.writeBitmaps=false in the config?
> > 
> > I.e. something in the "yes I want bitmaps" code implies "*.keep"
> > semantics changing from "keep" to "replace", which is obvious in
> > retrospect, since we can only have one *.bitmap per-repo.
> 
> Yeah, the .keep behavior with bitmaps is intended, though it means a
> funny implication for the bitmap-by-default strategy.
> 
> Basically, you never want to have .keep files if you have bitmaps turned
> on, and so the default for "respect .keeps" is based on whether bitmaps
> are in use. See ee34a2bead (repack: add `repack.packKeptObjects` config
> var, 2014-03-03).
> 
> I'm not sure of the right solution. For maximal backwards-compatibility,
> the default for bitmaps could become "if not bare and if there are no
> .keep files". But that would mean bitmaps sometimes not getting
> generated because of the problems that ee34a2bead was trying to solve.
> 
> That's probably OK, though; you can always flip the bitmap config to
> "true" yourself if you _must_ have bitmaps.

What about something like this?  Needs tests but I need to leave, now.

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..1d99fb449b 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,25 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	DIR *dir;
+	struct dirent *e;
+	int found = 0;
+
+	if (!(dir = opendir(packdir)))
+		return found;
+
+	while ((e = readdir(dir)) != NULL) {
+		if (ends_with(e->d_name, ".keep")) {
+			found = 1;
+			break;
+		}
+	}
+	closedir(dir);
+	return found;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,16 +362,20 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	packdir = mkpathdup("%s/pack", get_object_directory());
+
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
 	if (write_bitmaps && !(pack_everything & ALL_INTO_ONE))
 		die(_(incremental_bitmap_conflict_error));
 
-	packdir = mkpathdup("%s/pack", get_object_directory());
 	packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
 
 	sigchain_push_common(remove_pack_on_signal);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 18:08     ` Eric Wong
@ 2019-06-23 22:42       ` Jeff King
  2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
  2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
  0 siblings, 2 replies; 20+ messages in thread
From: Jeff King @ 2019-06-23 22:42 UTC (permalink / raw)
  To: Eric Wong; +Cc: Ævar Arnfjörð Bjarmason, Janos Farkas, git

On Sun, Jun 23, 2019 at 06:08:25PM +0000, Eric Wong wrote:

> > I'm not sure of the right solution. For maximal backwards-compatibility,
> > the default for bitmaps could become "if not bare and if there are no
> > .keep files". But that would mean bitmaps sometimes not getting
> > generated because of the problems that ee34a2bead was trying to solve.
> > 
> > That's probably OK, though; you can always flip the bitmap config to
> > "true" yourself if you _must_ have bitmaps.
> 
> What about something like this?  Needs tests but I need to leave, now.

Yeah, I think that's the right direction.

Though...

> +static int has_pack_keep_file(void)
> +{
> +	DIR *dir;
> +	struct dirent *e;
> +	int found = 0;
> +
> +	if (!(dir = opendir(packdir)))
> +		return found;
> +
> +	while ((e = readdir(dir)) != NULL) {
> +		if (ends_with(e->d_name, ".keep")) {
> +			found = 1;
> +			break;
> +		}
> +	}
> +	closedir(dir);
> +	return found;
> +}

I think this can be replaced with just checking p->pack_keep for each
item in the packed_git list.

That's racy, but then so is your code here, since it's really the child
pack-objects which is going to deal with the .keep. I don't think we
need to care much about the race, though. Either:

  1. Somebody has made an old intentional .keep, which would not be
     racy. We'd see it in both places.

  2. Somebody _just_ made an intentional .keep; we'll race with that and
     maybe duplicate objects from the kept pack. But this is a rare
     occurrence, and there's no real ordering promise here anyway with
     somebody creating .keep files alongside a running repack.

  3. An incoming fetch/push may create a .keep file as a temporary lock,
     which we see here but which goes away by the time pack-objects
     runs. That's OK; we err on the side of not generating bitmaps, but
     they're an optimization anyway (and if you really insist on having
     them, you should tell Git to definitely make them instead of
     relying on this default behavior).

  4. Like (3), but we _don't _see the temporary .keep here but _do_ see
     it during pack-objects. That's OK, because we'll have told
     pack-objects to pack those objects anyway, which is the right
     thing.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-23 22:42       ` Jeff King
@ 2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
  2019-07-03 17:40           ` Jeff King
  2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
  1 sibling, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-06-24  9:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, Janos Farkas, git


On Mon, Jun 24 2019, Jeff King wrote:

> On Sun, Jun 23, 2019 at 06:08:25PM +0000, Eric Wong wrote:
>
>> > I'm not sure of the right solution. For maximal backwards-compatibility,
>> > the default for bitmaps could become "if not bare and if there are no
>> > .keep files". But that would mean bitmaps sometimes not getting
>> > generated because of the problems that ee34a2bead was trying to solve.
>> >
>> > That's probably OK, though; you can always flip the bitmap config to
>> > "true" yourself if you _must_ have bitmaps.
>>
>> What about something like this?  Needs tests but I need to leave, now.
>
> Yeah, I think that's the right direction.
>
> Though...
>
>> +static int has_pack_keep_file(void)
>> +{
>> +	DIR *dir;
>> +	struct dirent *e;
>> +	int found = 0;
>> +
>> +	if (!(dir = opendir(packdir)))
>> +		return found;
>> +
>> +	while ((e = readdir(dir)) != NULL) {
>> +		if (ends_with(e->d_name, ".keep")) {
>> +			found = 1;
>> +			break;
>> +		}
>> +	}
>> +	closedir(dir);
>> +	return found;
>> +}
>
> I think this can be replaced with just checking p->pack_keep for each
> item in the packed_git list.
>
> That's racy, but then so is your code here, since it's really the child
> pack-objects which is going to deal with the .keep. I don't think we
> need to care much about the race, though. Either:
>
>   1. Somebody has made an old intentional .keep, which would not be
>      racy. We'd see it in both places.
>
>   2. Somebody _just_ made an intentional .keep; we'll race with that and
>      maybe duplicate objects from the kept pack. But this is a rare
>      occurrence, and there's no real ordering promise here anyway with
>      somebody creating .keep files alongside a running repack.
>
>   3. An incoming fetch/push may create a .keep file as a temporary lock,
>      which we see here but which goes away by the time pack-objects
>      runs. That's OK; we err on the side of not generating bitmaps, but
>      they're an optimization anyway (and if you really insist on having
>      them, you should tell Git to definitely make them instead of
>      relying on this default behavior).

This sort of thing (#3) strikes me as a fairly pathological case we
should try to avoid. Now what we've turned on bitmaps by default people
will take the sort of performance increase noted in [1] for granted.

So they'll be happily running with that & then get a CPU/IO spike as the
*.bitmap files they'd been implicitly relying on for years in their
default config goes away, only to have it re-appear when "repack" runs
next.

I can't think of some great solution for this case, some thoughts:

 a. Perhaps we should split the *.keep flag into two things or
    more.

    We're using it for all of "I want this *.pack forever"
    (e.g. debugging) and "I want only this *.pack to contain the data
    found in it" (I/O & CPU optimization, what Janos wants) and "I'm
    git.git code avoiding a race with myself" (what you describe in #3).

    So maybe for the last of those we could also use and understand
    *.tmp-keep, at which point we wouldn't have this race described in
    #3. The 1st of those is a *.noprune and the 2nd is *.highlander (but
    whether it's worth splitting all that out v.s. just having
    *.tmp-keep is another matter).

 b) Shouldn't we at least print some warning to STDERR in this case so
    e.g. gc.log will note the performance degradation of the repo in its
    current configuration?

>   4. Like (3), but we _don't _see the temporary .keep here but _do_ see
>      it during pack-objects. That's OK, because we'll have told
>      pack-objects to pack those objects anyway, which is the right
>      thing.
>
> -Peff

1. https://github.blog/2015-09-22-counting-objects/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] repack: disable bitmaps-by-default if .keep files exist
  2019-06-23 22:42       ` Jeff King
  2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
@ 2019-06-28  7:02         ` Eric Wong
  2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
  2019-06-29  8:03           ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
  1 sibling, 2 replies; 20+ messages in thread
From: Eric Wong @ 2019-06-28  7:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Janos Farkas, git

Jeff King <peff@peff.net> wrote:
> On Sun, Jun 23, 2019 at 06:08:25PM +0000, Eric Wong wrote:
> 
> > > I'm not sure of the right solution. For maximal backwards-compatibility,
> > > the default for bitmaps could become "if not bare and if there are no
> > > .keep files". But that would mean bitmaps sometimes not getting
> > > generated because of the problems that ee34a2bead was trying to solve.
> > > 
> > > That's probably OK, though; you can always flip the bitmap config to
> > > "true" yourself if you _must_ have bitmaps.
> > 
> > What about something like this?  Needs tests but I need to leave, now.
> 
> Yeah, I think that's the right direction.

OK.  I have a real patch with one additional test, below.
(don't have a lot of time for hacking)

> Though...
> 
> > +static int has_pack_keep_file(void)
> > +{
> > +	DIR *dir;
> > +	struct dirent *e;
> > +	int found = 0;
> > +
> > +	if (!(dir = opendir(packdir)))
> > +		return found;
> > +
> > +	while ((e = readdir(dir)) != NULL) {
> > +		if (ends_with(e->d_name, ".keep")) {
> > +			found = 1;
> > +			break;
> > +		}
> > +	}
> > +	closedir(dir);
> > +	return found;
> > +}
> 
> I think this can be replaced with just checking p->pack_keep for each
> item in the packed_git list.

Good point, I tend to forget git C API internals as soon as I
learn them :x

> That's racy, but then so is your code here, since it's really the child
> pack-objects which is going to deal with the .keep. I don't think we
> need to care much about the race, though. Either:

Agreed.  <snip>

--------8<-------
Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.

So detect when .keep files exist and stop enabling bitmaps
by default in that case.

Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:

  https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/

Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Jeff King <peff@peff.net>
Reported-by: Janos Farkas <chexum@gmail.com>
---
 builtin/repack.c  | 18 ++++++++++++++++--
 t/t7700-repack.sh | 10 ++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..a9529d1afc 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	struct packed_git *p;
+
+	for (p = get_packed_git(the_repository); p; p = p->next) {
+		if (p->pack_keep)
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 86d05160a3..0acde3b1f8 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
 	test -z "$bitmap"
 '
 
+test_expect_success 'no bitmaps created if .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -ad &&
+	bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) &&
+	test -z "$bitmap"
+'
+
 test_done
-- 
EW

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] repack: disable bitmaps-by-default if .keep files exist
  2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
@ 2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
  2019-06-29 19:16             ` [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files Eric Wong
  2019-06-29  8:03           ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
  1 sibling, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-06-28  7:21 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jeff King, Janos Farkas, git


On Fri, Jun 28 2019, Eric Wong wrote:

> Jeff King <peff@peff.net> wrote:
>> On Sun, Jun 23, 2019 at 06:08:25PM +0000, Eric Wong wrote:
>>
>> > > I'm not sure of the right solution. For maximal backwards-compatibility,
>> > > the default for bitmaps could become "if not bare and if there are no
>> > > .keep files". But that would mean bitmaps sometimes not getting
>> > > generated because of the problems that ee34a2bead was trying to solve.
>> > >
>> > > That's probably OK, though; you can always flip the bitmap config to
>> > > "true" yourself if you _must_ have bitmaps.
>> >
>> > What about something like this?  Needs tests but I need to leave, now.
>>
>> Yeah, I think that's the right direction.
>
> OK.  I have a real patch with one additional test, below.
> (don't have a lot of time for hacking)
>
>> Though...
>>
>> > +static int has_pack_keep_file(void)
>> > +{
>> > +	DIR *dir;
>> > +	struct dirent *e;
>> > +	int found = 0;
>> > +
>> > +	if (!(dir = opendir(packdir)))
>> > +		return found;
>> > +
>> > +	while ((e = readdir(dir)) != NULL) {
>> > +		if (ends_with(e->d_name, ".keep")) {
>> > +			found = 1;
>> > +			break;
>> > +		}
>> > +	}
>> > +	closedir(dir);
>> > +	return found;
>> > +}
>>
>> I think this can be replaced with just checking p->pack_keep for each
>> item in the packed_git list.
>
> Good point, I tend to forget git C API internals as soon as I
> learn them :x
>
>> That's racy, but then so is your code here, since it's really the child
>> pack-objects which is going to deal with the .keep. I don't think we
>> need to care much about the race, though. Either:
>
> Agreed.  <snip>
>
> --------8<-------
> Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist
>
> Bitmaps aren't useful with multiple packs, and users with
> .keep files ended up with redundant packs when bitmaps
> got enabled by default in bare repos.
>
> So detect when .keep files exist and stop enabling bitmaps
> by default in that case.
>
> Wasteful (but otherwise harmless) race conditions with .keep files
> documented by Jeff King still apply and there's a chance we'd
> still end up with redundant data on the FS:
>
>   https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/
>
> Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
> Signed-off-by: Eric Wong <e@80x24.org>
> Helped-by: Jeff King <peff@peff.net>
> Reported-by: Janos Farkas <chexum@gmail.com>
> ---
>  builtin/repack.c  | 18 ++++++++++++++++--
>  t/t7700-repack.sh | 10 ++++++++++
>  2 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index caca113927..a9529d1afc 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
>  	raise(signo);
>  }
>
> +static int has_pack_keep_file(void)
> +{
> +	struct packed_git *p;
> +
> +	for (p = get_packed_git(the_repository); p; p = p->next) {
> +		if (p->pack_keep)
> +			return 1;
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Adds all packs hex strings to the fname list, which do not
>   * have a corresponding .keep file. These packs are not to
> @@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
>  		die(_("--keep-unreachable and -A are incompatible"));
>
> -	if (write_bitmaps < 0)
> +	if (write_bitmaps < 0) {
>  		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
> -				 is_bare_repository();
> +				 is_bare_repository() &&
> +				 keep_pack_list.nr == 0 &&
> +				 !has_pack_keep_file();
> +	}
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 86d05160a3..0acde3b1f8 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
>  	test -z "$bitmap"
>  '

I have the feedback I posted before this patch in
https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/

In particular "b" there since "a" is clearly more work. I.e. shouldn't
we at least in interactive mode on a "gc" print something about skipping
what we'd otherwise do.

Maybe that's tricky with the gc.log functionality, but I think we should
at least document this before the next guy shows up with "sometimes my
.bitmap files aren't generated...".


> +test_expect_success 'no bitmaps created if .keep files present' '
> +	pack=$(ls bare.git/objects/pack/*.pack) &&
> +	test_path_is_file "$pack" &&
> +	keep=${pack%.pack}.keep &&
> +	>"$keep" &&
> +	git -C bare.git repack -ad &&
> +	bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) &&
> +	test -z "$bitmap"

Maybe more readable as:

	find .git/objects/pack/ -type f -name '*.bitmap' >actual &&
	test_must_be_empty actual

I.e. it avoids the "|| :" & subshell.

> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] repack: disable bitmaps-by-default if .keep files exist
  2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
  2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
@ 2019-06-29  8:03           ` SZEDER Gábor
  2019-06-29 19:13             ` [PATCH v2] " Eric Wong
  1 sibling, 1 reply; 20+ messages in thread
From: SZEDER Gábor @ 2019-06-29  8:03 UTC (permalink / raw)
  To: Eric Wong
  Cc: Jeff King, Derrick Stolee, Ævar Arnfjörð Bjarmason,
	Janos Farkas, git

On Fri, Jun 28, 2019 at 07:02:11AM +0000, Eric Wong wrote:
> Jeff King <peff@peff.net> wrote:
> > On Sun, Jun 23, 2019 at 06:08:25PM +0000, Eric Wong wrote:
> > 
> > > > I'm not sure of the right solution. For maximal backwards-compatibility,
> > > > the default for bitmaps could become "if not bare and if there are no
> > > > .keep files". But that would mean bitmaps sometimes not getting
> > > > generated because of the problems that ee34a2bead was trying to solve.
> > > > 
> > > > That's probably OK, though; you can always flip the bitmap config to
> > > > "true" yourself if you _must_ have bitmaps.
> > > 
> > > What about something like this?  Needs tests but I need to leave, now.
> > 
> > Yeah, I think that's the right direction.
> 
> OK.  I have a real patch with one additional test, below.
> (don't have a lot of time for hacking)
> 
> > Though...
> > 
> > > +static int has_pack_keep_file(void)
> > > +{
> > > +	DIR *dir;
> > > +	struct dirent *e;
> > > +	int found = 0;
> > > +
> > > +	if (!(dir = opendir(packdir)))
> > > +		return found;
> > > +
> > > +	while ((e = readdir(dir)) != NULL) {
> > > +		if (ends_with(e->d_name, ".keep")) {
> > > +			found = 1;
> > > +			break;
> > > +		}
> > > +	}
> > > +	closedir(dir);
> > > +	return found;
> > > +}
> > 
> > I think this can be replaced with just checking p->pack_keep for each
> > item in the packed_git list.
> 
> Good point, I tend to forget git C API internals as soon as I
> learn them :x
> 
> > That's racy, but then so is your code here, since it's really the child
> > pack-objects which is going to deal with the .keep. I don't think we
> > need to care much about the race, though. Either:
> 
> Agreed.  <snip>
> 
> --------8<-------
> Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist
> 
> Bitmaps aren't useful with multiple packs, and users with
> .keep files ended up with redundant packs when bitmaps
> got enabled by default in bare repos.
> 
> So detect when .keep files exist and stop enabling bitmaps
> by default in that case.
> 
> Wasteful (but otherwise harmless) race conditions with .keep files
> documented by Jeff King still apply and there's a chance we'd
> still end up with redundant data on the FS:
> 
>   https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/
> 
> Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
> Signed-off-by: Eric Wong <e@80x24.org>
> Helped-by: Jeff King <peff@peff.net>
> Reported-by: Janos Farkas <chexum@gmail.com>
> ---
>  builtin/repack.c  | 18 ++++++++++++++++--
>  t/t7700-repack.sh | 10 ++++++++++
>  2 files changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/repack.c b/builtin/repack.c
> index caca113927..a9529d1afc 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
>  	raise(signo);
>  }
>  
> +static int has_pack_keep_file(void)
> +{
> +	struct packed_git *p;
> +
> +	for (p = get_packed_git(the_repository); p; p = p->next) {
> +		if (p->pack_keep)
> +			return 1;
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Adds all packs hex strings to the fname list, which do not
>   * have a corresponding .keep file. These packs are not to
> @@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
>  		die(_("--keep-unreachable and -A are incompatible"));
>  
> -	if (write_bitmaps < 0)
> +	if (write_bitmaps < 0) {
>  		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
> -				 is_bare_repository();
> +				 is_bare_repository() &&
> +				 keep_pack_list.nr == 0 &&
> +				 !has_pack_keep_file();
> +	}
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;
>  
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 86d05160a3..0acde3b1f8 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
>  	test -z "$bitmap"
>  '
>  
> +test_expect_success 'no bitmaps created if .keep files present' '
> +	pack=$(ls bare.git/objects/pack/*.pack) &&
> +	test_path_is_file "$pack" &&
> +	keep=${pack%.pack}.keep &&
> +	>"$keep" &&
> +	git -C bare.git repack -ad &&
> +	bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) &&
> +	test -z "$bitmap"

This test fails when run with 'GIT_TEST_MULTI_PACK_INDEX=1':

  + ls bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
  + pack=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
  + test_path_is_file bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
  + test -f bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
  + keep=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.keep
  +
  + git -C bare.git repack -ad
  + ls bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
  + bitmap=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
  + test -z bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
  error: last command exited with $?=1
  not ok 15 - no bitmaps created if .keep files present

When the new has_pack_keep_file() helper function calls
get_packed_git(the_repository) in the loop it returns with NULL
already on the first iteration, so the keep file goes unnoticed.

> +'
> +
>  test_done
> -- 
> EW

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2] repack: disable bitmaps-by-default if .keep files exist
  2019-06-29  8:03           ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
@ 2019-06-29 19:13             ` Eric Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Wong @ 2019-06-29 19:13 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Jeff King, Derrick Stolee, Ævar Arnfjörð Bjarmason,
	Janos Farkas, git

SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Fri, Jun 28, 2019 at 07:02:11AM +0000, Eric Wong wrote:
> > diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> > index 86d05160a3..0acde3b1f8 100755
> > --- a/t/t7700-repack.sh
> > +++ b/t/t7700-repack.sh
> > @@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
> >  	test -z "$bitmap"
> >  '
> >  
> > +test_expect_success 'no bitmaps created if .keep files present' '
> > +	pack=$(ls bare.git/objects/pack/*.pack) &&
> > +	test_path_is_file "$pack" &&
> > +	keep=${pack%.pack}.keep &&
> > +	>"$keep" &&
> > +	git -C bare.git repack -ad &&
> > +	bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) &&
> > +	test -z "$bitmap"
> 
> This test fails when run with 'GIT_TEST_MULTI_PACK_INDEX=1':
> 
>   + ls bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
>   + pack=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
>   + test_path_is_file bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
>   + test -f bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.pack
>   + keep=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.keep
>   +
>   + git -C bare.git repack -ad
>   + ls bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
>   + bitmap=bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
>   + test -z bare.git/objects/pack/pack-51a6a758692f03f9e1f31fd087bba30584afad2f.bitmap
>   error: last command exited with $?=1
>   not ok 15 - no bitmaps created if .keep files present
> 
> When the new has_pack_keep_file() helper function calls
> get_packed_git(the_repository) in the loop it returns with NULL
> already on the first iteration, so the keep file goes unnoticed.

Oops, s/get_packed_git/get_all_packs/ seems to fix it.

The following also improves readability as suggested by Ævar
by avoiding a subshell.
-------8<---------
Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.

So detect when .keep files exist and stop enabling bitmaps
by default in that case.

Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:

  https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/

v2: avoid subshell in test case, be multi-index aware

Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Jeff King <peff@peff.net>
Reported-by: Janos Farkas <chexum@gmail.com>
---
Interdiff:
  diff --git a/builtin/repack.c b/builtin/repack.c
  index a9529d1afc..73250b2431 100644
  --- a/builtin/repack.c
  +++ b/builtin/repack.c
  @@ -93,7 +93,7 @@ static int has_pack_keep_file(void)
   {
   	struct packed_git *p;
   
  -	for (p = get_packed_git(the_repository); p; p = p->next) {
  +	for (p = get_all_packs(the_repository); p; p = p->next) {
   		if (p->pack_keep)
   			return 1;
   	}
  diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
  index 0acde3b1f8..0e9af832c9 100755
  --- a/t/t7700-repack.sh
  +++ b/t/t7700-repack.sh
  @@ -245,8 +245,8 @@ test_expect_success 'no bitmaps created if .keep files present' '
   	keep=${pack%.pack}.keep &&
   	>"$keep" &&
   	git -C bare.git repack -ad &&
  -	bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) &&
  -	test -z "$bitmap"
  +	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
  +	test_must_be_empty actual
   '
   
   test_done

 builtin/repack.c  | 18 ++++++++++++++++--
 t/t7700-repack.sh | 10 ++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..73250b2431 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	struct packed_git *p;
+
+	for (p = get_all_packs(the_repository); p; p = p->next) {
+		if (p->pack_keep)
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 86d05160a3..0e9af832c9 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
 	test -z "$bitmap"
 '
 
+test_expect_success 'no bitmaps created if .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -ad &&
+	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
EW

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
@ 2019-06-29 19:16             ` Eric Wong
  2019-07-01 18:15               ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Wong @ 2019-06-29 19:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Janos Farkas, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> I have the feedback I posted before this patch in
> https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/
> 
> In particular "b" there since "a" is clearly more work. I.e. shouldn't
> we at least in interactive mode on a "gc" print something about skipping
> what we'd otherwise do.
> 
> Maybe that's tricky with the gc.log functionality, but I think we should
> at least document this before the next guy shows up with "sometimes my
> .bitmap files aren't generated...".

I'm not sure if the warning should be present by default;
because we'll silently stop using bitmaps, now.  But warning
if '-b' is specified seems right:

-------8<----------
Subject: [PATCH] repack: warn if bitmaps are explicitly enabled with keep
 files

If a user explicitly enables bitmaps, we should warn if .keep
files exist or are specified via --keep-pack

Signed-off-by: Eric Wong <e@80x24.org>
---
 builtin/repack.c  |  8 ++++++++
 t/t7700-repack.sh | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/builtin/repack.c b/builtin/repack.c
index 73250b2431..b1eeee88a7 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -359,7 +359,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				 is_bare_repository() &&
 				 keep_pack_list.nr == 0 &&
 				 !has_pack_keep_file();
+	} else if (write_bitmaps > 0) {
+		if (keep_pack_list.nr)
+			fprintf(stderr,
+				_("WARNING: --keep-pack is incompatible with bitmaps\n"));
+		if (has_pack_keep_file())
+			fprintf(stderr,
+				_("WARNING: .keep files are incompatible with bitmaps\n"));
 	}
+
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 0e9af832c9..839484c7dc 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -249,4 +249,20 @@ test_expect_success 'no bitmaps created if .keep files present' '
 	test_must_be_empty actual
 '
 
+test_expect_success '-b warns with .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -adb 2>err &&
+	test_i18ngrep -F ".keep files are incompatible" err &&
+	rm -f "$keep"
+'
+
+test_expect_success '-b warns with --keep-pack specified' '
+	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
+	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
+	test_i18ngrep -F "keep-pack is incompatible" err
+'
+
 test_done
-- 
EW

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-06-29 19:16             ` [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files Eric Wong
@ 2019-07-01 18:15               ` Junio C Hamano
  2019-07-03 17:38                 ` Jeff King
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2019-07-01 18:15 UTC (permalink / raw)
  To: Eric Wong
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Janos Farkas,
	git

Eric Wong <e@80x24.org> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> I have the feedback I posted before this patch in
>> https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/
>> 
>> In particular "b" there since "a" is clearly more work. I.e. shouldn't
>> we at least in interactive mode on a "gc" print something about skipping
>> what we'd otherwise do.
>> 
>> Maybe that's tricky with the gc.log functionality, but I think we should
>> at least document this before the next guy shows up with "sometimes my
>> .bitmap files aren't generated...".
>
> I'm not sure if the warning should be present by default;
> because we'll silently stop using bitmaps, now.  But warning
> if '-b' is specified seems right:

Hmph...  write_bitmaps can come either from the command line or from
the configuration.  When repack.writebitmaps configuration is set,
and some .keep files exist, the user probably is not even aware of
doing something that is unsupported.

The keep_pack_list will not be empty only if the "--keep-pack"
option is explicitly given from the command line, so making it and
an explicit "-b" on the command line a condition to trigger the
warning (or even "fatal: can't do both at the same time") is safe
enough, I think.  But anything more than that, I am not so sure.

Also, is it safe to unconditionally emit strings to the standard
error stream from the gc codepath these days?  I recall that once we
had trouble with the logic to skip repeatedly failing "gc --auto"
vaguely, but maybe I am remembering something else.

Thanks.


> -------8<----------
> Subject: [PATCH] repack: warn if bitmaps are explicitly enabled with keep
>  files
>
> If a user explicitly enables bitmaps, we should warn if .keep
> files exist or are specified via --keep-pack
>
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  builtin/repack.c  |  8 ++++++++
>  t/t7700-repack.sh | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 73250b2431..b1eeee88a7 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -359,7 +359,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  				 is_bare_repository() &&
>  				 keep_pack_list.nr == 0 &&
>  				 !has_pack_keep_file();
> +	} else if (write_bitmaps > 0) {
> +		if (keep_pack_list.nr)
> +			fprintf(stderr,
> +				_("WARNING: --keep-pack is incompatible with bitmaps\n"));



> +		if (has_pack_keep_file())
> +			fprintf(stderr,
> +				_("WARNING: .keep files are incompatible with bitmaps\n"));
>  	}
> +
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;
>  
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0e9af832c9..839484c7dc 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -249,4 +249,20 @@ test_expect_success 'no bitmaps created if .keep files present' '
>  	test_must_be_empty actual
>  '
>  
> +test_expect_success '-b warns with .keep files present' '
> +	pack=$(ls bare.git/objects/pack/*.pack) &&
> +	test_path_is_file "$pack" &&
> +	keep=${pack%.pack}.keep &&
> +	>"$keep" &&
> +	git -C bare.git repack -adb 2>err &&
> +	test_i18ngrep -F ".keep files are incompatible" err &&
> +	rm -f "$keep"
> +'
> +
> +test_expect_success '-b warns with --keep-pack specified' '
> +	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
> +	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
> +	test_i18ngrep -F "keep-pack is incompatible" err
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-01 18:15               ` Junio C Hamano
@ 2019-07-03 17:38                 ` Jeff King
  2019-07-03 18:10                   ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2019-07-03 17:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

On Mon, Jul 01, 2019 at 11:15:38AM -0700, Junio C Hamano wrote:

> >> Maybe that's tricky with the gc.log functionality, but I think we should
> >> at least document this before the next guy shows up with "sometimes my
> >> .bitmap files aren't generated...".
> >
> > I'm not sure if the warning should be present by default;
> > because we'll silently stop using bitmaps, now.  But warning
> > if '-b' is specified seems right:
> 
> Hmph...  write_bitmaps can come either from the command line or from
> the configuration.  When repack.writebitmaps configuration is set,
> and some .keep files exist, the user probably is not even aware of
> doing something that is unsupported.

I think one tricky thing here is that we do not know if the .keep files
are meant to be there, or if they are simply transient locks.

The whole point of the current behavior is that we should be ignoring
the transient locks, and if you are explicitly using bitmaps you
understand the tradeoff you are making.

I think the important case to cover is the one where the user didn't
explicitly ask for them, and the initial patch (to disable them when
there's a .keep around) covers that.


A much more robust solution would be to stop conflating user-provided
permanent .keep files with temporary locks. I think that was a mistaken
design added many years ago. We probably could introduce a different
filename for the temporary locks (though I am not entirely convinced
they are necessary in the first place, as gc expiration-times would
generally save a racily-written packfile anyway).

Or perhaps we could differentiate our temporary locks from "real" .keep
files by looking at the content; I think our locks always say something
like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
to commit to that, I think (or even adjust it to something even more
unambiguous).

It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
would want to consider it kept in either case (i.e., for purposes of
pruning, we delete neither) and some would want it kept only for
non-locks (for packing, duplicating the objects is OK). So I think we'd
end up with two bits there, and callers would have to use one or the
other as appropriate.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.22.0 repack -a duplicating pack contents
  2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
@ 2019-07-03 17:40           ` Jeff King
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2019-07-03 17:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Eric Wong, Janos Farkas, git

On Mon, Jun 24, 2019 at 11:30:59AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I can't think of some great solution for this case, some thoughts:
> 
>  a. Perhaps we should split the *.keep flag into two things or
>     more.
> 
>     We're using it for all of "I want this *.pack forever"
>     (e.g. debugging) and "I want only this *.pack to contain the data
>     found in it" (I/O & CPU optimization, what Janos wants) and "I'm
>     git.git code avoiding a race with myself" (what you describe in #3).
> 
>     So maybe for the last of those we could also use and understand
>     *.tmp-keep, at which point we wouldn't have this race described in
>     #3. The 1st of those is a *.noprune and the 2nd is *.highlander (but
>     whether it's worth splitting all that out v.s. just having
>     *.tmp-keep is another matter).

Yeah, this is exactly the crux of the problem. We have two very
different things that are not distinguished by the reader. I left some
thoughts elsewhere in the thread:

  https://public-inbox.org/git/20190703173814.GA29348@sigill.intra.peff.net/

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-03 17:38                 ` Jeff King
@ 2019-07-03 18:10                   ` Junio C Hamano
  2019-07-03 18:37                     ` Junio C Hamano
  2019-07-03 21:23                     ` Jeff King
  0 siblings, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2019-07-03 18:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

Jeff King <peff@peff.net> writes:

>
> A much more robust solution would be to stop conflating user-provided
> permanent .keep files with temporary locks. I think that was a mistaken
> design added many years ago. We probably could introduce a different
> filename for the temporary locks (though I am not entirely convinced
> they are necessary in the first place, as gc expiration-times would
> generally save a racily-written packfile anyway).

True, true (and I tend to agree).

> Or perhaps we could differentiate our temporary locks from "real" .keep
> files by looking at the content; I think our locks always say something
> like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
> to commit to that, I think (or even adjust it to something even more
> unambiguous).

True, but it may be overkill to open and read.

> It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
> would want to consider it kept in either case (i.e., for purposes of
> pruning, we delete neither) and some would want it kept only for
> non-locks (for packing, duplicating the objects is OK). So I think we'd
> end up with two bits there, and callers would have to use one or the
> other as appropriate.

Yeah, I agree that we'd need to treat them separately in the longer
run.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-03 18:10                   ` Junio C Hamano
@ 2019-07-03 18:37                     ` Junio C Hamano
  2019-07-03 21:24                       ` Jeff King
  2019-07-03 21:23                     ` Jeff King
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2019-07-03 18:37 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>>
>> A much more robust solution would be to stop conflating user-provided
>> permanent .keep files with temporary locks. I think that was a mistaken
>> design added many years ago. We probably could introduce a different
>> filename for the temporary locks (though I am not entirely convinced
>> they are necessary in the first place, as gc expiration-times would
>> generally save a racily-written packfile anyway).
>
> True, true (and I tend to agree).
>
>> Or perhaps we could differentiate our temporary locks from "real" .keep
>> files by looking at the content; I think our locks always say something
>> like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
>> to commit to that, I think (or even adjust it to something even more
>> unambiguous).
>
> True, but it may be overkill to open and read.
>
>> It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
>> would want to consider it kept in either case (i.e., for purposes of
>> pruning, we delete neither) and some would want it kept only for
>> non-locks (for packing, duplicating the objects is OK). So I think we'd
>> end up with two bits there, and callers would have to use one or the
>> other as appropriate.
>
> Yeah, I agree that we'd need to treat them separately in the longer
> run.
>
> Thanks.

In the meantime, this is about patch 2/1; the underlying "when .keep
is there, disable bitmaps" is probably good to go, still.

-- >8 --
From: Eric Wong <e@80x24.org>
Date: Sat, 29 Jun 2019 19:13:59 +0000
Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.

So detect when .keep files exist and stop enabling bitmaps
by default in that case.

Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:

  https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/

v2: avoid subshell in test case, be multi-index aware

Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Jeff King <peff@peff.net>
Reported-by: Janos Farkas <chexum@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repack.c  | 18 ++++++++++++++++--
 t/t7700-repack.sh | 10 ++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..73250b2431 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	struct packed_git *p;
+
+	for (p = get_all_packs(the_repository); p; p = p->next) {
+		if (p->pack_keep)
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 86d05160a3..0e9af832c9 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
 	test -z "$bitmap"
 '
 
+test_expect_success 'no bitmaps created if .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -ad &&
+	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
2.22.0-214-g8dca754b1e



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-03 18:10                   ` Junio C Hamano
  2019-07-03 18:37                     ` Junio C Hamano
@ 2019-07-03 21:23                     ` Jeff King
  2019-07-08 17:40                       ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff King @ 2019-07-03 21:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:

> > Or perhaps we could differentiate our temporary locks from "real" .keep
> > files by looking at the content; I think our locks always say something
> > like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
> > to commit to that, I think (or even adjust it to something even more
> > unambiguous).
> 
> True, but it may be overkill to open and read.

Yeah, that cross my mind as well, but:

  1. We'd only need to open them when we _see_ them. And they're pretty
     rare anyway.

  2. Effort-wise, we're already opening and mmap-ing the .idx files, so
     this is on par.

  3. Most callers don't care about keep-files anyway. We could turn
     packed_git.pack_keep into:

       enum {
         PACK_KEEP_NONE,
         PACK_KEEP_LOCK,
	 PACK_KEEP_USER
       } check_packed_keep(struct packed_git *pack);

     and then most programs wouldn't pay anything.

Just some thoughts. I don't have immediate plans to work on it, but
maybe somebody else is excited about it. :)

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-03 18:37                     ` Junio C Hamano
@ 2019-07-03 21:24                       ` Jeff King
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2019-07-03 21:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

On Wed, Jul 03, 2019 at 11:37:52AM -0700, Junio C Hamano wrote:

> In the meantime, this is about patch 2/1; the underlying "when .keep
> is there, disable bitmaps" is probably good to go, still.
> 
> -- >8 --
> From: Eric Wong <e@80x24.org>
> Date: Sat, 29 Jun 2019 19:13:59 +0000
> Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Yeah, this one looks good to me. Thanks for keeping things moving.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files
  2019-07-03 21:23                     ` Jeff King
@ 2019-07-08 17:40                       ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2019-07-08 17:40 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Ævar Arnfjörð Bjarmason, Janos Farkas,
	git

Jeff King <peff@peff.net> writes:

> On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:
>
>> > Or perhaps we could differentiate our temporary locks from "real" .keep
>> > files by looking at the content; I think our locks always say something
>> > like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
>> > to commit to that, I think (or even adjust it to something even more
>> > unambiguous).
>> 
>> True, but it may be overkill to open and read.
>
> Yeah, that cross my mind as well, but:
>
>   1. We'd only need to open them when we _see_ them. And they're pretty
>      rare anyway.
>
>   2. Effort-wise, we're already opening and mmap-ing the .idx files, so
>      this is on par.
>
>   3. Most callers don't care about keep-files anyway. We could turn
>      packed_git.pack_keep into:
>
>        enum {
>          PACK_KEEP_NONE,
>          PACK_KEEP_LOCK,
> 	 PACK_KEEP_USER
>        } check_packed_keep(struct packed_git *pack);
>
>      and then most programs wouldn't pay anything.
>
> Just some thoughts. I don't have immediate plans to work on it, but
> maybe somebody else is excited about it. :)

OK.  I do agree that .keep would be rare enough to justify spending
a bit more extra cycles, as long as the benefit is big enough (and
in this case it may be a good trade-off).

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-07-08 17:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-23 12:15 2.22.0 repack -a duplicating pack contents Janos Farkas
2019-06-23 14:54 ` Ævar Arnfjörð Bjarmason
2019-06-23 15:38   ` Janos Farkas
2019-06-23 18:02   ` Jeff King
2019-06-23 18:08     ` Eric Wong
2019-06-23 22:42       ` Jeff King
2019-06-24  9:30         ` Ævar Arnfjörð Bjarmason
2019-07-03 17:40           ` Jeff King
2019-06-28  7:02         ` [PATCH] repack: disable bitmaps-by-default if .keep files exist Eric Wong
2019-06-28  7:21           ` Ævar Arnfjörð Bjarmason
2019-06-29 19:16             ` [PATCH 2/1] repack: warn if bitmaps are explicitly enabled with keep files Eric Wong
2019-07-01 18:15               ` Junio C Hamano
2019-07-03 17:38                 ` Jeff King
2019-07-03 18:10                   ` Junio C Hamano
2019-07-03 18:37                     ` Junio C Hamano
2019-07-03 21:24                       ` Jeff King
2019-07-03 21:23                     ` Jeff King
2019-07-08 17:40                       ` Junio C Hamano
2019-06-29  8:03           ` [PATCH] repack: disable bitmaps-by-default if .keep files exist SZEDER Gábor
2019-06-29 19:13             ` [PATCH v2] " Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).