git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Derrick Stolee <dstolee@microsoft.com>,
	Junio C Hamano <gitster@pobox.com>, Eric Wong <e@80x24.org>,
	git@vger.kernel.org
Subject: Re: [PATCH v3] repack: enable bitmaps by default on bare repos
Date: Fri, 24 May 2019 04:26:06 -0400	[thread overview]
Message-ID: <20190524082605.GB9082@sigill.intra.peff.net> (raw)
In-Reply-To: <87r28offs6.fsf@evledraar.gmail.com>

On Fri, May 24, 2019 at 09:55:05AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I'm not sure what tickles the bitmap-writer to fail so hard. Is it
> > having too many refs? Weird patterns in the graph? Just a ton of
> > commits?
> 
> Ah, why did only this ancient (big) pack have a bitmap?
> 
> The bitmap writer had never failed, this was just a repository where
> some automation (on a dev/staging box) cloned a repo, and someone had
> once run a manual "repack" to make make a pack with a bitmap.

Just to be clear, by "fail" I didn't mean that the writer failed to
produce. I just meant it had poor commit selection for its bitmap
coverage. But...

> Then as time passed that pack stayed around, and re-looking at this that
> could have only happened because I had gc.bigPackThreshold turned on.
> 
> I.e. without that we'd have eventually done a full repack, so the bitmap
> would have gone away.
> 
> So getting the repo into that state was a series of unlikely events.

Ah, now that's interesting. The issue may have just been that we had a
ton of un-bitmapped commits because they weren't in the bitmapped pack
at all. The logic that Stolee pointed out earlier:

          /*
           * if we have a HAVES list, but none of those haves is contained
           * in the packfile that has a bitmap, we don't have anything to
           * optimize here
           */
          if (haves && !in_bitmapped_pack(bitmap_git, haves))
                  goto cleanup;

is pretty feeble. If you have even _one_ UNINTERESTING tip that's in the
pack, we'll try to use bitmaps. And in your case, you almost certainly
had old tags on both the client and the server there were in the old
bitmapped pack, but then a huge swath of history that had happened since
then. And it was that newer part of the graph that we had to walk to
fill in the bitmap.

So all of this makes pretty good sense to me now. Bitmaps work
incrementally as you add new, un-bitmapped history. But the cost gets
higher and higher the longer you go before repacking and generating new
bitmaps. So your case was very similar to a repo that uses bitmaps but
just hadn't packed in a really long time.

> I think to the extent that this is an issue we can reproduce in the
> future the proper fix for it in lieu of some easy fix in the bitmap code
> would be to just teach "gc" to unlink old *.bitmap files if we detect
> they're too stale.

Yeah. This could happen if you simply accumulated history without ever
running "gc". But in general we can probably assume that "gc" will run
periodically (though there is a real blind spot if you push up a very
huge chunk of history in one pack, since gc counts packs, not objects).

I agree that if gc is deciding to leave a big pack, it should probably
ditch the bitmaps for it.

> I.e. we don't need to deal gracefully with some case where the bitmaps
> just cover some tiny part of the graph, we can just teach "gc" to either
> update them, or (if we're not currently writing them) unlink them.

Unfortunately I don't think we can update them, because all of the
reachable objects need to be in the same pack. So any scheme that isn't
doing a periodic all-into-one repack probably shouldn't be using
bitmaps. I wonder if we need to tweak Eric's bitmaps-by-default logic to
better account for this.

> That seems to me to be a good idea in general, not just with bitmaps but
> also the commit graph. If we're doing a GC and our current settings
> aren't such that we'd update those files, shouldn't we just unlink them?

I don't think the commit graph would suffer from this. It's not tied to
a specific pack, so it can be freely updated on any gc. You still have
the problem when gc does not run. It's possible that auto-gc should have
separate thresholds for different activities (e.g., number of packs
should tell us when to repack objects, number of loose refs should tell
us when to pack refs, number of un-annotated commits should tell us when
to update the commit graph). The trouble is that some of those checks
are non-trivial.

In the long run, I think the plan is for the commit graph to allow cheap
incremental updating, which may make it reasonable to just update it
preemptively after every fetch/push.

-Peff

  reply	other threads:[~2019-05-24  8:26 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-14  4:31 [PATCH 0/3] some prune optimizations Jeff King
2019-02-14  4:35 ` [PATCH 1/3] prune: lazily perform reachability traversal Jeff King
2019-02-14 10:54   ` Eric Sunshine
2019-02-14 11:07     ` Jeff King
2019-02-14  4:37 ` [PATCH 2/3] prune: use bitmaps for " Jeff King
2019-03-09  2:49   ` bitmaps by default? [was: prune: use bitmaps for reachability traversal] Eric Wong
2019-03-10 23:39     ` Jeff King
2019-03-12  3:13       ` [PATCH] repack: enable bitmaps by default on bare repos Eric Wong
2019-03-12  9:07         ` Ævar Arnfjörð Bjarmason
2019-03-12 10:49         ` Jeff King
2019-03-12 12:05           ` Jeff King
2019-03-13  1:51           ` Eric Wong
2019-03-13 14:54             ` Jeff King
2019-03-14  9:12               ` [PATCH v3] " Eric Wong
2019-03-14 16:02                 ` Jeff King
2019-03-15  6:21                   ` [PATCH 0/2] enable bitmap hash-cache by default Jeff King
2019-03-15  6:22                     ` [PATCH 1/2] t5310: correctly remove bitmaps for jgit test Jeff King
2019-03-15 13:25                       ` SZEDER Gábor
2019-03-15 18:36                         ` Jeff King
2019-03-15  6:25                     ` [PATCH 2/2] pack-objects: default to writing bitmap hash-cache Jeff King
2019-04-09 15:10                 ` [PATCH v3] repack: enable bitmaps by default on bare repos Ævar Arnfjörð Bjarmason
2019-04-10 22:57                   ` Jeff King
2019-04-25  7:16                     ` Junio C Hamano
2019-05-04  1:37                       ` Jeff King
2019-05-04  6:52                         ` Ævar Arnfjörð Bjarmason
2019-05-04 13:23                           ` SZEDER Gábor
2019-05-08 20:17                             ` Ævar Arnfjörð Bjarmason
2019-05-09  4:24                               ` Junio C Hamano
2019-05-07  7:45                           ` Jeff King
2019-05-07  8:12                             ` Ævar Arnfjörð Bjarmason
2019-05-08  7:11                               ` Jeff King
2019-05-08 14:20                                 ` Derrick Stolee
2019-05-08 16:13                                 ` Ævar Arnfjörð Bjarmason
2019-05-08 22:25                                   ` Jeff King
2019-05-23 11:30                     ` Jeff King
2019-05-23 12:53                       ` Derrick Stolee
2019-05-24  7:24                         ` Jeff King
2019-05-24 10:33                           ` Derrick Stolee
2019-05-23 19:26                       ` Ævar Arnfjörð Bjarmason
2019-05-24  7:27                         ` Jeff King
2019-05-24  7:55                           ` Ævar Arnfjörð Bjarmason
2019-05-24  8:26                             ` Jeff King [this message]
2019-05-24  9:01                               ` Ævar Arnfjörð Bjarmason
2019-05-24  9:29                                 ` SZEDER Gábor
2019-05-24 11:17                                   ` Ævar Arnfjörð Bjarmason
2019-05-24 11:41                                     ` SZEDER Gábor
2019-05-24 11:58                                       ` Ævar Arnfjörð Bjarmason
2019-05-24 12:34                                         ` SZEDER Gábor
2019-05-24 13:41                                           ` Ævar Arnfjörð Bjarmason
2019-05-24 11:31                       ` [PATCH] pack-bitmap: look for an uninteresting bitmap Derrick Stolee
2019-04-15 15:00   ` [PATCH 2/3] prune: use bitmaps for reachability traversal Derrick Stolee
2019-04-18 19:49     ` Jeff King
2019-04-18 20:08       ` [PATCH] t5304: add a test for pruning with bitmaps Jeff King
2019-04-20  1:01         ` Derrick Stolee
2019-04-20  3:24           ` Jeff King
2019-04-20 21:01             ` Derrick Stolee
2019-02-14  4:38 ` [PATCH 3/3] prune: check SEEN flag for reachability Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190524082605.GB9082@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).