From: Duy Nguyen <pclouds@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
Ulrich.Windl@rz.uni-regensburg.de,
"Git Mailing List" <git@vger.kernel.org>
Subject: Re: non-smooth progress indication for git fsck and git gc
Date: Fri, 17 Aug 2018 16:39:56 +0200
Message-ID: <CACsJy8Aycxf3S9zARuv_BeKLyh667ewcB1dr3X9VY3i3meR9hg@mail.gmail.com>
In-Reply-To: <20180816210657.GA9291@sigill.intra.peff.net>
On Thu, Aug 16, 2018 at 11:08 PM Jeff King <peff@peff.net> wrote:
>
> On Thu, Aug 16, 2018 at 04:55:56PM -0400, Jeff King wrote:
>
> > > * We spend the majority of the ~30s on this:
> > > https://github.com/git/git/blob/63749b2dea5d1501ff85bab7b8a7f64911d21dea/pack-check.c#L70-L79
> >
> > This is hashing the actual packfile. This is potentially quite long,
> > especially if you have a ton of big objects.
> >
> > I wonder if we need to do this as a separate step anyway, though. Our
> > verification is based on index-pack these days, which means it's going
> > to walk over the whole content as part of the "Indexing objects" step to
> > expand base objects and mark deltas for later. Could we feed this hash
> > as part of that walk over the data? It's not going to save us 30s, but
> > it's likely to be more efficient. And it would fold the effort naturally
> > into the existing progress meter.
>
> Actually, I take it back. That's the nice, modern way we do it in
> git-verify-pack. But git-fsck uses the ancient "just walk over all of
> the idx entries" method. It at least sorts in pack order, which is good,
> but:
>
> - it's not multi-threaded, like index-pack/verify-pack
>
> - the index-pack way is actually more efficient than pack-ordering for
> the delta-base cache, because it actually walks the delta-graph in
> the optimal order
>
I actually tried to make git-fsck use index-pack --verify at one
point. The only thing that stopped it from working was that index-pack
automatically wrote the newer index version, if I remember correctly,
and that would fail the final hash check. fsck performance was not a
big deal, so I dropped it. Just saying it should be possible, if
someone's interested in that direction.
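
For anyone following along, the "hashing the actual packfile" step
quoted above boils down to roughly the following. This is only a
minimal Python sketch, not git's C code: the function name
pack_trailer_ok and the chunk size are made up for illustration, and it
assumes a SHA-1 repository, where a packfile ends with a 20-byte SHA-1
of everything that precedes it.

```python
import hashlib
import os

SHA1_LEN = 20  # assumption: SHA-1 pack, so the trailer is 20 bytes


def pack_trailer_ok(path, chunk_size=1 << 20):
    """Return True if the last 20 bytes of the file equal the SHA-1
    of all the bytes before them (hypothetical helper, not a git API)."""
    size = os.path.getsize(path)
    if size < SHA1_LEN:
        return False

    remaining = size - SHA1_LEN
    hasher = hashlib.sha1()
    with open(path, "rb") as f:
        # Stream the pack body in chunks so large packs are never
        # loaded into memory at once.
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                return False  # file shrank while we were reading it
            hasher.update(chunk)
            remaining -= len(chunk)
        trailer = f.read(SHA1_LEN)

    return hasher.digest() == trailer
```

Because the loop already walks the file chunk by chunk, it is also the
natural place to update a progress counter after each chunk, which is
the kind of folding into an existing meter discussed earlier in the
thread.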
--
Duy