git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Cc: git@vger.kernel.org
Subject: Re: Suggestion: "verify/repair" option for 'git gc'
Date: Thu, 14 Oct 2021 03:19:10 +0200
Message-ID: <87h7dkh04o.fsf@evledraar.gmail.com>
In-Reply-To: <e288dbe1-b7c7-5a2e-5271-404a14de836a@syntevo.com>


On Wed, Oct 13 2021, Alexandr Miloslavskiy wrote:

> Suggestion
> ----------
> 1) It would be nice if 'git gc' had an option to also verify
>    (like 'git fsck') the repo and report corruption. I think that it's
>    a good idea to have it in 'gc' for performance reasons, because
>    'git gc' already reads things.
>
> 2) It would be nice if git could automatically download blobs from
>    the remote if a local blob is corrupted. Maybe it was already
>    implemented, see story 3 below.
>
> Motivation
> ----------
>
> -- Story 1 --
> Just a few days ago I encountered another secretly broken repo which
> caused some small bugs in the git UI I'm using. The repo worked mostly
> fine, that's why I had no idea that it's corrupted.
>
> My git UI invokes 'git gc' sometimes and if that detected the
> corruption, I wouldn't have to spend time hunting the bug in UI.
>
> Specifically, it reports these errors on `git fsck`
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit,
>   not a blob
>   error in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit,
>   not a blob
>   error in tree 2808b286c2a933e88735d97416e29b9514fc6af2: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit,
>   not a blob
>   error in tree 604f6f6c4fbf8da7a593708e863e68f8c5a27d07: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit,
>   not a blob
>   error in tree 6a2c4a5ef0b0ee7aa85d88c3147b7558a6a7c29f: broken links
>
> The repo is not confidential and I could share it if needed.
> I "solved" the problem by cloning a new copy.

I'd be interested in a copy of it; I've been slowly trying to improve
how we handle these sorts of corruption cases.
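
To make the shape of the report in Story 1 easier to see, here is a minimal
sketch that groups the trees with broken links under the mistyped object
reported just before them. It assumes each message arrives on a single line
(the email quoting appears to have wrapped them) and that a "broken links"
line always follows the object error it belongs to, as in the quoted output.
The name summarize_fsck is hypothetical and not part of git.

```python
import re
from collections import defaultdict

# Assumed message shapes, taken from the `git fsck` output quoted above.
TYPE_MISMATCH = re.compile(r"^error: object ([0-9a-f]{40}) is a (\w+), not a (\w+)$")
BROKEN_TREE = re.compile(r"^error in tree ([0-9a-f]{40}): broken links$")


def summarize_fsck(output: str) -> dict:
    """Map each mistyped object id to the trees reported right after it."""
    summary = defaultdict(list)
    current = None  # object id from the most recent type-mismatch line
    for raw in output.splitlines():
        line = raw.strip()
        mismatch = TYPE_MISMATCH.match(line)
        if mismatch:
            current = mismatch.group(1)
            continue
        broken = BROKEN_TREE.match(line)
        if broken and current is not None:
            summary[current].append(broken.group(1))
    return dict(summary)


if __name__ == "__main__":
    sample = "\n".join([
        "error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob",
        "error in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1: broken links",
        "error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob",
        "error in tree 2808b286c2a933e88735d97416e29b9514fc6af2: broken links",
    ])
    for obj, trees in summarize_fsck(sample).items():
        print(obj, "is referenced by", len(trees), "tree(s)")
```

Read this way, the quoted report describes one object that several trees
reference as a blob although it is stored as a commit.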

> -- Story 2 --
> A few years ago, I had another repo that wasn't used for a couple years
> and had corrupted blobs. The repo looked fine until I tried to clone
> from it. Unfortunately it was the only copy and I had to write some
> code to "guess" the blob's contents to repair the repo.
>
> If 'git gc' detected corruption, I would have known about the problem
> earlier, when I still had other copies around.

I wonder if this and other issues you encountered wouldn't need a full
"fsck", but merely gc triggering a complete repack. Which is not to say
that some regular background "fsck" wouldn't be a good idea...
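
Until something like this exists in git itself, a tool that already invokes
'git gc' could run 'git fsck' right after it. The following is a minimal
sketch of that two-step approach, not a git feature: the function name
gc_then_verify and its run parameter are hypothetical, and it does not get
the performance benefit the suggestion hopes for, since the two commands
read the repository separately.

```python
import subprocess
from typing import Callable


def gc_then_verify(
    repo: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run `git gc`, then `git fsck --full`; True only if both exit with 0."""
    for args in (["gc", "--quiet"], ["fsck", "--full"]):
        result = run(["git", "-C", repo, *args], capture_output=True, text=True)
        if result.returncode != 0:
            # Surface whatever git reported so the caller can show it.
            print(result.stderr, end="")
            return False
    return True


if __name__ == "__main__":
    import sys

    ok = gc_then_verify(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("repository looks healthy" if ok else "gc or fsck reported a problem")
```

The run parameter exists only so the command runner can be swapped out, for
example in tests; by default it is subprocess.run.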

> -- Story 3 --
> Also a few years ago, I had a repo with a single corrupted blob. I don't
> remember why, but simply re-cloning it was a headache. I managed to fix
> the repo by issuing a command to re-download a blob from the remote. Git
> could totally do that itself, I think.

Yes, we still definitely have cases where dealing with this sort of
thing can be very painful.
