From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [PATCH 0/16] enabling GIT_REF_PARANOIA by default
Date: Fri, 24 Sep 2021 14:30:16 -0400
Message-ID: <YU4ZOF9+ubmoItmK@coredump.intra.peff.net>
I recently ran into a situation where dealing with a corrupted
repository was more confusing than necessary, because Git by default
ignores corrupted refs in many commands.
A while ago we introduced GIT_REF_PARANOIA, which works by including
broken refs in iteration, which then typically causes later operations
to fail (e.g., during repacking, you'd prefer to barf loudly when trying
to access the missing object rather than incorrectly assume the objects
from the broken ref aren't reachable).
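
To make the difference concrete, here is a minimal toy sketch of the two
behaviors. It is not Git's code: `struct toy_ref`, `toy_walk_refs()`, and
the `paranoia` parameter are names invented for illustration, and a "broken"
ref is modeled simply as one whose object is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a ref; none of these names come from Git itself. */
struct toy_ref {
	const char *name;
	int object_exists; /* 0 models a ref pointing at a missing object */
};

/*
 * Walk the refs and "use" each one. With paranoia off, broken refs are
 * silently skipped, so the caller works from an incomplete ref set.
 * With paranoia on, the broken ref is passed along and the operation
 * fails loudly when it tries to use it.
 *
 * Returns the number of refs processed, or -1 on a loud failure.
 */
static int toy_walk_refs(const struct toy_ref *refs, size_t nr, int paranoia)
{
	int processed = 0;
	size_t i;

	assert(refs != NULL);
	for (i = 0; i < nr; i++) {
		if (!refs[i].object_exists) {
			if (!paranoia)
				continue; /* quietly ignored */
			fprintf(stderr, "error: %s points at a missing object\n",
				refs[i].name);
			return -1;
		}
		processed++;
	}
	return processed;
}
```

In this model, a repack-like caller with paranoia off would conclude that
only the processed refs matter, which is exactly the "incorrectly assume
the objects aren't reachable" case described above.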
I think this is a better default for Git to have in general, not just
for a few select operations (we turn it on by default for pruning and
some repacks). We shouldn't see corruptions in general, and complaining
loudly when we do is the safest option. The reason we held back when the
knob was introduced was mostly out of deference to the historical
behavior.
So this series started as a patch to just flip that default, but I found
some interesting things:
  - there are a couple of tests that get confused. IMHO this is
    vindicating the idea of flipping the default, because in each case
    these tests were poorly written (either corruptions they didn't
    realize they had, or doing questionable operations on an incomplete
    set of refs)
  - the existing GIT_REF_PARANOIA is over-eager to complain about
    dangling symrefs, even though they're perfectly fine
  - as usual, there was some obvious cleanup along the way. ;)
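
The dangling-symref point can be sketched as a small filter. This is a toy
model under stated assumptions, not the series' implementation: the
`toy_ref_kind` enum, the `TOY_*` flags, and `toy_should_visit()` are
hypothetical names. A dangling symref here means a symbolic ref whose
target does not exist yet (HEAD pointing at an unborn branch in a fresh
repository is one such case).

```c
#include <assert.h>

/* Hypothetical classification of a ref in this toy model. */
enum toy_ref_kind {
	TOY_REF_OK,             /* resolves to an existing object */
	TOY_REF_BROKEN,         /* points at a missing object */
	TOY_REF_DANGLING_SYMREF /* symref whose target ref does not exist */
};

/* Hypothetical iteration flags, loosely modeled on the ideas above. */
#define TOY_INCLUDE_BROKEN        (1u << 0)
#define TOY_OMIT_DANGLING_SYMREFS (1u << 1)

/* Decide whether iteration should hand this ref to the caller. */
static int toy_should_visit(enum toy_ref_kind kind, unsigned flags)
{
	/* Only the two flags defined above are understood. */
	assert((flags & ~(TOY_INCLUDE_BROKEN | TOY_OMIT_DANGLING_SYMREFS)) == 0);

	if (kind == TOY_REF_OK)
		return 1;
	/* A dangling symref is harmless, so it can be dropped on request. */
	if (kind == TOY_REF_DANGLING_SYMREF &&
	    (flags & TOY_OMIT_DANGLING_SYMREFS))
		return 0;
	/* Everything else that fails to resolve counts as broken. */
	return (flags & TOY_INCLUDE_BROKEN) != 0;
}
```

With only `TOY_INCLUDE_BROKEN` set, the dangling symref is visited and a
later step would complain about it; adding `TOY_OMIT_DANGLING_SYMREFS`
keeps the loud failure for truly broken refs while letting the harmless
case pass quietly.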
Even if you don't buy the argument that we should flip the default, I
think everything up through patch 11 is a worthwhile cleanup on its own.
Note that this conflicts with jt/no-abuse-alternate-odb-for-submodules,
since it is touching the innards of DO_FOR_EACH_REF_INCLUDE_BROKEN, too.
I left a note on that series about how I think that could be reconciled
(i.e., the conflict is just around how the code is written, and not
inherent to the goals).
In the end I left GIT_REF_PARANOIA as a knob, just defaulting to "1". I
think it's possibly useful as an escape hatch when dealing with a
corrupt repo. But we _could_ go all the way and basically drop
DO_FOR_EACH_REF_INCLUDE_BROKEN's do-we-have-the-object check entirely.
That would totally sever the relationship between the ref store and the
object store, which would make things conceptually a lot simpler (and
which I saw discussed in some of those earlier threads).
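
As a rough illustration of "a knob that defaults to 1", the sketch below
reads an environment variable and falls back to a default when it is
unset. The helper names `toy_parse_knob()` and `toy_ref_paranoia()` are
hypothetical, and the parsing is deliberately simplified (only "0" turns
the knob off); Git's real boolean parsing accepts more spellings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Interpret a knob value: NULL (variable unset) yields the default,
 * "0" turns the knob off, and anything else turns it on.
 * Simplified on purpose; this is not Git's boolean parser.
 */
static int toy_parse_knob(const char *value, int def)
{
	assert(def == 0 || def == 1);
	if (!value)
		return def;
	return strcmp(value, "0") != 0;
}

/* Paranoia is on unless the user explicitly opts out. */
static int toy_ref_paranoia(void)
{
	return toy_parse_knob(getenv("GIT_REF_PARANOIA"), 1);
}
```

Under this model, running a command with `GIT_REF_PARANOIA=0` in the
environment is the escape hatch, and leaving the variable unset gives the
new, stricter default.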
Just a breakdown of the series:

  [01/16]: t7900: clean up some more broken refs
  [02/16]: t5516: don't use HEAD ref for invalid ref-deletion tests
  [03/16]: t5600: provide detached HEAD for corruption failures
  [04/16]: t5312: drop "verbose" helper
  [05/16]: t5312: create bogus ref as necessary
  [06/16]: t5312: test non-destructive repack
  [07/16]: t5312: be more assertive about command failure

    Test cleanups. Necessary for the default flip, but I think each
    stands on its own.

  [08/16]: refs-internal.h: move DO_FOR_EACH_* flags next to each other
  [09/16]: refs-internal.h: reorganize DO_FOR_EACH_* flag documentation

    Cleanup of existing features.

  [10/16]: refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
  [11/16]: refs: omit dangling symrefs when using GIT_REF_PARANOIA

    Fixing the current over-eager behavior of GIT_REF_PARANOIA.

  [12/16]: refs: turn on GIT_REF_PARANOIA by default

    The actual flip.

  [13/16]: repack, prune: drop GIT_REF_PARANOIA settings
  [14/16]: ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
  [15/16]: ref-filter: drop broken-ref code entirely
  [16/16]: refs: drop "broken" flag from for_each_fullref_in()

    Some small cleanups we can do as a result.

 Documentation/git.txt         | 19 ++++++------
 builtin/branch.c              |  2 +-
 builtin/for-each-ref.c        |  2 +-
 builtin/prune.c               |  1 -
 builtin/repack.c              |  3 --
 builtin/rev-parse.c           |  4 +--
 cache.h                       |  8 -----
 environment.c                 |  1 -
 ls-refs.c                     |  2 +-
 ref-filter.c                  | 22 ++++++--------
 ref-filter.h                  |  1 -
 refs.c                        | 42 +++++++++++++-------------
 refs.h                        |  9 ++----
 refs/files-backend.c          |  5 ++++
 refs/refs-internal.h          | 56 ++++++++++++++++++++++-------------
 revision.c                    |  2 +-
 t/t1430-bad-ref-name.sh       |  2 +-
 t/t5312-prune-corruption.sh   | 48 ++++++++++++++++++++++--------
 t/t5516-fetch-push.sh         | 19 ++++++------
 t/t5600-clone-fail-cleanup.sh |  4 ++-
 t/t7900-maintenance.sh        |  6 +++-
 21 files changed, 142 insertions(+), 116 deletions(-)

-Peff
Thread overview: 22+ messages
2021-09-24 18:30 Jeff King [this message]
2021-09-24 18:32 ` [PATCH 01/16] t7900: clean up some more broken refs Jeff King
2021-09-27 17:38 ` Jonathan Tan
2021-09-27 19:49 ` Jeff King
2021-09-24 18:33 ` [PATCH 02/16] t5516: don't use HEAD ref for invalid ref-deletion tests Jeff King
2021-09-24 18:34 ` [PATCH 03/16] t5600: provide detached HEAD for corruption failures Jeff King
2021-09-24 18:35 ` [PATCH 04/16] t5312: drop "verbose" helper Jeff King
2021-09-24 18:36 ` [PATCH 05/16] t5312: create bogus ref as necessary Jeff King
2021-09-24 18:36 ` [PATCH 06/16] t5312: test non-destructive repack Jeff King
2021-09-24 18:37 ` [PATCH 07/16] t5312: be more assertive about command failure Jeff King
2021-09-24 18:37 ` [PATCH 08/16] refs-internal.h: move DO_FOR_EACH_* flags next to each other Jeff King
2021-09-24 18:39 ` [PATCH 09/16] refs-internal.h: reorganize DO_FOR_EACH_* flag documentation Jeff King
2021-09-24 18:41 ` [PATCH 10/16] refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag Jeff King
2021-09-24 18:42 ` [PATCH 11/16] refs: omit dangling symrefs when using GIT_REF_PARANOIA Jeff King
2021-09-24 18:46 ` [PATCH 12/16] refs: turn on GIT_REF_PARANOIA by default Jeff King
2021-09-27 17:42 ` Jonathan Tan
2021-09-24 18:46 ` [PATCH 13/16] repack, prune: drop GIT_REF_PARANOIA settings Jeff King
2021-09-24 18:48 ` [PATCH 14/16] ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN Jeff King
2021-09-24 18:48 ` [PATCH 15/16] ref-filter: drop broken-ref code entirely Jeff King
2021-09-24 18:48 ` [PATCH 16/16] refs: drop "broken" flag from for_each_fullref_in() Jeff King
2021-09-27 17:47 ` Jonathan Tan
2021-09-24 20:22 ` [PATCH 0/16] enabling GIT_REF_PARANOIA by default Junio C Hamano