* [PATCH 0/2] Fix fsck --name-objects bug
@ 2021-02-10 18:01 Johannes Schindelin via GitGitGadget
2021-02-10 18:01 ` [PATCH 1/2] t1450: robustify `remove_object()` Johannes Schindelin via GitGitGadget
2021-02-10 18:01 ` [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers Johannes Schindelin via GitGitGadget
0 siblings, 2 replies; 7+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-02-10 18:01 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
As described in https://github.com/gitgitgadget/git/pull/873, I managed to
corrupt my Git checkout in a rather thorough manner yesterday, and it took
me a long time to undo the damage. One of my tools for that should have been
git fsck --name-objects, but that command produced bogus output: It said
that a commit with the name ~2 had a missing tree, but ~2 is not even a
legal rev name.
Turns out that this is an ancient bug, and the fact that nobody complained
about it suggests to me that the --name-objects probably has exactly one
user, and he uses it only every four years, when he manages to hose his Git
checkout.
Johannes Schindelin (2):
t1450: robustify `remove_object()`
fsck --name-objects: be more careful parsing generation numbers
fsck.c | 5 +++++
t/t1450-fsck.sh | 26 ++++++++++++--------------
2 files changed, 17 insertions(+), 14 deletions(-)
base-commit: 7397ca33730626f682845f8691b39c305535611e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-874%2Fdscho%2Ffix-fsck-name-objects-generation-parsing-bug-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-874/dscho/fix-fsck-name-objects-generation-parsing-bug-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/874
--
gitgitgadget
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] t1450: robustify `remove_object()`
2021-02-10 18:01 [PATCH 0/2] Fix fsck --name-objects bug Johannes Schindelin via GitGitGadget
@ 2021-02-10 18:01 ` Johannes Schindelin via GitGitGadget
2021-02-10 20:36 ` Junio C Hamano
2021-02-10 18:01 ` [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers Johannes Schindelin via GitGitGadget
1 sibling, 1 reply; 7+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-02-10 18:01 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
This function can be simplified by using the `test_oid_to_path()`
helper, which incidentally also makes it more robust by not relying on
the exact file system layout of the loose object files.
While at it, do not define those functions in a test case, it buys us
nothing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1450-fsck.sh | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 02478bc4ece2..779f700ac4a0 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -41,17 +41,13 @@ test_expect_success 'HEAD is part of refs, valid objects appear valid' '
# specific corruption you test afterwards, lest a later test trip over
# it.
-test_expect_success 'setup: helpers for corruption tests' '
- sha1_file() {
- remainder=${1#??} &&
- firsttwo=${1%$remainder} &&
- echo ".git/objects/$firsttwo/$remainder"
- } &&
+sha1_file () {
+ git rev-parse --git-path objects/$(test_oid_to_path "$1")
+}
- remove_object() {
- rm "$(sha1_file "$1")"
- }
-'
+remove_object() {
+ rm "$(sha1_file "$1")"
+}
test_expect_success 'object with bad sha1' '
sha=$(echo blob | git hash-object -w --stdin) &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers
2021-02-10 18:01 [PATCH 0/2] Fix fsck --name-objects bug Johannes Schindelin via GitGitGadget
2021-02-10 18:01 ` [PATCH 1/2] t1450: robustify `remove_object()` Johannes Schindelin via GitGitGadget
@ 2021-02-10 18:01 ` Johannes Schindelin via GitGitGadget
2021-02-10 20:38 ` Junio C Hamano
1 sibling, 1 reply; 7+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-02-10 18:01 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
In 7b35efd734e (fsck_walk(): optionally name objects on the go,
2016-07-17), the `fsck` machinery learned to optionally name the
objects, so that it is easier to see what part of the repository is in a
bad shape, say, when objects are missing.
To save on complexity, this machinery uses a parser to determine the
name of a parent given a commit's name: any `~<n>` suffix is parsed and
the parent's name is formed from the prefix together with `~<n+1>`.
However, this parser has a bug: if it finds a suffix `<n>` that is _not_
`~<n>`, it will mistake the empty string for the prefix and `<n>` for
the generation number. In other words, it will generate a name of the
form `~<bogus-number>`.
Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
fsck.c | 5 +++++
t/t1450-fsck.sh | 10 ++++++----
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/fsck.c b/fsck.c
index 73f30773f28a..83d727c6fe33 100644
--- a/fsck.c
+++ b/fsck.c
@@ -461,6 +461,11 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio
generation += power * (name[--len] - '0');
if (power > 1 && len && name[len - 1] == '~')
name_prefix_len = len - 1;
+ else {
+ /* Maybe a non-first parent, e.g. HEAD^2 */
+ generation = 0;
+ name_prefix_len = len;
+ }
}
}
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 779f700ac4a0..bfa3588f37ab 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -607,13 +607,15 @@ test_expect_success 'fsck --name-objects' '
git init name-objects &&
(
cd name-objects &&
+ git config core.logAllRefUpdates false &&
test_commit julius caesar.t &&
- test_commit augustus &&
- test_commit caesar &&
+ test_commit augustus44 &&
+ test_commit caesar &&
remove_object $(git rev-parse julius:caesar.t) &&
- test_must_fail git fsck --name-objects >out &&
tree=$(git rev-parse --verify julius:) &&
- test_i18ngrep "$tree (refs/tags/julius:" out
+ git tag -d julius &&
+ test_must_fail git fsck --name-objects >out &&
+ test_i18ngrep "$tree (refs/tags/augustus44\\^:" out
)
'
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] t1450: robustify `remove_object()`
2021-02-10 18:01 ` [PATCH 1/2] t1450: robustify `remove_object()` Johannes Schindelin via GitGitGadget
@ 2021-02-10 20:36 ` Junio C Hamano
2021-02-10 23:20 ` Taylor Blau
0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2021-02-10 20:36 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> -test_expect_success 'setup: helpers for corruption tests' '
> - sha1_file() {
> - remainder=${1#??} &&
> - firsttwo=${1%$remainder} &&
> - echo ".git/objects/$firsttwo/$remainder"
> - } &&
> +sha1_file () {
> + git rev-parse --git-path objects/$(test_oid_to_path "$1")
> +}
Yeah, back when 90cf590f (fsck: optionally show more helpful info
for broken links, 2016-07-17) originally introduced this pattern,
we didn't have nicely abstracted helper, but now we do, and there
is no reason not to use it. Nice.
> - remove_object() {
> - rm "$(sha1_file "$1")"
> - }
> -'
> +remove_object() {
Just like you did for the other one, let's insert SP before () for
consistency here.
> + rm "$(sha1_file "$1")"
> +}
>
> test_expect_success 'object with bad sha1' '
> sha=$(echo blob | git hash-object -w --stdin) &&
Nicely done.
Reviewed-by: Junio C Hamano <gitster@pobox.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers
2021-02-10 18:01 ` [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers Johannes Schindelin via GitGitGadget
@ 2021-02-10 20:38 ` Junio C Hamano
0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2021-02-10 20:38 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In 7b35efd734e (fsck_walk(): optionally name objects on the go,
> 2016-07-17), the `fsck` machinery learned to optionally name the
> objects, so that it is easier to see what part of the repository is in a
> bad shape, say, when objects are missing.
>
> To save on complexity, this machinery uses a parser to determine the
> name of a parent given a commit's name: any `~<n>` suffix is parsed and
> the parent's name is formed from the prefix together with `~<n+1>`.
>
> However, this parser has a bug: if it finds a suffix `<n>` that is _not_
> `~<n>`, it will mistake the empty string for the prefix and `<n>` for
> the generation number. In other words, it will generate a name of the
> form `~<bogus-number>`.
>
> Let's fix this.
Thanks; will queue.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> fsck.c | 5 +++++
> t/t1450-fsck.sh | 10 ++++++----
> 2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 73f30773f28a..83d727c6fe33 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -461,6 +461,11 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio
> generation += power * (name[--len] - '0');
> if (power > 1 && len && name[len - 1] == '~')
> name_prefix_len = len - 1;
> + else {
> + /* Maybe a non-first parent, e.g. HEAD^2 */
> + generation = 0;
> + name_prefix_len = len;
> + }
> }
> }
>
> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 779f700ac4a0..bfa3588f37ab 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -607,13 +607,15 @@ test_expect_success 'fsck --name-objects' '
> git init name-objects &&
> (
> cd name-objects &&
> + git config core.logAllRefUpdates false &&
> test_commit julius caesar.t &&
> - test_commit augustus &&
> - test_commit caesar &&
> + test_commit augustus44 &&
> + test_commit caesar &&
> remove_object $(git rev-parse julius:caesar.t) &&
> - test_must_fail git fsck --name-objects >out &&
> tree=$(git rev-parse --verify julius:) &&
> - test_i18ngrep "$tree (refs/tags/julius:" out
> + git tag -d julius &&
> + test_must_fail git fsck --name-objects >out &&
> + test_i18ngrep "$tree (refs/tags/augustus44\\^:" out
> )
> '
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] t1450: robustify `remove_object()`
2021-02-10 20:36 ` Junio C Hamano
@ 2021-02-10 23:20 ` Taylor Blau
2021-02-11 0:10 ` Junio C Hamano
0 siblings, 1 reply; 7+ messages in thread
From: Taylor Blau @ 2021-02-10 23:20 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin
On Wed, Feb 10, 2021 at 12:36:19PM -0800, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > -test_expect_success 'setup: helpers for corruption tests' '
> > - sha1_file() {
> > - remainder=${1#??} &&
> > - firsttwo=${1%$remainder} &&
> > - echo ".git/objects/$firsttwo/$remainder"
> > - } &&
> > +sha1_file () {
> > + git rev-parse --git-path objects/$(test_oid_to_path "$1")
> > +}
>
> Yeah, back when 90cf590f (fsck: optionally show more helpful info
> for broken links, 2016-07-17) originally introduced this pattern,
> we didn't have nicely abstracted helper, but now we do, and there
> is no reason not to use it. Nice.
This has nothing to do with this series, but I do notice a number of
other uses of test_oid_to_path that are doing this exact thing. In fact,
many of them don't use "git rev-parse --git-path", which would be
better.
I wonder if it's worth a clean-up on top to consolidate all of those
"combine the loose object path of the object with xyz OID and the
$GIT_DIR/objects directory".
In either case -- and I think I'm pretty clearly being pedantic at this
point -- do you suppose it's worthwhile to rename sha1_file to something
that doesn't have sha1 in it?
Thanks,
Taylor
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] t1450: robustify `remove_object()`
2021-02-10 23:20 ` Taylor Blau
@ 2021-02-11 0:10 ` Junio C Hamano
0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2021-02-11 0:10 UTC (permalink / raw)
To: Taylor Blau
Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin
Taylor Blau <me@ttaylorr.com> writes:
> On Wed, Feb 10, 2021 at 12:36:19PM -0800, Junio C Hamano wrote:
>> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
>> writes:
>>
>> > -test_expect_success 'setup: helpers for corruption tests' '
>> > - sha1_file() {
>> > - remainder=${1#??} &&
>> > - firsttwo=${1%$remainder} &&
>> > - echo ".git/objects/$firsttwo/$remainder"
>> > - } &&
>> > +sha1_file () {
>> > + git rev-parse --git-path objects/$(test_oid_to_path "$1")
>> > +}
>>
>> Yeah, back when 90cf590f (fsck: optionally show more helpful info
>> for broken links, 2016-07-17) originally introduced this pattern,
>> we didn't have nicely abstracted helper, but now we do, and there
>> is no reason not to use it. Nice.
>
> This has nothing to do with this series, but I do notice a number of
> other uses of test_oid_to_path that are doing this exact thing. In fact,
> many of them don't use "git rev-parse --git-path", which would be
> better.
>
> I wonder if it's worth a clean-up on top to consolidate all of those
> "combine the loose object path of the object with xyz OID and the
> $GIT_DIR/objects directory".
>
> In either case -- and I think I'm pretty clearly being pedantic at this
> point -- do you suppose it's worthwhile to rename sha1_file to something
> that doesn't have sha1 in it?
Possibly. That is probably outside the scope of this topic, but we
see such SHA -> HASH clean-up patches in different places, and this
certainly is a fair game for such a clean-up, I would think.
Thanks.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-02-11 0:11 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-10 18:01 [PATCH 0/2] Fix fsck --name-objects bug Johannes Schindelin via GitGitGadget
2021-02-10 18:01 ` [PATCH 1/2] t1450: robustify `remove_object()` Johannes Schindelin via GitGitGadget
2021-02-10 20:36 ` Junio C Hamano
2021-02-10 23:20 ` Taylor Blau
2021-02-11 0:10 ` Junio C Hamano
2021-02-10 18:01 ` [PATCH 2/2] fsck --name-objects: be more careful parsing generation numbers Johannes Schindelin via GitGitGadget
2021-02-10 20:38 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).