git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
@ 2021-08-23 17:10 Johannes Berg
  2021-08-23 22:44 ` Junio C Hamano
  2021-08-24  0:22 ` Taylor Blau
  0 siblings, 2 replies; 6+ messages in thread
From: Johannes Berg @ 2021-08-23 17:10 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee, Jeff King

If using --object-dir to point into a repo while the current
working dir is outside, such as

  git init /repo
  git -C /repo ... # add some objects
  cd /non-repo
  git multi-pack-index --object-dir /repo/.git/objects/ write

the binary will segfault trying to access the object-dir via
the repo it found, but that's not fully initialized. Fix it
to use the object_dir properly to clean up the *.rev files,
this avoids the crash and cleans up the *.rev files for the
now rewritten multi-pack-index properly.

Fixes: 38ff7cabb6b8 ("pack-revindex: write multi-pack reverse indexes")
Cc: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
v3:
 - use nongit
---
 midx.c                      | 10 +++++-----
 t/t5319-multi-pack-index.sh | 15 +++++++++++++++
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/midx.c b/midx.c
index 321c6fdd2f18..902e1a7a7d9d 100644
--- a/midx.c
+++ b/midx.c
@@ -882,7 +882,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 	strbuf_release(&buf);
 }
 
-static void clear_midx_files_ext(struct repository *r, const char *ext,
+static void clear_midx_files_ext(const char *object_dir, const char *ext,
 				 unsigned char *keep_hash);
 
 static int midx_checksum_valid(struct multi_pack_index *m)
@@ -1086,7 +1086,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 
 	if (flags & MIDX_WRITE_REV_INDEX)
 		write_midx_reverse_index(midx_name, midx_hash, &ctx);
-	clear_midx_files_ext(the_repository, ".rev", midx_hash);
+	clear_midx_files_ext(object_dir, ".rev", midx_hash);
 
 	commit_lock_file(&lk);
 
@@ -1135,7 +1135,7 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len,
 		die_errno(_("failed to remove %s"), full_path);
 }
 
-static void clear_midx_files_ext(struct repository *r, const char *ext,
+static void clear_midx_files_ext(const char *object_dir, const char *ext,
 				 unsigned char *keep_hash)
 {
 	struct clear_midx_data data;
@@ -1146,7 +1146,7 @@ static void clear_midx_files_ext(struct repository *r, const char *ext,
 				    hash_to_hex(keep_hash), ext);
 	data.ext = ext;
 
-	for_each_file_in_pack_dir(r->objects->odb->path,
+	for_each_file_in_pack_dir(object_dir,
 				  clear_midx_file_ext,
 				  &data);
 
@@ -1165,7 +1165,7 @@ void clear_midx_file(struct repository *r)
 	if (remove_path(midx))
 		die(_("failed to clear multi-pack-index at %s"), midx);
 
-	clear_midx_files_ext(r, ".rev", NULL);
+	clear_midx_files_ext(r->objects->odb->path, ".rev", NULL);
 
 	free(midx);
 }
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c31b..665c6d64a0ab 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -201,6 +201,21 @@ test_expect_success 'write midx with twelve packs' '
 
 compare_results_with_midx "twelve packs"
 
+test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' '
+	git init objdir-test-repo &&
+	test_when_finished "rm -rf objdir-test-repo" &&
+	(
+		cd objdir-test-repo &&
+		test_commit base &&
+		git repack -d
+	) &&
+	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
+	touch $rev &&
+	nongit git multi-pack-index --object-dir="$(pwd)/objdir-test-repo/$objdir" write &&
+	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
+	test_path_is_missing $rev
+'
+
 test_expect_success 'warn on improper hash version' '
 	git init --object-format=sha1 sha1 &&
 	(
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:10 [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir Johannes Berg
@ 2021-08-23 22:44 ` Junio C Hamano
  2021-08-24  7:59   ` Johannes Berg
  2021-08-24  0:22 ` Taylor Blau
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2021-08-23 22:44 UTC (permalink / raw)
  To: Johannes Berg; +Cc: git, Taylor Blau, Derrick Stolee, Jeff King

Johannes Berg <johannes@sipsolutions.net> writes:

> If using --object-dir to point into a repo while the current
> working dir is outside, such as
>
>   git init /repo
>   git -C /repo ... # add some objects
>   cd /non-repo
>   git multi-pack-index --object-dir /repo/.git/objects/ write
>
> the binary will segfault trying to access the object-dir via
> the repo it found, but that's not fully initialized. Fix it

OK, so write_midx_internal() was given an object_dir to work in,
made various changes to that directory, but at the very end of the
sequence, instead of clearing the revindex in the object_dir we have
been working in, cleared the odb associated with the repository.

Initialized or not, that indeed is very wrong.

And the new code looks obviously correct, with minimal changes.

Thanks for finding and fixing.

> to use the object_dir properly to clean up the *.rev files,
> this avoids the crash and cleans up the *.rev files for the
> now rewritten multi-pack-index properly.
>
> Fixes: 38ff7cabb6b8 ("pack-revindex: write multi-pack reverse indexes")
> Cc: Taylor Blau <me@ttaylorr.com>
> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
> ---
> v3:
>  - use nongit
> ---
>  midx.c                      | 10 +++++-----
>  t/t5319-multi-pack-index.sh | 15 +++++++++++++++
>  2 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 321c6fdd2f18..902e1a7a7d9d 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -882,7 +882,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
>  	strbuf_release(&buf);
>  }
>  
> -static void clear_midx_files_ext(struct repository *r, const char *ext,
> +static void clear_midx_files_ext(const char *object_dir, const char *ext,
>  				 unsigned char *keep_hash);
>  
>  static int midx_checksum_valid(struct multi_pack_index *m)
> @@ -1086,7 +1086,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
>  
>  	if (flags & MIDX_WRITE_REV_INDEX)
>  		write_midx_reverse_index(midx_name, midx_hash, &ctx);
> -	clear_midx_files_ext(the_repository, ".rev", midx_hash);
> +	clear_midx_files_ext(object_dir, ".rev", midx_hash);
>  
>  	commit_lock_file(&lk);
>  
> @@ -1135,7 +1135,7 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len,
>  		die_errno(_("failed to remove %s"), full_path);
>  }
>  
> -static void clear_midx_files_ext(struct repository *r, const char *ext,
> +static void clear_midx_files_ext(const char *object_dir, const char *ext,
>  				 unsigned char *keep_hash)
>  {
>  	struct clear_midx_data data;
> @@ -1146,7 +1146,7 @@ static void clear_midx_files_ext(struct repository *r, const char *ext,
>  				    hash_to_hex(keep_hash), ext);
>  	data.ext = ext;
>  
> -	for_each_file_in_pack_dir(r->objects->odb->path,
> +	for_each_file_in_pack_dir(object_dir,
>  				  clear_midx_file_ext,
>  				  &data);
>  
> @@ -1165,7 +1165,7 @@ void clear_midx_file(struct repository *r)
>  	if (remove_path(midx))
>  		die(_("failed to clear multi-pack-index at %s"), midx);
>  
> -	clear_midx_files_ext(r, ".rev", NULL);
> +	clear_midx_files_ext(r->objects->odb->path, ".rev", NULL);
>  
>  	free(midx);
>  }
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 3d4d9f10c31b..665c6d64a0ab 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -201,6 +201,21 @@ test_expect_success 'write midx with twelve packs' '
>  
>  compare_results_with_midx "twelve packs"
>  
> +test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' '
> +	git init objdir-test-repo &&
> +	test_when_finished "rm -rf objdir-test-repo" &&
> +	(
> +		cd objdir-test-repo &&
> +		test_commit base &&
> +		git repack -d
> +	) &&
> +	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
> +	touch $rev &&
> +	nongit git multi-pack-index --object-dir="$(pwd)/objdir-test-repo/$objdir" write &&
> +	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
> +	test_path_is_missing $rev
> +'
> +
>  test_expect_success 'warn on improper hash version' '
>  	git init --object-format=sha1 sha1 &&
>  	(

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:10 [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir Johannes Berg
  2021-08-23 22:44 ` Junio C Hamano
@ 2021-08-24  0:22 ` Taylor Blau
  2021-08-24  7:50   ` Johannes Berg
  1 sibling, 1 reply; 6+ messages in thread
From: Taylor Blau @ 2021-08-24  0:22 UTC (permalink / raw)
  To: Johannes Berg; +Cc: git, Derrick Stolee, Jeff King

On Mon, Aug 23, 2021 at 07:10:11PM +0200, Johannes Berg wrote:
> If using --object-dir to point into a repo while the current
> working dir is outside, such as
>
>   git init /repo
>   git -C /repo ... # add some objects
>   cd /non-repo
>   git multi-pack-index --object-dir /repo/.git/objects/ write
>
> the binary will segfault trying to access the object-dir via
> the repo it found, but that's not fully initialized. Fix it
> to use the object_dir properly to clean up the *.rev files,
> this avoids the crash and cleans up the *.rev files for the
> now rewritten multi-pack-index properly.

Thanks for splitting this up and clarifying the bug. This matches my
understanding, and is careful to get the "--object-dir write" vs "write
--object-dir" (where the former segfaults, and the latter triggers a
BUG()) distinction right.

This looks good to me, but I had one suggestion for an additional test
we should consider before picking this up.

> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 3d4d9f10c31b..665c6d64a0ab 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -201,6 +201,21 @@ test_expect_success 'write midx with twelve packs' '
>
>  compare_results_with_midx "twelve packs"
>
> +test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' '
> +	git init objdir-test-repo &&
> +	test_when_finished "rm -rf objdir-test-repo" &&
> +	(
> +		cd objdir-test-repo &&
> +		test_commit base &&
> +		git repack -d
> +	) &&
> +	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
> +	touch $rev &&

This is the only non-obvious part of the patch, but is necessary because
there's no way to trigger the MIDX code to write a reverse index
(thankfully so, since this means that we're not affecting anybody in the
wild cleaning up .rev's that we shouldn't be).

It may be worth returning to this in the future when we have support for
MIDX bitmaps (which will trigger writing a .rev file), but this is
absolutely the right thing to do in the meantime.

> +	nongit git multi-pack-index --object-dir="$(pwd)/objdir-test-repo/$objdir" write &&
> +	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
> +	test_path_is_missing $rev

Makes sense. There's no point in testing that we ignore a .rev file in
the outer repository, since we're using nongit to trigger this bug.

But it may be worth adding an additional test which doesn't use nongit,
and instead invokes 'git multi-pack-index' from a Git repository, but
points at another repo's object directory. That should give us some
confidence that we're not deleting .rev files that we shouldn't.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-24  0:22 ` Taylor Blau
@ 2021-08-24  7:50   ` Johannes Berg
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Berg @ 2021-08-24  7:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Jeff King

On Mon, 2021-08-23 at 20:22 -0400, Taylor Blau wrote:
> 
> > +	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
> > +	touch $rev &&
> 
> This is the only non-obvious part of the patch, but is necessary because
> there's no way to trigger the MIDX code to write a reverse index
> (thankfully so, since this means that we're not affecting anybody in the
> wild cleaning up .rev's that we shouldn't be).
> 
> It may be worth returning to this in the future when we have support for
> MIDX bitmaps (which will trigger writing a .rev file)

No argument there, though it doesn't matter much for this test how you
arrive at a repo that has a .rev file.

> > +	nongit git multi-pack-index --object-dir="$(pwd)/objdir-test-repo/$objdir" write &&
> > +	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
> > +	test_path_is_missing $rev
> 
> Makes sense. There's no point in testing that we ignore a .rev file in
> the outer repository, since we're using nongit to trigger this bug.
> 
> But it may be worth adding an additional test which doesn't use nongit,
> and instead invokes 'git multi-pack-index' from a Git repository, but
> points at another repo's object directory. That should give us some
> confidence that we're not deleting .rev files that we shouldn't.

Maybe you can just send that as a separate follow-up patch? :)

I'm not _entirely_ sure what you'd want to test, you could do at least
these things:

 * test like this that the correct file is deleted, from another repo
   instead of nongit
 * additionally arrange the *other* repo to have a .rev file and check
   that it's *not* deleted?

But to me all of the three (including my test) seem quite equivalent, at
least as long as we assume that the code won't grow a "try to delete all
the .rev files anywhere I can find" thing :)

johannes



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 22:44 ` Junio C Hamano
@ 2021-08-24  7:59   ` Johannes Berg
  2021-08-24 19:01     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Berg @ 2021-08-24  7:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Taylor Blau, Derrick Stolee, Jeff King

On Mon, 2021-08-23 at 15:44 -0700, Junio C Hamano wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > If using --object-dir to point into a repo while the current
> > working dir is outside, such as
> > 
> >   git init /repo
> >   git -C /repo ... # add some objects
> >   cd /non-repo
> >   git multi-pack-index --object-dir /repo/.git/objects/ write
> > 
> > the binary will segfault trying to access the object-dir via
> > the repo it found, but that's not fully initialized. Fix it
> 
> OK, so write_midx_internal() was given an object_dir to work in,
> made various changes to that directory, but at the very end of the
> sequence, instead of clearing the revindex in the object_dir we have
> been working in, cleared the odb associated with the repository.

I'm not sure I'd claim "cleared the odb" but it's also not entirely
clear to me what you mean by that.

Specifically, what happened is that it cleared out all the .rev files in
the objects/pack folder associated with the repository. And if there
wasn't actually a repository, it would NULL-ptr-deref instead.

Feel free to rewrite the commit log, or I can if you really want me to.
I was more concerned with the segfault, but I can also understand that
you'd be more concerned with the on-disk correctness issue this causes.

johannes


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-24  7:59   ` Johannes Berg
@ 2021-08-24 19:01     ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2021-08-24 19:01 UTC (permalink / raw)
  To: Johannes Berg; +Cc: git, Taylor Blau, Derrick Stolee, Jeff King

Johannes Berg <johannes@sipsolutions.net> writes:

> On Mon, 2021-08-23 at 15:44 -0700, Junio C Hamano wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > If using --object-dir to point into a repo while the current
>> > working dir is outside, such as
>> > 
>> >   git init /repo
>> >   git -C /repo ... # add some objects
>> >   cd /non-repo
>> >   git multi-pack-index --object-dir /repo/.git/objects/ write
>> > 
>> > the binary will segfault trying to access the object-dir via
>> > the repo it found, but that's not fully initialized. Fix it
>> 
>> OK, so write_midx_internal() was given an object_dir to work in,
>> made various changes to that directory, but at the very end of the
>> sequence, instead of clearing the revindex in the object_dir we have
>> been working in, cleared the odb associated with the repository.
>
> I'm not sure I'd claim "cleared the odb" but it's also not entirely
> clear to me what you mean by that.

"cleared the revindex in the wrong odb" is what I meant to say.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-24 19:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-23 17:10 [PATCH v3] multi-pack-index: fix *.rev cleanups with --object-dir Johannes Berg
2021-08-23 22:44 ` Junio C Hamano
2021-08-24  7:59   ` Johannes Berg
2021-08-24 19:01     ` Junio C Hamano
2021-08-24  0:22 ` Taylor Blau
2021-08-24  7:50   ` Johannes Berg

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).