git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
@ 2021-08-23  9:40 Johannes Berg
  2021-08-23 16:06 ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2021-08-23  9:40 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau

If using --object-dir to point into a repo while the current
working dir is outside, such as

  git init /repo
  git -C /repo ... # add some objects
  cd /non-repo
  git multi-pack-index --object-dir /repo/.git/objects/ write

the binary will segfault trying to access the object-dir via
the repo it found, but that's not fully initialized. Fix it
to use the object_dir properly to clean up the *.rev files,
this avoids the crash and cleans up the *.rev files for the
now rewritten multi-pack-index properly.

Fixes: 38ff7cabb6b8 ("pack-revindex: write multi-pack reverse indexes")
Cc: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
Due to running inside git's tree, even with TEST_NO_CREATE_REPO=t
I cannot reproduce the segfault in a test without the "cd /", so
I've kept that. Yes, the test caught in that case that the *.rev
file wasn't cleaned up (due to being initialized to the wrong git
repo [git's] and cleaning up there!), but I wanted to test the
segfault too.
---
 midx.c                      | 10 +++++-----
 t/t5319-multi-pack-index.sh | 19 +++++++++++++++++++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/midx.c b/midx.c
index 321c6fdd2f18..902e1a7a7d9d 100644
--- a/midx.c
+++ b/midx.c
@@ -882,7 +882,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 	strbuf_release(&buf);
 }
 
-static void clear_midx_files_ext(struct repository *r, const char *ext,
+static void clear_midx_files_ext(const char *object_dir, const char *ext,
 				 unsigned char *keep_hash);
 
 static int midx_checksum_valid(struct multi_pack_index *m)
@@ -1086,7 +1086,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 
 	if (flags & MIDX_WRITE_REV_INDEX)
 		write_midx_reverse_index(midx_name, midx_hash, &ctx);
-	clear_midx_files_ext(the_repository, ".rev", midx_hash);
+	clear_midx_files_ext(object_dir, ".rev", midx_hash);
 
 	commit_lock_file(&lk);
 
@@ -1135,7 +1135,7 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len,
 		die_errno(_("failed to remove %s"), full_path);
 }
 
-static void clear_midx_files_ext(struct repository *r, const char *ext,
+static void clear_midx_files_ext(const char *object_dir, const char *ext,
 				 unsigned char *keep_hash)
 {
 	struct clear_midx_data data;
@@ -1146,7 +1146,7 @@ static void clear_midx_files_ext(struct repository *r, const char *ext,
 				    hash_to_hex(keep_hash), ext);
 	data.ext = ext;
 
-	for_each_file_in_pack_dir(r->objects->odb->path,
+	for_each_file_in_pack_dir(object_dir,
 				  clear_midx_file_ext,
 				  &data);
 
@@ -1165,7 +1165,7 @@ void clear_midx_file(struct repository *r)
 	if (remove_path(midx))
 		die(_("failed to clear multi-pack-index at %s"), midx);
 
-	clear_midx_files_ext(r, ".rev", NULL);
+	clear_midx_files_ext(r->objects->odb->path, ".rev", NULL);
 
 	free(midx);
 }
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c31b..3b6331f64113 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -201,6 +201,25 @@ test_expect_success 'write midx with twelve packs' '
 
 compare_results_with_midx "twelve packs"
 
+test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' '
+	git init objdir-test-repo &&
+	test_when_finished "rm -rf objdir-test-repo" &&
+	(
+		cd objdir-test-repo &&
+		test_commit base &&
+		git repack -d
+	) &&
+	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
+	touch $rev &&
+	(
+		base="$(pwd)" &&
+		cd / && # run outside any git repo, including git itself
+		git multi-pack-index --object-dir="$base/objdir-test-repo/$objdir" write
+	) &&
+	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
+	test_path_is_missing $rev
+'
+
 test_expect_success 'warn on improper hash version' '
 	git init --object-format=sha1 sha1 &&
 	(
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23  9:40 [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir Johannes Berg
@ 2021-08-23 16:06 ` Jeff King
  2021-08-23 17:05   ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2021-08-23 16:06 UTC (permalink / raw)
  To: Johannes Berg; +Cc: git, Taylor Blau

On Mon, Aug 23, 2021 at 11:40:49AM +0200, Johannes Berg wrote:

> If using --object-dir to point into a repo while the current
> working dir is outside, such as
> 
>   git init /repo
>   git -C /repo ... # add some objects
>   cd /non-repo
>   git multi-pack-index --object-dir /repo/.git/objects/ write
> 
> the binary will segfault trying to access the object-dir via
> the repo it found, but that's not fully initialized. Fix it
> to use the object_dir properly to clean up the *.rev files,
> this avoids the crash and cleans up the *.rev files for the
> now rewritten multi-pack-index properly.

I'm not entirely convinced that writing a midx when not "inside" a repo
is something that we want to support. But if we do, then...

> Due to running inside git's tree, even with TEST_NO_CREATE_REPO=t
> I cannot reproduce the segfault in a test without the "cd /", so
> I've kept that. Yes, the test caught in that case that the *.rev
> file wasn't cleaned up (due to being initialized to the wrong git
> repo [git's] and cleaning up there!), but I wanted to test the
> segfault too.

...there's a helper in the test suite for doing this kind of "not in a
repo" test:

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3b6331f641..3981bf96d0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -211,11 +211,8 @@ test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' '
 	) &&
 	rev="objdir-test-repo/$objdir/pack/multi-pack-index-abcdef123456.rev" &&
 	touch $rev &&
-	(
-		base="$(pwd)" &&
-		cd / && # run outside any git repo, including git itself
-		git multi-pack-index --object-dir="$base/objdir-test-repo/$objdir" write
-	) &&
+	nongit git multi-pack-index \
+		--object-dir="$PWD/objdir-test-repo/$objdir" write &&
 	test_path_is_file objdir-test-repo/$objdir/pack/multi-pack-index &&
 	test_path_is_missing $rev
 '

-Peff

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 16:06 ` Jeff King
@ 2021-08-23 17:05   ` Johannes Berg
  2021-08-23 17:09     ` Taylor Blau
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2021-08-23 17:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Taylor Blau

On Mon, 2021-08-23 at 12:06 -0400, Jeff King wrote:
> On Mon, Aug 23, 2021 at 11:40:49AM +0200, Johannes Berg wrote:
> 
> > If using --object-dir to point into a repo while the current
> > working dir is outside, such as
> > 
> >   git init /repo
> >   git -C /repo ... # add some objects
> >   cd /non-repo
> >   git multi-pack-index --object-dir /repo/.git/objects/ write
> > 
> > the binary will segfault trying to access the object-dir via
> > the repo it found, but that's not fully initialized. Fix it
> > to use the object_dir properly to clean up the *.rev files,
> > this avoids the crash and cleans up the *.rev files for the
> > now rewritten multi-pack-index properly.
> 
> I'm not entirely convinced that writing a midx when not "inside" a repo
> is something that we want to support. But if we do, then...

Seemed like that was the point of --object-dir?

johannes


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:05   ` Johannes Berg
@ 2021-08-23 17:09     ` Taylor Blau
  2021-08-23 17:56       ` Jeff King
                         ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Taylor Blau @ 2021-08-23 17:09 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Jeff King, git

On Mon, Aug 23, 2021 at 07:05:31PM +0200, Johannes Berg wrote:
> On Mon, 2021-08-23 at 12:06 -0400, Jeff King wrote:
> > I'm not entirely convinced that writing a midx when not "inside" a repo
> > is something that we want to support. But if we do, then...
>
> Seemed like that was the point of --object-dir?

Stolee (cc'd) would know more as the original author, but as I recall
the point of `--object-dir` was to be able to write a midx in
directories which were acting as Git repositories, but didn't contain a
`.git` directory.

It's kind of a strange use-case, but I recall that it was important at
the time. Maybe he could shed more light on why. (Either way, we're
stuck with it ;)).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:09     ` Taylor Blau
@ 2021-08-23 17:56       ` Jeff King
  2021-08-23 17:58       ` Junio C Hamano
  2021-08-23 18:00       ` Derrick Stolee
  2 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2021-08-23 17:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Berg, git

On Mon, Aug 23, 2021 at 01:09:22PM -0400, Taylor Blau wrote:

> On Mon, Aug 23, 2021 at 07:05:31PM +0200, Johannes Berg wrote:
> > On Mon, 2021-08-23 at 12:06 -0400, Jeff King wrote:
> > > I'm not entirely convinced that writing a midx when not "inside" a repo
> > > is something that we want to support. But if we do, then...
> >
> > Seemed like that was the point of --object-dir?
> 
> Stolee (cc'd) would know more as the original author, but as I recall
> the point of `--object-dir` was to be able to write a midx in
> directories which were acting as Git repositories, but didn't contain a
> `.git` directory.
> 
> It's kind of a strange use-case, but I recall that it was important at
> the time. Maybe he could shed more light on why. (Either way, we're
> stuck with it ;)).

And the point was that those directories would also be alternates of the
current repo (and that there is a current repo). That's one of the
things that your midx-bitmap series tightens.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:09     ` Taylor Blau
  2021-08-23 17:56       ` Jeff King
@ 2021-08-23 17:58       ` Junio C Hamano
  2021-08-23 18:00       ` Derrick Stolee
  2 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2021-08-23 17:58 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Berg, Jeff King, git

Taylor Blau <me@ttaylorr.com> writes:

> On Mon, Aug 23, 2021 at 07:05:31PM +0200, Johannes Berg wrote:
>> On Mon, 2021-08-23 at 12:06 -0400, Jeff King wrote:
>> > I'm not entirely convinced that writing a midx when not "inside" a repo
>> > is something that we want to support. But if we do, then...
>>
>> Seemed like that was the point of --object-dir?
>
> Stolee (cc'd) would know more as the original author, but as I recall
> the point of `--object-dir` was to be able to write a midx in
> directories which were acting as Git repositories, but didn't contain a
> `.git` directory.
>
> It's kind of a strange use-case, but I recall that it was important at
> the time. Maybe he could shed more light on why. (Either way, we're
> stuck with it ;)).

It does sound strange.  "git -C $there multi-pack-index write"
would have felt more natural.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir
  2021-08-23 17:09     ` Taylor Blau
  2021-08-23 17:56       ` Jeff King
  2021-08-23 17:58       ` Junio C Hamano
@ 2021-08-23 18:00       ` Derrick Stolee
  2 siblings, 0 replies; 7+ messages in thread
From: Derrick Stolee @ 2021-08-23 18:00 UTC (permalink / raw)
  To: Taylor Blau, Johannes Berg; +Cc: Jeff King, git

On 8/23/2021 1:09 PM, Taylor Blau wrote:
> On Mon, Aug 23, 2021 at 07:05:31PM +0200, Johannes Berg wrote:
>> On Mon, 2021-08-23 at 12:06 -0400, Jeff King wrote:
>>> I'm not entirely convinced that writing a midx when not "inside" a repo
>>> is something that we want to support. But if we do, then...
>>
>> Seemed like that was the point of --object-dir?
> 
> Stolee (cc'd) would know more as the original author, but as I recall
> the point of `--object-dir` was to be able to write a midx in
> directories which were acting as Git repositories, but didn't contain a
> `.git` directory.
> 
> It's kind of a strange use-case, but I recall that it was important at
> the time. Maybe he could shed more light on why. (Either way, we're
> stuck with it ;)).

Yes, the point was for how VFS for Git (and now Scalar) built the
"shared object cache" directory. This is a directory that acts as
an alternate for VFS for Git and Scalar clones so objects downloaded
by one enlistment are immediately available in another.

When the multi-pack-index was created, it was designed to handle
this shared object cache through the --object-dir parameter. There
are custom patches in microsoft/git that shoehorn the option into
background maintenance via a config value.

If I were to redesign the feature for use by Git, then I would make
the clones create a bare copy of the repository (if it doesn't already
exist) and then create a worktree of that bare repo. The key is to
allow users to delete individual worktrees without losing the object
data.

In summary, there is a reason for its design like this, although its
due to an external tool. But we are using it a lot, so I'd prefer that
it stay.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-08-23 18:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-23  9:40 [PATCH v2] multi-pack-index: fix *.rev cleanups with --object-dir Johannes Berg
2021-08-23 16:06 ` Jeff King
2021-08-23 17:05   ` Johannes Berg
2021-08-23 17:09     ` Taylor Blau
2021-08-23 17:56       ` Jeff King
2021-08-23 17:58       ` Junio C Hamano
2021-08-23 18:00       ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).