git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, derrickstolee@github.com, gregory.szorc@gmail.com
Subject: [PATCH 3/7] midx.c: prevent `expire` from removing the cruft pack
Date: Mon, 19 Sep 2022 21:55:45 -0400	[thread overview]
Message-ID: <3ae9903d2df491e291ab975c56ec78aa13d95655.1663638929.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1663638929.git.me@ttaylorr.com>

The `expire` sub-command unlinks any packs that are (a) contained in the
MIDX, but (b) have no objects referenced by the MIDX.

This sub-command ignores `.keep` packs, which remain on-disk even if
they have no objects referenced by the MIDX. Cruft packs, however,
aren't given the same treatment: if none of the objects contained in the
cruft pack are selected from the cruft pack by the MIDX, then the cruft
pack is eligible to be expired.

This is less than desireable, since the cruft pack has important
metadata about the individual object mtimes, which is useful to
determine how quickly an object should age out of the repository when
pruning.

Ordinarily, we wouldn't expect the contents of a cruft pack to
duplicated across non-cruft packs (and we'd expect to see the MIDX
select all cruft objects from other sources even less often). But
nonetheless, it is still possible to trick the `expire` sub-command into
removing the `.mtimes` file in this circumstance.

Teach the `expire` sub-command to ignore cruft packs in the same manner
as it does `.keep` packs, in order to keep their metadata around, even
when they are unreferenced by the MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-multi-pack-index.txt |  4 ++--
 midx.c                                 |  2 +-
 t/t5319-multi-pack-index.sh            | 30 ++++++++++++++++++++++++++
 3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index 11e6dc53e3..3696506eb3 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -72,8 +72,8 @@ verify::
 expire::
 	Delete the pack-files that are tracked by the MIDX file, but
 	have no objects referenced by the MIDX (with the exception of
-	`.keep` packs). Rewrite the MIDX file afterward to remove all
-	references to these pack-files.
+	`.keep` packs and cruft packs). Rewrite the MIDX file afterward
+	to remove all references to these pack-files.
 
 repack::
 	Create a new pack-file containing objects in small pack-files
diff --git a/midx.c b/midx.c
index c27d0e5f15..bff5b99933 100644
--- a/midx.c
+++ b/midx.c
@@ -1839,7 +1839,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 		if (prepare_midx_pack(r, m, i))
 			continue;
 
-		if (m->packs[i]->pack_keep)
+		if (m->packs[i]->pack_keep || m->packs[i]->is_cruft)
 			continue;
 
 		pack_name = xstrdup(m->packs[i]->pack_name);
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index afbe93f162..2d51b09680 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -847,6 +847,36 @@ test_expect_success 'expire respects .keep files' '
 	)
 '
 
+test_expect_success 'expiring unreferenced cruft pack retains pack' '
+	git init repo &&
+	test_when_finished "rm -fr repo" &&
+	(
+		cd repo &&
+
+		test_commit base &&
+		test_commit --no-tag unreachable &&
+		unreachable=$(git rev-parse HEAD) &&
+
+		git reset --hard base &&
+		git reflog expire --all --expire=all &&
+		git repack --cruft -d &&
+		mtimes="$(ls $objdir/pack/pack-*.mtimes)" &&
+
+		echo "base..$unreachable" >in &&
+		pack="$(git pack-objects --revs --delta-base-offset \
+			$objdir/pack/pack <in)" &&
+
+		# Preferring the contents of "$pack" will leave the
+		# cruft pack unreferenced (ie., none of the objects
+		# contained in the cruft pack will have their MIDX copy
+		# selected from the cruft pack).
+		git multi-pack-index write --preferred-pack="pack-$pack.pack" &&
+		git multi-pack-index expire &&
+
+		test_path_is_file "$mtimes"
+	)
+'
+
 test_expect_success 'repack --batch-size=0 repacks everything' '
 	cp -r dup dup2 &&
 	(
-- 
2.37.0.1.g1379af2e9d


  parent reply	other threads:[~2022-09-20  1:56 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-20  1:55 [PATCH 0/7] midx: ignore cruft pack with `repack`, `expire` Taylor Blau
2022-09-20  1:55 ` [PATCH 1/7] Documentation/git-multi-pack-index.txt: fix typo Taylor Blau
2022-09-20  1:55 ` [PATCH 2/7] Documentation/git-multi-pack-index.txt: clarify expire behavior Taylor Blau
2022-09-20  1:55 ` Taylor Blau [this message]
2022-09-22 14:08   ` [PATCH 3/7] midx.c: prevent `expire` from removing the cruft pack Derrick Stolee
2022-09-20  1:55 ` [PATCH 4/7] midx.c: avoid cruft packs with `repack --batch-size=0` Taylor Blau
2022-09-20  1:55 ` [PATCH 5/7] midx.c: replace `xcalloc()` with `CALLOC_ARRAY()` Taylor Blau
2022-09-20  1:55 ` [PATCH 6/7] midx.c: remove unnecessary loop condition Taylor Blau
2022-09-20  1:55 ` [PATCH 7/7] midx.c: avoid cruft packs with non-zero `repack --batch-size` Taylor Blau
2022-09-20 14:39 ` [PATCH 0/7] midx: ignore cruft pack with `repack`, `expire` Christian Couder
2022-09-22 14:14 ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3ae9903d2df491e291ab975c56ec78aa13d95655.1663638929.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=gregory.szorc@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).