From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
    Jonathan Tan <jonathantanmy@google.com>
Subject: [PATCH 0/2] repack: implement `--cruft-max-size`
Date: Thu, 7 Sep 2023 17:51:58 -0400
Message-ID: <cover.1694123506.git.me@ttaylorr.com>

(These patches should be applied on top of a merge with
tb/repack-existing-packs-cleanup and tb/multi-cruft-pack.)

This series aims to give users more robust tools for managing
repositories with a large number of unreachable objects by splitting
those objects across separate cruft packs, via a new option
`--cruft-max-size`, like so:

    $ git.compile repack -d --cruft --cruft-max-size=10M
    [...]
    Enumerating cruft objects: 617483, done.
    Counting objects: 100% (83791/83791), done.
    Delta compression using up to 20 threads
    Compressing objects: 100% (59696/59696), done.
    Writing objects: 100% (83791/83791), done.
    Total 83791 (delta 19251), reused 82502 (delta 19148), pack-reused 0

    $ ls -la .git/objects/pack/pack-*.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 179144 Sep 7 17:46 .git/objects/pack/pack-1a95260d26f2897abfd2d54f1d58f535acb81d23.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 452 Sep 7 17:46 .git/objects/pack/pack-5fde8701ae0f2e5553f1fa33de05faf12f94c07f.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 155720 Sep 7 17:46 .git/objects/pack/pack-91f9e66921e0ebe1b5e35d34842551468cecdc28.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 56 Sep 7 17:46 .git/objects/pack/pack-95fe626743207b177b45f32b60fdc313e525ea60.mtimes

The details are explained in the second patch, but the gist is that we
combine cruft packs until they reach a certain threshold (as specified
by `--cruft-max-size`) and then begin a new "generation" of cruft
packs. That younger generation grows until it too reaches the
configured threshold, at which point it becomes "frozen", and any new
unreachable objects are written into yet another generation of cruft
packs.

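As a rough illustration of the generational scheme described above,
here is a small model in Python. It is only a sketch and not the logic
in builtin/repack.c: the function name `add_cruft`, the representation
of packs as a list of sizes, and the rule that a generation is frozen as
soon as the next batch would push it past the threshold are all
assumptions made for the example. It also ignores pruning and the
splitting of a single oversized batch.

```python
def add_cruft(generations, new_bytes, max_size):
    """Return cruft pack sizes after adding `new_bytes` of unreachable objects.

    `generations` lists existing cruft pack sizes, oldest first.
    Hypothetical model: only the youngest generation is ever rewritten,
    and only while the combined result stays within `max_size`.
    """
    packs = list(generations)
    if packs and packs[-1] + new_bytes <= max_size:
        # Youngest generation still has room: combine it with the new objects.
        packs[-1] += new_bytes
    else:
        # No generation yet, or the youngest is full ("frozen"): start a new one.
        packs.append(new_bytes)
    return packs


packs = []
for incoming in [4, 4, 4, 4, 4]:  # batch sizes in MiB, chosen for readability
    packs = add_cruft(packs, incoming, max_size=10)
    print(packs)
```

In this model the older entries in the list never change once a newer
generation has been started, which is the "frozen" behavior the series
is after.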
The goal of this series is to reduce I/O churn in repositories that
have a large number of unreachable objects, that rarely prune them, or
both.

Instead of having to rewrite a cruft pack containing every unreachable
object in the repository, we only have to rewrite a cruft pack until it
reaches the given threshold. After that point it is effectively kept
(i.e., it behaves as if the cruft pack had a ".keep" file tied to it,
provided that the threshold is held constant).

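To put a rough number on the I/O argument, the following
back-of-the-envelope sketch counts how many bytes get written to cruft
packs over a series of repacks, with and without a size threshold. The
function name `total_cruft_bytes_written` and the batch sizes are made
up for illustration, and the model assumes that no pruning happens,
that sizes simply add up (no delta effects), and that rewriting a pack
costs exactly its size.

```python
def total_cruft_bytes_written(batches, max_size=None):
    """Sum the bytes written to cruft packs across successive repacks.

    With max_size=None there is a single cruft pack that is rewritten in
    full on every repack. With a threshold, only the youngest generation
    is rewritten; full generations are left alone (hypothetical model).
    """
    youngest = 0  # size of the cruft pack that is still being rewritten
    written = 0
    for new_bytes in batches:
        if max_size is not None and youngest + new_bytes > max_size:
            # Youngest generation is full: freeze it and start a new one.
            youngest = 0
        youngest += new_bytes
        written += youngest  # rewriting a pack writes all of its contents
    return written


batches = [4] * 10  # ten repacks, each finding 4 MiB of new cruft
print(total_cruft_bytes_written(batches))               # 220
print(total_cruft_bytes_written(batches, max_size=10))  # 60
```

Under these assumptions the uncapped case grows quadratically with the
number of repacks, while the capped case grows linearly.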
Thanks in advance for your review!

Taylor Blau (2):
  t7700: split cruft-related tests to t7704
  builtin/repack.c: implement support for `--cruft-max-size`

 Documentation/config/gc.txt  |   6 +
 Documentation/git-gc.txt     |   7 +
 Documentation/git-repack.txt |   9 +
 builtin/gc.c                 |   8 +
 builtin/repack.c             | 133 +++++++++++--
 t/t6500-gc.sh                |  27 +++
 t/t7700-repack.sh            | 121 -----------
 t/t7704-repack-cruft.sh      | 375 +++++++++++++++++++++++++++++++++++
 8 files changed, 553 insertions(+), 133 deletions(-)
 create mode 100755 t/t7704-repack-cruft.sh

--
2.42.0.138.g7e4e42e1aa