From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com
Subject: [PATCH v4 0/8] repack: support repacking into a geometric sequence
Date: Mon, 22 Feb 2021 21:24:59 -0500
Message-ID: <cover.1614047097.git.me@ttaylorr.com>
In-Reply-To: <cover.1611098616.git.me@ttaylorr.com>
Here's a very lightly modified version of v3 of mine and Peff's series
to add a new 'git repack --geometric' mode. Almost nothing has changed
since last time, with the exception of:
  - Packs listed over standard input to 'git pack-objects --stdin-packs'
    are sorted in descending mtime order (and objects are strung
    together in pack order as before) so that objects are laid out
    roughly newest-to-oldest in the resulting pack.
  - Swapped the order of two paragraphs in patch 5 to make the perf
    results clearer.
  - Mention '--unpacked' specifically in the documentation for 'git
    repack --geometric'.
  - Typo fixes.
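For anyone skimming, the first item boils down to flipping the sign of
the mtime comparator. Here is a minimal standalone sketch of that idea,
not the code from the patch: 'struct pack_info' and
'sort_packs_newest_first()' are hypothetical names made up for
illustration, whereas the series itself compares 'struct packed_git'
entries stored in a string_list.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the pack metadata; not a git type. */
struct pack_info {
	const char *name;
	time_t mtime;
};

/*
 * Order packs by descending mtime: a pack with an older mtime sorts
 * after a pack with a newer one.
 */
int pack_mtime_cmp(const void *_a, const void *_b)
{
	const struct pack_info *a = _a;
	const struct pack_info *b = _b;

	if (a->mtime < b->mtime)
		return 1;
	else if (b->mtime < a->mtime)
		return -1;
	else
		return 0;
}

/* Hypothetical helper: sort an array of packs newest-first in place. */
void sort_packs_newest_first(struct pack_info *packs, size_t nr)
{
	qsort(packs, nr, sizeof(*packs), pack_mtime_cmp);
}
```

Comparing with '<' rather than subtracting the two mtimes avoids
overflow and truncation when the difference does not fit in an int.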
Range-diff is below. It would be good to start merging this down since
we have a release candidate coming up soon, and I'd rather focus future
reviewer efforts on the multi-pack reverse index and bitmaps series
instead of this one.
Jeff King (4):
p5303: add missing &&-chains
p5303: measure time to repack with keep
builtin/pack-objects.c: rewrite honor-pack-keep logic
packfile: add kept-pack cache for find_kept_pack_entry()
Taylor Blau (4):
packfile: introduce 'find_kept_pack_entry()'
revision: learn '--no-kept-objects'
builtin/pack-objects.c: add '--stdin-packs' option
builtin/repack.c: add '--geometric' option
Documentation/git-pack-objects.txt | 10 +
Documentation/git-repack.txt | 23 ++
builtin/pack-objects.c | 333 ++++++++++++++++++++++++-----
builtin/repack.c | 187 +++++++++++++++-
object-store.h | 5 +
packfile.c | 67 ++++++
packfile.h | 5 +
revision.c | 15 ++
revision.h | 4 +
t/perf/p5303-many-packs.sh | 36 +++-
t/t5300-pack-object.sh | 97 +++++++++
t/t6114-keep-packs.sh | 69 ++++++
t/t7703-repack-geometric.sh | 137 ++++++++++++
13 files changed, 926 insertions(+), 62 deletions(-)
create mode 100755 t/t6114-keep-packs.sh
create mode 100755 t/t7703-repack-geometric.sh
Range-diff against v3:
1: aa94edf39b = 1: bb674e5119 packfile: introduce 'find_kept_pack_entry()'
2: 82f6b45463 = 2: c85a915597 revision: learn '--no-kept-objects'
3: 033e4e3f67 ! 3: 649cf9020b builtin/pack-objects.c: add '--stdin-packs' option
@@ builtin/pack-objects.c: static int git_pack_config(const char *k, const char *v,
+ struct packed_git *a = ((const struct string_list_item*)_a)->util;
+ struct packed_git *b = ((const struct string_list_item*)_b)->util;
+
++ /*
++ * order packs by descending mtime so that objects are laid out
++ * roughly as newest-to-oldest
++ */
+ if (a->mtime < b->mtime)
-+ return -1;
-+ else if (b->mtime < a->mtime)
+ return 1;
++ else if (b->mtime < a->mtime)
++ return -1;
+ else
+ return 0;
+}
4: f9a5faf773 = 4: 6de9f0c52b p5303: add missing &&-chains
5: 181c104a03 ! 5: 94e4f3ee3a p5303: measure time to repack with keep
@@ Metadata
## Commit message ##
p5303: measure time to repack with keep
- Add two new tests to measure repack performance. Both test split the
+ Add two new tests to measure repack performance. Both tests split the
repository into synthetic "pushes", and then leave the remaining objects
in a big base pack.
@@ Commit message
5303.17: repack (1000) 216.87(490.79+14.57)
5303.18: repack with kept (1000) 665.63(938.87+15.76)
- Likewise, the scaling is pretty extreme on --stdin-packs:
-
- 5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00)
- 5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24)
- 5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10)
-
That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
@@ Commit message
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).
+ Likewise, the scaling is pretty extreme on --stdin-packs (but each
+ subsequent test is also being asked to do more work):
+
+ 5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00)
+ 5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24)
+ 5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10)
+
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
6: 67af143fd1 = 6: a116587fb2 builtin/pack-objects.c: rewrite honor-pack-keep logic
7: e9e04b95e7 = 7: db9f07ec1a packfile: add kept-pack cache for find_kept_pack_entry()
8: bd492ec142 ! 8: 51f57d5da2 builtin/repack.c: add '--geometric' option
@@ Documentation/git-repack.txt: depth is 4095.
+packs determined to need to be combined in order to restore a geometric
+progression.
++
-+Loose objects are implicitly included in this "roll-up", without respect
-+to their reachability. This is subject to change in the future. This
-+option (implying a drastically different repack mode) is not guarenteed
-+to work with all other combinations of option to `git repack`).
++When `--unpacked` is specified, loose objects are implicitly included in
++this "roll-up", without respect to their reachability. This is subject
++to change in the future. This option (implying a drastically different
++repack mode) is not guaranteed to work with all other combinations of
++option to `git repack`).
+
Configuration
-------------
--
2.30.0.667.g81c0cbc6fd