git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com
Subject: [PATCH v4 0/8] repack: support repacking into a geometric sequence
Date: Mon, 22 Feb 2021 21:24:59 -0500	[thread overview]
Message-ID: <cover.1614047097.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1611098616.git.me@ttaylorr.com>

Here's a very lightly modified version on v3 of mine and Peff's series
to add a new 'git repack --geometric' mode. Almost nothing has changed
since last time, with the exception of:

  - Packs listed over standard input to 'git pack-objects --stdin-packs'
    are sorted in descending mtime order (and objects are strung
    together in pack order as before) so that objects are laid out
    roughly newest-to-oldest in the resulting pack.

  - Swapped the order of two paragraphs in patch 5 to make the perf
    results clearer.

  - Mention '--unpacked' specifically in the documentation for 'git
    repack --geometric'.

  - Typo fixes.

Range-diff is below. It would be good to start merging this down since
we have a release candidate coming up soon, and I'd rather focus future
reviewer efforts on the multi-pack reverse index and bitmaps series
instead of this one.

Jeff King (4):
  p5303: add missing &&-chains
  p5303: measure time to repack with keep
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  packfile: add kept-pack cache for find_kept_pack_entry()

Taylor Blau (4):
  packfile: introduce 'find_kept_pack_entry()'
  revision: learn '--no-kept-objects'
  builtin/pack-objects.c: add '--stdin-packs' option
  builtin/repack.c: add '--geometric' option

 Documentation/git-pack-objects.txt |  10 +
 Documentation/git-repack.txt       |  23 ++
 builtin/pack-objects.c             | 333 ++++++++++++++++++++++++-----
 builtin/repack.c                   | 187 +++++++++++++++-
 object-store.h                     |   5 +
 packfile.c                         |  67 ++++++
 packfile.h                         |   5 +
 revision.c                         |  15 ++
 revision.h                         |   4 +
 t/perf/p5303-many-packs.sh         |  36 +++-
 t/t5300-pack-object.sh             |  97 +++++++++
 t/t6114-keep-packs.sh              |  69 ++++++
 t/t7703-repack-geometric.sh        | 137 ++++++++++++
 13 files changed, 926 insertions(+), 62 deletions(-)
 create mode 100755 t/t6114-keep-packs.sh
 create mode 100755 t/t7703-repack-geometric.sh

Range-diff against v3:
1:  aa94edf39b = 1:  bb674e5119 packfile: introduce 'find_kept_pack_entry()'
2:  82f6b45463 = 2:  c85a915597 revision: learn '--no-kept-objects'
3:  033e4e3f67 ! 3:  649cf9020b builtin/pack-objects.c: add '--stdin-packs' option
    @@ builtin/pack-objects.c: static int git_pack_config(const char *k, const char *v,
     +	struct packed_git *a = ((const struct string_list_item*)_a)->util;
     +	struct packed_git *b = ((const struct string_list_item*)_b)->util;
     +
    ++	/*
    ++	 * order packs by descending mtime so that objects are laid out
    ++	 * roughly as newest-to-oldest
    ++	 */
     +	if (a->mtime < b->mtime)
    -+		return -1;
    -+	else if (b->mtime < a->mtime)
     +		return 1;
    ++	else if (b->mtime < a->mtime)
    ++		return -1;
     +	else
     +		return 0;
     +}
4:  f9a5faf773 = 4:  6de9f0c52b p5303: add missing &&-chains
5:  181c104a03 ! 5:  94e4f3ee3a p5303: measure time to repack with keep
    @@ Metadata
      ## Commit message ##
         p5303: measure time to repack with keep
     
    -    Add two new tests to measure repack performance. Both test split the
    +    Add two new tests to measure repack performance. Both tests split the
         repository into synthetic "pushes", and then leave the remaining objects
         in a big base pack.
     
    @@ Commit message
           5303.17: repack (1000)                      216.87(490.79+14.57)
           5303.18: repack with kept (1000)            665.63(938.87+15.76)
     
    -    Likewise, the scaling is pretty extreme on --stdin-packs:
    -
    -      5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
    -      5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
    -      5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)
    -
         That's because the code paths around handling .keep files are known to
         scale badly; they look in every single pack file to find each object.
         Our solution to that was to notice that most repos don't have keep
    @@ Commit message
         single .keep, that part of pack-objects slows down again (even if we
         have fewer objects total to look at).
     
    +    Likewise, the scaling is pretty extreme on --stdin-packs (but each
    +    subsequent test is also being asked to do more work):
    +
    +      5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
    +      5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
    +      5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)
    +
         Signed-off-by: Jeff King <peff@peff.net>
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
6:  67af143fd1 = 6:  a116587fb2 builtin/pack-objects.c: rewrite honor-pack-keep logic
7:  e9e04b95e7 = 7:  db9f07ec1a packfile: add kept-pack cache for find_kept_pack_entry()
8:  bd492ec142 ! 8:  51f57d5da2 builtin/repack.c: add '--geometric' option
    @@ Documentation/git-repack.txt: depth is 4095.
     +packs determined to need to be combined in order to restore a geometric
     +progression.
     ++
    -+Loose objects are implicitly included in this "roll-up", without respect
    -+to their reachability. This is subject to change in the future. This
    -+option (implying a drastically different repack mode) is not guarenteed
    -+to work with all other combinations of option to `git repack`).
    ++When `--unpacked` is specified, loose objects are implicitly included in
    ++this "roll-up", without respect to their reachability. This is subject
    ++to change in the future. This option (implying a drastically different
    ++repack mode) is not guaranteed to work with all other combinations of
    ++option to `git repack`).
     +
      Configuration
      -------------
-- 
2.30.0.667.g81c0cbc6fd

  parent reply	other threads:[~2021-02-23  2:26 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-19 23:23 [PATCH 00/10] repack: support repacking into a geometric sequence Taylor Blau
2021-01-19 23:24 ` [PATCH 01/10] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-01-20 13:40   ` Derrick Stolee
2021-01-20 14:38     ` Taylor Blau
2021-01-29  2:33   ` Junio C Hamano
2021-01-29 18:38     ` Taylor Blau
2021-01-29 19:31     ` Jeff King
2021-01-29 20:20       ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 02/10] revision: learn '--no-kept-objects' Taylor Blau
2021-01-29  3:10   ` Junio C Hamano
2021-01-29 19:13     ` Taylor Blau
2021-01-19 23:24 ` [PATCH 03/10] builtin/pack-objects.c: learn '--assume-kept-packs-closed' Taylor Blau
2021-01-29  3:21   ` Junio C Hamano
2021-01-29 19:19     ` Jeff King
2021-01-29 20:01       ` Taylor Blau
2021-01-29 20:25         ` Jeff King
2021-01-29 22:10           ` Taylor Blau
2021-01-29 22:57             ` Jeff King
2021-01-29 23:03             ` Junio C Hamano
2021-01-29 23:28               ` Taylor Blau
2021-02-02  3:04                 ` Taylor Blau
2021-01-29 23:31               ` Jeff King
2021-01-29 22:13           ` Junio C Hamano
2021-01-29 20:30       ` Junio C Hamano
2021-01-29 22:43         ` Jeff King
2021-01-29 22:53           ` Taylor Blau
2021-01-29 23:00             ` Jeff King
2021-01-29 23:10             ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 04/10] p5303: add missing &&-chains Taylor Blau
2021-01-19 23:24 ` [PATCH 05/10] p5303: measure time to repack with keep Taylor Blau
2021-01-29  3:40   ` Junio C Hamano
2021-01-29 19:32     ` Jeff King
2021-01-29 20:04       ` [PATCH] p5303: avoid sed GNU-ism Jeff King
2021-01-29 20:19         ` Eric Sunshine
2021-01-29 20:27           ` Jeff King
2021-01-29 20:36             ` Eric Sunshine
2021-01-29 22:11               ` Taylor Blau
2021-01-29 20:38       ` [PATCH 05/10] p5303: measure time to repack with keep Junio C Hamano
2021-01-29 22:10         ` Jeff King
2021-01-29 23:12           ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 06/10] pack-objects: rewrite honor-pack-keep logic Taylor Blau
2021-01-19 23:24 ` [PATCH 07/10] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-01-19 23:24 ` [PATCH 08/10] builtin/pack-objects.c: teach '--keep-pack-stdin' Taylor Blau
2021-01-19 23:24 ` [PATCH 09/10] builtin/repack.c: extract loose object handling Taylor Blau
2021-01-20 13:59   ` Derrick Stolee
2021-01-20 14:34     ` Taylor Blau
2021-01-20 15:51       ` Derrick Stolee
2021-01-21  3:45     ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 10/10] builtin/repack.c: add '--geometric' option Taylor Blau
2021-01-20 14:05 ` [PATCH 00/10] repack: support repacking into a geometric sequence Derrick Stolee
2021-02-04  3:58 ` [PATCH v2 0/8] " Taylor Blau
2021-02-04  3:58   ` [PATCH v2 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-16 21:42     ` Jeff King
2021-02-16 21:48       ` Taylor Blau
2021-02-04  3:58   ` [PATCH v2 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-16 23:17     ` Jeff King
2021-02-17 18:35       ` Taylor Blau
2021-02-04  3:59   ` [PATCH v2 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-16 23:46     ` Jeff King
2021-02-17 18:59       ` Taylor Blau
2021-02-17 19:21         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-04  3:59   ` [PATCH v2 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-16 23:58     ` Jeff King
2021-02-17  0:02       ` Jeff King
2021-02-17 19:13       ` Taylor Blau
2021-02-17 19:25         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-17 16:05     ` Jeff King
2021-02-17 19:23       ` Taylor Blau
2021-02-17 19:29         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-17 17:11     ` Jeff King
2021-02-17 19:54       ` Taylor Blau
2021-02-17 20:25         ` Jeff King
2021-02-17 20:29           ` Taylor Blau
2021-02-17 21:43             ` Jeff King
2021-02-04  3:59   ` [PATCH v2 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-17 18:17     ` Jeff King
2021-02-17 20:01       ` Taylor Blau
2021-02-17  0:01   ` [PATCH v2 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-17 18:18     ` Jeff King
2021-02-18  3:14 ` [PATCH v3 " Taylor Blau
2021-02-18  3:14   ` [PATCH v3 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-18  3:14   ` [PATCH v3 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-18  3:14   ` [PATCH v3 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-18  3:14   ` [PATCH v3 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-18  3:14   ` [PATCH v3 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-18  3:14   ` [PATCH v3 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-23  0:31   ` [PATCH v3 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  1:06     ` Taylor Blau
2021-02-23  1:42       ` Jeff King
2021-02-23  2:24 ` Taylor Blau [this message]
2021-02-23  2:25   ` [PATCH v4 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-23  8:07     ` Junio C Hamano
2021-02-23 18:51       ` Jeff King
2021-02-23  2:25   ` [PATCH v4 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-23  2:25   ` [PATCH v4 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-23  2:25   ` [PATCH v4 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-23  2:25   ` [PATCH v4 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-23  2:25   ` [PATCH v4 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-24 23:19     ` Junio C Hamano
2021-02-24 23:43       ` Junio C Hamano
2021-03-04 21:40         ` Taylor Blau
2021-03-04 21:55       ` Taylor Blau
2021-02-23  3:39   ` [PATCH v4 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  7:43   ` Junio C Hamano
2021-02-23 18:44     ` Jeff King
2021-02-23 19:54       ` Martin Fick
2021-02-23 20:06         ` Taylor Blau
2021-02-23 21:57           ` Martin Fick
2021-02-23 20:15         ` Jeff King
2021-02-23 21:41           ` Martin Fick
2021-02-23 21:53             ` Jeff King
2021-02-24 18:13               ` Martin Fick
2021-02-26  6:23                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1614047097.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).