git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 00/10] gc: enable cruft packs by default
Date: Tue, 18 Apr 2023 16:40:29 -0400	[thread overview]
Message-ID: <cover.1681850424.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1681764848.git.me@ttaylorr.com>

Here is a very tiny reroll of my series to graduate `gc.cruftPacks` out
of `feature.experimental` and enables it by default.

A complete summary of the topic is available in the original cover
letter[1], and the changes since last time are limited to test clean-up,
patch reorganization, and some touch-ups on the patch messages
themselves.

As always, a range-diff is below for convenience.

Thanks for all of the review thus far, and in advance for any review on
this round, too.

[1]: https://lore.kernel.org/git/cover.1681764848.git.me@ttaylorr.com/

Taylor Blau (10):
  pack-write.c: plug a leak in stage_tmp_packfiles()
  builtin/repack.c: fix incorrect reference to '-C'
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6500-gc.sh: add additional test cases
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  repository.h: drop unused `gc_cruft_packs`

 Documentation/config/feature.txt |   3 -
 Documentation/config/gc.txt      |  12 +--
 Documentation/git-gc.txt         |  12 +--
 Documentation/gitformat-pack.txt |   4 +-
 builtin/gc.c                     |   8 +-
 builtin/repack.c                 |   2 +-
 pack-write.c                     |  14 ++--
 repo-settings.c                  |   4 +-
 repository.h                     |   1 -
 t/t5304-prune.sh                 |  28 +++----
 t/t6500-gc.sh                    | 135 ++++++++++++++++---------------
 t/t6501-freshen-objects.sh       |  10 +--
 t/t9300-fast-import.sh           |  13 +--
 13 files changed, 120 insertions(+), 126 deletions(-)

Range-diff against v1:
 1:  65ac7ed9b8 =  1:  c477b754e7 pack-write.c: plug a leak in stage_tmp_packfiles()
 2:  fbc8d15032 =  2:  52fb61fa9c builtin/repack.c: fix incorrect reference to '-C'
 3:  796df920ad !  3:  63b9ee8e2e builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
    @@ Commit message
           - The same is true for `gc.bigPackThreshold`, if the size of the cruft
             pack exceeds the limit set by the caller.
     
    -    Ignore cruft packs in the common implementation for both of these
    -    options, and add a pair of tests to prevent any future regressions here.
    +    In the future, it is possible that `gc.bigPackThreshold` could be used
    +    to write a separate cruft pack containing any new unreachable objects
    +    that entered the repository since the last time a cruft pack was
    +    written.
    +
    +    There are some complexities to doing so, mainly around handling
    +    pruning objects that are in an existing cruft pack that is above the
    +    threshold (which would either need to be rewritten, or else delay
    +    pruning). Rewriting a substantially similar cruft pack isn't ideal, but
    +    it is significantly better than the status-quo.
    +
    +    If users have large cruft packs that they don't want to rewrite, they
    +    can mark them as `*.keep` packs. But in general, if a repository has a
    +    cruft pack that is so large it is slowing down GC's, it should probably
    +    be pruned anyway.
    +
    +    In the meantime, ignore cruft packs in the common implementation for
    +    both of these options, and add a pair of tests to prevent any future
    +    regressions here.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
    @@ Documentation/git-gc.txt: be performed as well.
     -	All packs except the largest pack and those marked with a
     -	`.keep` files are consolidated into a single pack. When this
     -	option is used, `gc.bigPackThreshold` is ignored.
    -+	All packs except the largest pack, any packs marked with a
    -+	`.keep` file, and any cruft pack(s) are consolidated into a
    -+	single pack. When this option is used, `gc.bigPackThreshold` is
    -+	ignored.
    ++	All packs except the largest non-cruft pack, any packs marked
    ++	with a `.keep` file, and any cruft pack(s) are consolidated into
    ++	a single pack. When this option is used, `gc.bigPackThreshold`
    ++	is ignored.
      
      AGGRESSIVE
      ----------
 4:  44006da959 =  4:  905eeb9027 t/t5304-prune.sh: prepare for `gc --cruft` by default
 8:  4ccc525c39 !  5:  fa6eafb1fe t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
    @@ Commit message
         cover the case of freshening loose objects not using cruft packs.
     
         We could run this test twice, once with `--cruft` and once with
    -    `--no-cruft`, but doing so is unnecessary, since the object rescuing and
    -    freshening behavior is already extensively tested via t5329.
    +    `--no-cruft`, but doing so is unnecessary, since we already test object
    +    rescuing, freshening, and dealing with corrupt parts of the unreachable
    +    object graph extensively via t5329.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
 6:  56a965e517 =  6:  e6270d75fa t/t6500-gc.sh: refactor cruft pack tests
 7:  6957e54f51 !  7:  9db3fa9e36 t/t6500-gc.sh: add additional test cases
    @@ Commit message
         which enumerates all possible combinations of arguments that will
         produce (or not produce) a cruft pack.
     
    -    This prepares us for the following commit, which will change the default
    +    This prepares us for a future commit which will change the default value
         of `gc.cruftPacks` by ensuring that we understand which invocations do
         and do not change as a result.
     
    @@ t/t6500-gc.sh: do
      done
      
      for argv in \
    -+	"gc --no-cruft" \
    ++	"gc" \
     +	"-c gc.cruftPacks=false gc" \
     +	"-c gc.cruftPacks=true gc --no-cruft" \
      	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
 5:  1b07eb83fe !  8:  894cf176ea t/t9300-fast-import.sh: prepare for `gc --cruft` by default
    @@ Metadata
      ## Commit message ##
         t/t9300-fast-import.sh: prepare for `gc --cruft` by default
     
    -    In a similar fashion as the previous commit, adjust the fast-import
    -    tests to prepare for "git gc" generating a cruft pack by default.
    +    In a similar fashion as previous commits, adjust the fast-import tests
    +    to prepare for "git gc" generating a cruft pack by default.
     
         This adjustment is slightly different, however. Instead of relying on us
         writing out the objects loose, and then calling `git prune` to remove
    @@ Commit message
         one `git gc --prune`, which handles pruning both loose objects, and
         objects that would otherwise be written to a cruft pack.
     
    +    Likely this pattern of "git gc && git prune" started all the way back in
    +    03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
    +    happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
    +    deprecate --prune, it now really has no effect, 2008-05-09).
    +
    +    After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
    +    again by accepting an optional parameter, 2009-02-14), this script got a
    +    handful of new "git gc && git prune" instances via via 4cedb78cb5
    +    (fast-import: add input format tests, 2011-08-11). These could have been
    +    `git gc --prune`, but weren't (likely taking after 03db4525d3).
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## t/t9300-fast-import.sh ##
 9:  bfda40a21d !  9:  b6784ddfe2 builtin/gc.c: make `gc.cruftPacks` enabled by default
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      }
      
      for argv in \
    +-	"gc --cruft" \
     +	"gc" \
    - 	"gc --cruft" \
      	"-c gc.cruftPacks=true gc" \
     -	"-c gc.cruftPacks=false gc --cruft" \
     -	"-c feature.experimental=true gc" \
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      do
      	test_expect_success "git $argv generates a cruft pack" '
      		test_when_finished "rm -fr repo" &&
    -@@ t/t6500-gc.sh: done
    +@@ t/t6500-gc.sh: do
    + done
    + 
      for argv in \
    - 	"gc --no-cruft" \
    +-	"gc" \
    ++	"gc --no-cruft" \
      	"-c gc.cruftPacks=false gc" \
     -	"-c gc.cruftPacks=true gc --no-cruft" \
     -	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
10:  fa15125454 = 10:  c67ee7c2ff repository.h: drop unused `gc_cruft_packs`
-- 
2.40.0.362.gc67ee7c2ff

  parent reply	other threads:[~2023-04-18 20:41 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
2023-04-18 10:30   ` Jeff King
2023-04-18 19:40     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
2023-04-17 20:54 ` [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
2023-04-17 22:54   ` Junio C Hamano
2023-04-17 23:03     ` Taylor Blau
2023-04-18 10:39       ` Jeff King
2023-04-18 14:54         ` Derrick Stolee
2023-04-17 20:54 ` [PATCH 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-17 20:54 ` [PATCH 05/10] t/t9300-fast-import.sh: " Taylor Blau
2023-04-18 10:43   ` Jeff King
2023-04-18 19:44     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
2023-04-17 20:54 ` [PATCH 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
2023-04-18 10:48   ` Jeff King
2023-04-18 19:48     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 10:56   ` Jeff King
2023-04-18 19:50     ` Taylor Blau
2023-04-22 11:23       ` Jeff King
2023-04-17 20:54 ` [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
2023-04-18 11:00   ` Jeff King
2023-04-18 19:52     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
2023-04-18 11:02   ` Jeff King
2023-04-18 11:04 ` [PATCH 00/10] gc: enable cruft packs by default Jeff King
2023-04-18 19:53   ` Taylor Blau
2023-04-18 20:40 ` Taylor Blau [this message]
2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
2023-04-19 22:00     ` Junio C Hamano
2023-04-20 16:31       ` Taylor Blau
2023-04-20 16:57         ` Junio C Hamano
2023-04-18 20:40   ` [PATCH v2 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
2023-04-18 20:40   ` [PATCH v2 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
2023-04-18 20:40   ` [PATCH v2 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 20:40   ` [PATCH v2 05/10] t/t6501-freshen-objects.sh: " Taylor Blau
2023-04-18 20:40   ` [PATCH v2 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
2023-04-18 20:40   ` [PATCH v2 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
2023-04-18 20:40   ` [PATCH v2 08/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 20:40   ` [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
2023-04-19 22:22     ` Junio C Hamano
2023-04-20 17:24       ` Taylor Blau
2023-04-20 17:31         ` Junio C Hamano
2023-04-20 19:19           ` Taylor Blau
2023-04-18 20:41   ` [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
2023-04-19 22:19     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1681850424.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).