git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, me@ttaylorr.com, gitster@pobox.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 1/2] repack: respect kept objects with '--write-midx -b'
Date: Fri, 17 Dec 2021 12:24:06 -0500	[thread overview]
Message-ID: <YbzHtsl045XZbJGN@coredump.intra.peff.net> (raw)
In-Reply-To: <1ed91f6d255b76bdbdcccea7e1effcebbb263ced.1639758526.git.gitgitgadget@gmail.com>

On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> Historically, we needed a single packfile in order to have reachability
> bitmaps. This introduced logic that when 'git repack' had a '-b' option
> that we should stop sending the '--honor-pack-keep' option to the 'git
> pack-objects' child process, ensuring that we create a packfile
> containing all reachable objects.
> 
> In the world of multi-pack-index bitmaps, we no longer need to repack
> all objects into a single pack to have valid bitmaps. Thus, we should
> continue sending the '--honor-pack-keep' flag to 'git pack-objects'.
> 
> The fix is very simple: only disable the flag when writing bitmaps but
> also _not_ writing the multi-pack-index.
> 
> This opens the door to new repacking strategies that might want to keep
> some historical set of objects in a stable pack-file while only
> repacking more recent objects.

That all makes sense. Another way of thinking about it: we disable
--honor-pack-keep so we that keep packs do not interfere with writing a
pack bitmap. But with --write-midx, we skip the pack bitmap entirely.

In the end it's the same, but I wanted to emphasize that the important
hing is not so much that we are writing a midx bitmap as that we are
_not_ writing a pack bitmap. And that is what makes this OK to do (that
the repack code already disables the pack bitmap when writing a midx
one).

> diff --git a/builtin/repack.c b/builtin/repack.c
> index 9b0be6a6ab3..1f128b7c90b 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  		write_bitmaps = 0;
>  	}
>  	if (pack_kept_objects < 0)
> -		pack_kept_objects = write_bitmaps > 0;
> +		pack_kept_objects = write_bitmaps > 0 && !write_midx;

So the code change here looks good.

> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0260ad6f0e0..8c4ba6500be 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -372,4 +372,19 @@ test_expect_success '--write-midx with preferred bitmap tips' '
>  	)
>  '
>  
> +test_expect_success '--write-midx -b packs non-kept objects' '
> +	git init midx-kept &&
> +	test_when_finished "rm -fr midx-kept" &&
> +	(
> +		cd midx-kept &&
> +		test_commit_bulk 100 &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> +			git repack --write-midx -a -b &&
> +		cat trace.txt | \
> +			grep \"event\":\"start\" | \
> +			grep pack-objects | \
> +			grep \"--honor-pack-keep\"
> +	)
> +'

This looks correct, though:

  - do we really need this separate repo directory? The test before it
    uses one, but only because it needs a very specific set of commits.
    There is a long-running "midx" directory we could use, though I
    think just the regular test repo would be fine, too.

  - likewise, do we need 100 commits? They are not too expensive these
    days with test_commit_bulk, but I think we don't even care about the
    repo contents at all.

  - there is no actual .keep file in your test. That's OK, as we are
    just checking the args passed to pack-objects.

  - useless use of cat. :) Also, you could probably do it all with one
    grep. This is bikeshedding, of course, but it's nice to keep process
    counts low in the test suite. Also, your middle grep is looser than
    the others (it doesn't check for surrounding quotes).

So something like this would work:

  test_expect_success '--write-midx -b packs non-kept objects' '
          GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \
                  git -C midx repack --write-midx -a -b &&
          grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \
                  midx-keep-bitmap.trace
  '

One could perhaps argue that the combined grep is less readable. If
that's a concern, I'd suggest wrapping it in a function like:

  # usage: check_trace2_arg <trace_file> <cmd> <arg>
  check_trace2_arg () {
	grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1"
  }

All just suggestions, of course. I'd be happy enough to see the patch go
in as-is.

-Peff

  reply	other threads:[~2021-12-17 17:24 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-17 16:28 [PATCH 0/2] Two small 'git repack' fixes Derrick Stolee via GitGitGadget
2021-12-17 16:28 ` [PATCH 1/2] repack: respect kept objects with '--write-midx -b' Derrick Stolee via GitGitGadget
2021-12-17 17:24   ` Jeff King [this message]
2021-12-20 13:40     ` Derrick Stolee
2021-12-20 13:50       ` Jeff King
2021-12-18  9:58   ` Ævar Arnfjörð Bjarmason
2021-12-17 16:28 ` [PATCH 2/2] repack: make '--quiet' disable progress Derrick Stolee via GitGitGadget
2021-12-17 18:10   ` Jeff King
2021-12-20 13:37     ` Derrick Stolee
2021-12-20 13:49       ` Jeff King
2021-12-20 14:46         ` Derrick Stolee
2021-12-18  9:55   ` Ævar Arnfjörð Bjarmason
2021-12-20 13:38     ` Derrick Stolee
2021-12-20 14:48 ` [PATCH v2 0/2] Two small 'git repack' fixes Derrick Stolee via GitGitGadget
2021-12-20 14:48   ` [PATCH v2 1/2] repack: respect kept objects with '--write-midx -b' Derrick Stolee via GitGitGadget
2021-12-20 14:48   ` [PATCH v2 2/2] repack: make '--quiet' disable progress Derrick Stolee via GitGitGadget
2021-12-20 19:01   ` [PATCH v2 0/2] Two small 'git repack' fixes Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YbzHtsl045XZbJGN@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).