git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee <derrickstolee@github.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 3/3] pack-bitmap.c: use commit boundary during bitmap traversal
Date: Mon, 8 May 2023 18:12:01 -0400	[thread overview]
Message-ID: <ZFlzsYAREyA+6NLX@nand.local> (raw)
In-Reply-To: <673a2b60-0556-fc2b-edf7-f2a19ef5dcbe@github.com>

On Mon, May 08, 2023 at 04:53:35PM -0400, Derrick Stolee wrote:
> On 5/8/2023 1:38 PM, Taylor Blau wrote:
>
> > -	/*
> > -	 * if we have a HAVES list, but none of those haves is contained
> > -	 * in the packfile that has a bitmap, we don't have anything to
> > -	 * optimize here
> > -	 */
> > -	if (haves && !in_bitmapped_pack(bitmap_git, haves))
> > -		goto cleanup;
> > +	use_boundary_traversal = git_env_bool(GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL, -1);
> > +	if (use_boundary_traversal < 0) {
> > +		prepare_repo_settings(revs->repo);
> > +		use_boundary_traversal = revs->repo->settings.pack_use_bitmap_boundary_traversal;
> > +	}
> > +
> > +	if (!use_boundary_traversal) {
> > +		/*
> > +		 * if we have a HAVES list, but none of those haves is contained
> > +		 * in the packfile that has a bitmap, we don't have anything to
> > +		 * optimize here
> > +		 */
> > +		if (haves && !in_bitmapped_pack(bitmap_git, haves))
> > +			goto cleanup;
> > +	}
>
> I was reading along, nodding my head, until I came across this comment.
> I recognize that it's moved from an existing place, but this seems very
> strange and incorrect.
>
> Is this implying that if the 'haves' are not in the bitmapped pack, then
> we _can't_ construct a bitmap representing the objects they can reach?
>
> Do we not have the ability to extend the object order beyond the bitmapped
> pack in-memory in a way that allows us to provide bit positions for the
> objects not in the bitmapped pack?

It is a strange heuristic indeed, and I had the same thought as you the
first time I remember encountering this part of the code.

in_bitmapped_pack() asks whether there is at least one of the given set
of objects which appears in the pack/MIDX corresponding with our bitmap.
In other words, this heuristic causes us to bail when none of the
objects listed as haves appear in the corresponding pack/MIDX.

Strictly speaking, there isn't a fundamental limitation preventing us
from using the bitmap traversal in cases where none of the tips exist in
the bitmapped pack/MIDX. That's what the extended index is for (see
`bitmap_position_extended()` and `ext_index_add_object()` for more
details).

AFAICT, the heuristic here is another way of saying, "if none of these
haves appear in the corresponding pack or MIDX, my bitmap coverage is
likely bad enough that it's not worth trying to construct a full bitmap
on the haves side (and we should fall back to the ordinary object
traversal)".

> I can understand saying "it might be more expensive to construct a bitmap
> here, because we need to walk into the bitmapped pack before we have a
> hope of hitting a bitmap." However, that's far from "we don't have anything
> to optimize".

Agreed.

> Something to think about removing in the future, but not a blocker for
> this series. Getting it out of the way for the boundary-based walk makes
> even more sense because the commits to check are those in the boundary,
> not the haves themselves.

Yeah. I think that what we do here will end up depending on the
performance of the boundary based approach. If it's a clear enough win
in a majority of cases, I'd just as soon drop the classic traversal (and
with it, this heuristic).

But if we do end up keeping the classic traversal around for a while, I
absolutely agree that we should investigate and consider dropping this
optimization, which I have long-suspected as not being very useful.

> > +test_expect_success 'boundary-based traversal is used when requested' '
> > +	git repack -a -d --write-bitmap-index &&
> > +
> > +	for argv in \
> > +		"git -c pack.useBitmapBoundaryTraversal=true" \
> > +		"git -c feature.experimental=true" \
> > +		"GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1 git"
> > +	do
> > +		eval "GIT_TRACE2_EVENT=1 $argv rev-list --objects \
> > +			--use-bitmap-index second..other 2>perf" &&
> > +		grep "\"region_enter\".*\"label\":\"haves/boundary\"" perf ||
> > +			return 1
> > +	done &&
> > +
> > +	for argv in \
> > +		"git -c pack.useBitmapBoundaryTraversal=false" \
> > +		"git -c feature.experimental=true -c pack.useBitmapBoundaryTraversal=false" \
> > +		"GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=0 git -c pack.useBitmapBoundaryTraversal=true" \
> > +		"GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=0 git -c feature.experimental=true"
>
> This ordering (GIT_TEST_*=0 overrides config) seems backwards to me, but
> it doesn't really matter since it's a GIT_TEST_* variable. Thanks for
> including tests so the order is documented.

Now that you mention it, I think it's backwards too ;-). But I agree
that documenting it here is sufficient (and ideally, we don't have to
live with this GIT_TEST variable for more than a release or two,
anyway).

> > +	do
> > +		eval "GIT_TRACE2_EVENT=1 $argv rev-list --objects \
> > +			--use-bitmap-index second..other 2>perf" &&
> > +		grep "\"region_enter\".*\"label\":\"haves/classic\"" perf ||
> > +			return 1
>
> nit: you should be able to use 'test_region' here. Probably not worth
> a re-roll, as everything else looks good to me.

Ah, I didn't know of `test_region` before. I'll keep it in mind, thanks!

Thanks,
Taylor

  reply	other threads:[~2023-05-08 22:12 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-25  0:00 [PATCH 0/3] pack-bitmap: boundary-based bitmap traversal Taylor Blau
2023-04-25  0:00 ` [PATCH 1/3] revision: support tracking uninteresting commits Taylor Blau
2023-04-25 18:15   ` Derrick Stolee
2023-05-03 21:48     ` Taylor Blau
2023-05-04 13:46       ` Derrick Stolee
2023-05-03 22:08     ` Taylor Blau
2023-05-04 13:59       ` Derrick Stolee
2023-05-05 17:30         ` Taylor Blau
2023-04-25 18:22   ` Junio C Hamano
2023-04-25 18:48     ` Taylor Blau
2023-04-25  0:00 ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-04-25 18:32   ` Junio C Hamano
2023-04-25 18:51     ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()`t Taylor Blau
2023-04-25  0:00 ` [PATCH 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-04-25 18:52   ` Junio C Hamano
2023-05-02 21:31     ` Felipe Contreras
2023-05-03 21:42     ` Taylor Blau
2023-05-03 21:58       ` Junio C Hamano
2023-04-25 18:53   ` Derrick Stolee
2023-04-25 18:02 ` [PATCH 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-04-25 18:30   ` Derrick Stolee
2023-04-25 18:57     ` Taylor Blau
2023-04-25 19:52       ` Derrick Stolee
2023-05-03 21:43         ` Taylor Blau
2023-04-25 18:06 ` Derrick Stolee
2023-04-25 19:01   ` Taylor Blau
2023-04-25 20:27     ` Derrick Stolee
2023-05-01 22:08 ` Junio C Hamano
2023-05-02 23:52   ` Taylor Blau
2023-05-05 17:27 ` [PATCH v2 0/2] " Taylor Blau
2023-05-05 17:27   ` [PATCH v2 1/2] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-05 17:27   ` [PATCH v2 2/2] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-05 18:13     ` Derrick Stolee
2023-05-05 18:43       ` Taylor Blau
2023-05-05 17:33   ` [PATCH v2 0/2] pack-bitmap: boundary-based " Junio C Hamano
2023-05-05 17:59   ` Derrick Stolee
2023-05-05 18:46     ` [PATCH v2 0/2] pack-bitmap: boundary-based bitmap traversalt Taylor Blau
2023-05-05 20:45       ` Derrick Stolee
2023-05-08 17:38 ` [PATCH v3 0/3] pack-bitmap: boundary-based bitmap traversal Taylor Blau
2023-05-08 17:38   ` [PATCH v3 1/3] object: add object_array initializer helper function Taylor Blau
2023-05-08 17:38   ` [PATCH v3 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-08 17:38   ` [PATCH v3 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-08 20:53     ` Derrick Stolee
2023-05-08 22:12       ` Taylor Blau [this message]
2023-05-10 22:55   ` [PATCH v3 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-05-10 23:10     ` Taylor Blau
2023-05-11 15:23       ` Derrick Stolee
2023-06-08 16:25 ` [PATCH v4 " Taylor Blau
2023-06-08 16:25   ` [PATCH v4 1/3] object: add object_array initializer helper function Taylor Blau
2023-06-08 16:25   ` [PATCH v4 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-06-08 16:25   ` [PATCH v4 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZFlzsYAREyA+6NLX@nand.local \
    --to=me@ttaylorr.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).