git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Kousik Sanagavarapu <five231003@gmail.com>
Cc: git@vger.kernel.org, jonathantanmy@google.com
Subject: Re: [PATCH v2] index-pack: remove fetch_if_missing=0
Date: Fri, 10 Mar 2023 12:30:58 -0800	[thread overview]
Message-ID: <xmqqzg8k4ad9.fsf@gitster.g> (raw)
In-Reply-To: <20230310183029.19429-1-five231003@gmail.com> (Kousik Sanagavarapu's message of "Sat, 11 Mar 2023 00:00:29 +0530")

Kousik Sanagavarapu <five231003@gmail.com> writes:

> This global was added as a temporary measure to suppress the fetching
> of missing objects and can be removed once the commandshave been taught
> to handle these cases.

Two requests.

 * Could you substantiate this claim for future readers of "git
   log"?  A reference to an old mailing list discussion or a log
   message of the commit that added the temporary measure that says
   the above plan would be perfect.

 * What exactly does "once the commands have been taught"?  Which
   commands?  Could you clarify?

> Hence, use has_object() to check for the existence of an object, which
> has the default behavior of not lazy-fetching in a partial clone. It is
> worth mentioning that this is the only place where there is potential for
> lazy-fetching and all other cases are properly handled, making it safe to
> remove this global here.

This paragraph is very well explained.

> @@ -1728,14 +1727,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
>  	int report_end_of_input = 0;
>  	int hash_algo = 0;
>  
> -	/*
> -	 * index-pack never needs to fetch missing objects except when
> -	 * REF_DELTA bases are missing (which are explicitly handled). It only
> -	 * accesses the repo to do hash collision checks and to check which
> -	 * REF_DELTA bases need to be fetched.
> -	 */

OK.  The comment describes the design choice we made to flip the
fetch_if_missing flag off.  The old world-view was that we would
notice a breakage by non-functioning index-pack when a lazy clone is
missing objects that we need by disabling auto-fetching, and we
instead explicitly handle any missing and necessary objects by lazy
fetching (like "when we lack REF_DELTA bases").  It does sound like
a conservative thing to do, compared to the opposite approach we are
taking with this patch, i.e. we would not fail if we tried to access
objects we do not need to, because we have lazy fetching enabled,
and we just ended up with bloated object store nobody may notice.

To protect us from future breakage that can come from the new
approach, it is a very good thing that you added new tests to ensure
no unnecessary lazy fetching is done (I am not offhand sure if that
test is sufficient, though).

> -	fetch_if_missing = 0;

Looking good to me.  Jonathan, who reviewed the previous round, do
you have any comments?

Thanks, all.  Will queue.

> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index f519d2a87a..46af8698ce 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -644,6 +644,41 @@ test_expect_success 'repack does not loosen promisor objects' '
>  	grep "loosen_unused_packed_objects/loosened:0" trace
>  '
>  
> +test_expect_success 'index-pack does not lazy-fetch when checking for sha1 collsions' '
> +	rm -rf server promisor-remote client repo trace &&
> +
> +	# setup
> +	git init server &&
> +	for i in 1 2 3 4
> +	do
> +		echo $i >server/file$i &&
> +		git -C server add file$i &&
> +		git -C server commit -am "Commit $i" || return 1
> +	done &&
> +	git -C server config --local uploadpack.allowFilter 1 &&
> +	git -C server config --local uploadpack.allowAnySha1InWant 1 &&
> +	HASH=$(git -C server hash-object file3) &&
> +
> +	git init promisor-remote &&
> +	git -C promisor-remote fetch --keep "file://$(pwd)/server" &&
> +
> +	git clone --no-checkout --filter=blob:none "file://$(pwd)/server" client &&
> +	git -C client remote set-url origin "file://$(pwd)/promisor-remote" &&
> +	git -C client config extensions.partialClone 1 &&
> +	git -C client config remote.origin.promisor 1 &&
> +
> +	git init repo &&
> +	echo "5" >repo/file5 &&
> +	git -C repo config --local uploadpack.allowFilter 1 &&
> +	git -C repo config --local uploadpack.allowAnySha1InWant 1 &&
> +
> +	# verify that no lazy-fetching is done when fetching from another repo
> +	GIT_TRACE_PACKET="$(pwd)/trace" git -C client \
> +					fetch --keep "file://$(pwd)/repo" main &&
> +
> +	! grep "want $HASH" trace
> +'
> +
>  test_expect_success 'lazy-fetch in submodule succeeds' '
>  	# setup
>  	test_config_global protocol.file.allow always &&

  reply	other threads:[~2023-03-10 20:33 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-25  5:24 [PATCH] index-pack: remove fetch_if_missing=0 Kousik Sanagavarapu
2023-02-27 16:56 ` Kousik Sanagavarapu
2023-02-27 22:14 ` Jonathan Tan
2023-02-28  3:54   ` Kousik Sanagavarapu
2023-03-10 18:30 ` [PATCH v2] " Kousik Sanagavarapu
2023-03-10 20:30   ` Junio C Hamano [this message]
2023-03-10 21:13     ` Jonathan Tan
2023-03-10 21:41       ` Junio C Hamano
2023-03-11  2:59         ` Jonathan Tan
2023-03-12 17:16         ` Kousik Sanagavarapu
2023-03-11  6:22       ` [PATCH] " Kousik Sanagavarapu
2023-03-11  6:00     ` Kousik Sanagavarapu
2023-03-13 18:15   ` [PATCH v3] " Kousik Sanagavarapu
2023-03-13 19:17     ` Junio C Hamano
2023-03-13 19:18     ` Junio C Hamano
2023-03-17 17:56     ` [PATCH v4] " Kousik Sanagavarapu
2023-03-17 22:58       ` Junio C Hamano
2023-03-19  6:17         ` Kousik Sanagavarapu
2023-03-11 20:01 ` [PATCH] " Sean Allred
2023-03-11 20:37   ` Junio C Hamano
     [not found] <--in-reply-to=20230225052439.27096-1-five231003@gmail.com>
2023-03-10 18:20 ` [PATCH v2] " Kousik Sanagavarapu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqzg8k4ad9.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=five231003@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).