git@vger.kernel.org mailing list mirror (one of many)
From: Han Xin <hanxin.hx@bytedance.com>
To: hanxin.hx@bytedance.com
Cc: chiyutianyi@gmail.com, derrickstolee@github.com,
	git@vger.kernel.org, haiyangtand@gmail.com,
	jonathantanmy@google.com, me@ttaylorr.com,
	Junio C Hamano <gitster@pobox.com>,
	ps@pks.im
Subject: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()
Date: Tue, 28 Jun 2022 10:02:50 +0800
Message-ID: <cover.1656381667.git.hanxin.hx@bytedance.com>
In-Reply-To: <cover.1656044659.git.hanxin.hx@bytedance.com>

This series fixes the following issue:
When lookup_commit_in_graph() finds a commit in the commit-graph but
the commit is missing from the repository, it ends up calling
promisor_remote_get_direct(), which starts another fetch.

That fetch goes through the same steps again, so we end up in an
endless loop:
  git fetch -> deref_without_lazy_fetch() ->
    lookup_commit_in_graph() -> repo_has_object_file() ->
      promisor_remote_get_direct() -> fetch_objects() ->
        git fetch (a new loop round)
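
To make the shape of that loop easier to see, here is a toy model in
plain shell. It is only a sketch: the functions borrow the names from
the call chain above but never call into git, object_exists and
model_fetch are hypothetical stand-ins, and max_depth is an artificial
guard added so that the model terminates. In real git every round is a
new "git fetch" process, not a nested function call.

```shell
#!/bin/sh
# Toy model of the call chain above. Nothing here calls git; the
# function names only mirror the C functions for readability.

depth=0        # number of modeled "git fetch" rounds started so far
max_depth=5    # hypothetical guard so that the model terminates

# Modeled existence check. "lazy" falls back to a promisor fetch when
# the object is missing locally; "local" only looks at local objects.
object_exists () {
	if test "$1" = lazy
	then
		model_fetch "$1"
	fi
	return 1   # the commit is never available locally in this model
}

lookup_commit_in_graph () {
	object_exists "$1"
}

model_fetch () {
	depth=$((depth + 1))
	if test "$depth" -ge "$max_depth"
	then
		return 1   # stop here instead of looping forever
	fi
	lookup_commit_in_graph "$1"
}

depth=0; model_fetch lazy || :;  lazy_rounds=$depth
depth=0; model_fetch local || :; local_rounds=$depth

echo "lazy check:  $lazy_rounds fetch rounds (stopped by the guard)"
echo "local check: $local_rounds fetch round"
```

With the lazy check, the model only stops because of the guard; with
the local-only check, a single round is enough to report the commit as
missing and hand control back to the caller.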

Changes since v2:

* Drop the test_have_prereq check from the ULIMIT_PROCESSES
  prerequisite, as "run_with_limited_processses true" alone is enough.

* Teach run_with_limited_processses() to support dash and zsh.

* Skip the whole test file if ulimit is not available.

* Minor grammar/comment etc. fixes throughout.

Han Xin (2):
  test-lib.sh: add limited processes to test-lib
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()

 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++
 t/test-lib.sh                              | 16 +++++++
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

Range-diff against v2:
1:  442a4c351d ! 1:  ad0a539759 test-lib.sh: add limited processes to test-lib
    @@ t/test-lib.sh: test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
      '
      
     +run_with_limited_processses () {
    -+	(ulimit -u 512 && "$@")
    ++	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
    ++	if test -n "$BASH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	elif test -n "$KSH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	fi
    ++	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
     +}
     +
     +test_lazy_prereq ULIMIT_PROCESSES '
    -+	test_have_prereq !HPPA,!MINGW,!CYGWIN &&
     +	run_with_limited_processses true
     +'
     +
2:  a7d456db9b ! 2:  3cdb1abd43 commit-graph.c: no lazy fetch in lookup_commit_in_graph()
    @@ Metadata
      ## Commit message ##
         commit-graph.c: no lazy fetch in lookup_commit_in_graph()
     
    -    If a commit is in the commit graph, we would expect the commit to also
    -    be present. So we should use has_object() instead of
    -    repo_has_object_file(), which will help us avoid getting into an endless
    -    loop of lazy fetch.
    +    The commit-graph is used to opportunistically optimize accesses to
    +    certain pieces of information on commit objects, and
    +    lookup_commit_in_graph() tries to say "no" when the requested commit
    +    does not locally exist by returning NULL, in which case the caller
    +    can ask for (which may result in on-demand fetching from a promisor
    +    remote) and parse the commit object itself.
     
    -    When we found the commit in the graph in lookup_commit_in_graph(), but
    -    the commit is missing from the repository, we will try
    -    promisor_remote_get_direct() and then enter another loop. While
    -    sometimes it will finally succeed because it cannot fork subprocess,
    -    it has exhausted the local process resources and can be harmful to the
    -    remote service.
    +    However, it uses a wrong helper, repo_has_object_file(), to do so.
    +    This helper not only checks if an object is immediately available in
    +    the local object store, but also tries to fetch from a promisor remote.
    +    But the fetch machinery calls lookup_commit_in_graph(), thus causing an
    +    infinite loop.
    +
    +    We should make lookup_commit_in_graph() expect that a commit given to it
    +    can be legitimately missing from the local object store, by using the
    +    has_object_file() helper instead.
     
         Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
     
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +
     +. ./test-lib.sh
     +
    ++if ! test_have_prereq ULIMIT_PROCESSES
    ++then
    ++	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
    ++	test_done
    ++fi
    ++
     +test_expect_success 'setup: prepare a repository with a commit' '
     +	git init with-commit &&
     +	test_commit -C with-commit the-commit &&
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +	# create a ref that points to the commit in alternates
     +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
     +	# prepare some other objects to commit-graph
    -+	test_commit -C with-commit-graph somthing &&
    ++	test_commit -C with-commit-graph something &&
     +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
     +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
     +'
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +	test_must_fail git -C with-commit-graph cat-file -e $oid
     +'
     +
    -+test_expect_success 'setup: prepare another commit to fetch' '
    -+	test_commit -C with-commit another-commit &&
    ++test_expect_success 'setup: prepare any commit to fetch' '
    ++	test_commit -C with-commit any-commit &&
     +	anycommit=$(git -C with-commit rev-parse HEAD)
     +'
     +
    -+test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
    ++test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
     +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
     +	git -C with-commit-graph config remote.origin.promisor true &&
     +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
-- 
2.36.1


