git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Re: An endless loop fetching issue with partial clone, alternates and commit graph
@ 2022-06-14  7:25 Haiyng Tan
  2022-06-15  2:18 ` Taylor Blau
  0 siblings, 1 reply; 51+ messages in thread
From: Haiyng Tan @ 2022-06-14  7:25 UTC (permalink / raw)
  To: git, chiyutianyi; +Cc: ps, jonathantanmy

On Mon, 13 Jun 2022 00:17:07 +0800, Han Xin wrote:
> We found an issue that could create an endless loop where alternates
> objects are used improperly.
>
> While do fetching in a partial cloned repository with a commit graph,
> deref_without_lazy_fetch_extended() will call lookup_commit_in_graph()
> to find the commit object. We can found the code in commit-graph.c:
>
>      struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
>      {
>           …
>           if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>                return NULL;
>           if (!repo_has_object_file(repo, id))
>                return NULL;

> If we found the object in the commit graph, but missing it in the repository,
> we will go into an endless loop:
>      git fetch -> deref_without_lazy_fetch_extended() ->
>           lookup_commit_in_graph() -> repo_has_object_file() ->
>                promisor_remote_get_direct() -> fetch_objects() ->
>                     git fetch
>
> I know that the reason for this issue is due to improper use of
> alternates, we can ensure that objects will not be lost by maintaining
> all the references. But shouldn't we do something about this unusual
> usage, it will cause a fetch bombardment of the remote git service.
>
> We can reproduce this issue with the following test case, it will
> generate a lot of git processes, please be careful to stop it.
> ———————————————————————————
> #!/bin/sh
>
> test_description='test for an endless loop fetching’
>
> GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> . ./test-lib.sh
>
> test_expect_success 'setup’ ‘
>     git init --bare dest.git &&
>     test_commit one &&
>    git checkout -b testbranch &&
>    test_commit two &&
>    git push dest.git --all
> '
>
> test_expect_success 'prepare a alternates repository without testbranch' '
>    git clone -b $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME dest.git alternates &&
>    oid=$(git -C alternates rev-parse refs/remotes/origin/testbranch) &&
>    git -C alternates update-ref -d refs/remotes/origin/testbranch &&
>    git -C alternates gc --prune=now
> '
>
> test_expect_success 'prepare a repository with commit-graph' '
>    git init source &&
>    echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
>    git -C source remote add origin "$(pwd)/dest.git" &&
>    git -C source config remote.origin.promisor true &&
>    git -C source config remote.origin.partialclonefilter blob:none &&
>    git -C source fetch origin &&
>    (
>        cd source &&
>        test_commit three &&
>        git -c gc.writeCommitGraph=true gc
>    )
> '
>
> test_expect_success 'change alternates' '
>    echo "$(pwd)/alternates/.git/objects" >source/.git/objects/info/alternates &&
>    # this will bring an endless loop fetching
>    git -C source fetch origin $oid
> '
>
> test_done
>
> ------------------------------------------------------
>
> Thanks
> -Han Xin

I think it's caused by using lazy-fetch in deref_without_lazy_fetch_extended().
In lookup_commit_in_graph(), lazy-fetch is initiated by
repo_has_object_file() used.
has_object() should be used, it's no-lazy-fetch.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* An endless loop fetching issue with partial clone, alternates and commit graph
@ 2022-06-12 16:17 Han Xin
  0 siblings, 0 replies; 51+ messages in thread
From: Han Xin @ 2022-06-12 16:17 UTC (permalink / raw)
  To: git; +Cc: Han Xin, Patrick Steinhardt

We found an issue that could create an endless loop where alternates
objects are used improperly.

While do fetching in a partial cloned repository with a commit graph,
deref_without_lazy_fetch_extended() will call lookup_commit_in_graph()
to find the commit object. We can found the code in commit-graph.c:

     struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
     {
          ...
          if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
               return NULL;
          if (!repo_has_object_file(repo, id))
               return NULL;

If we found the object in the commit graph, but missing it in the repository,
we will go into an endless loop:
     git fetch -> deref_without_lazy_fetch_extended() -> 
          lookup_commit_in_graph() -> repo_has_object_file() -> 
               promisor_remote_get_direct() -> fetch_objects() ->
                    git fetch

I know that the reason for this issue is due to improper use of
alternates, we can ensure that objects will not be lost by maintaining
all the references. But shouldn't we do something about this unusual
usage, it will cause a fetch bombardment of the remote git service.

We can reproduce this issue with the following test case, it will
generate a lot of git processes, please be careful to stop it.
------------------------------------------------------
#!/bin/sh

test_description='test for an endless loop fetching'

GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

. ./test-lib.sh

test_expect_success 'setup' '
    git init --bare dest.git &&
    test_commit one &&
    git checkout -b testbranch &&
    test_commit two &&
    git push dest.git --all
'

test_expect_success 'prepare a alternates repository without testbranch' '
    git clone -b $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME dest.git alternates &&
    oid=$(git -C alternates rev-parse refs/remotes/origin/testbranch) &&
    git -C alternates update-ref -d refs/remotes/origin/testbranch &&
    git -C alternates gc --prune=now
'

test_expect_success 'prepare a repository with commit-graph' '
    git init source &&
    echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
    git -C source remote add origin "$(pwd)/dest.git" &&
    git -C source config remote.origin.promisor true &&
    git -C source config remote.origin.partialclonefilter blob:none &&
    git -C source fetch origin &&
    (
        cd source &&
        test_commit three &&
        git -c gc.writeCommitGraph=true gc
    )
'

test_expect_success 'change alternates' '
    echo "$(pwd)/alternates/.git/objects" >source/.git/objects/info/alternates &&
    # this will bring an endless loop fetching
    git -C source fetch origin $oid
'

test_done

------------------------------------------------------

Thanks
-Han Xin

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2022-07-13  1:27 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-14  7:25 An endless loop fetching issue with partial clone, alternates and commit graph Haiyng Tan
2022-06-15  2:18 ` Taylor Blau
2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
2022-06-16  3:38     ` [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph() Han Xin
2022-06-16  3:38     ` [RFC PATCH 2/2] fetch-pack.c: pass "oi_flags" " Han Xin
2022-06-17 21:47     ` [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph Jonathan Tan
2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-20  7:07       ` Patrick Steinhardt
2022-06-20  8:53         ` [External] " 欣韩
2022-06-20  9:05           ` Patrick Steinhardt
2022-06-21 18:23       ` Jonathan Tan
2022-06-22  3:17         ` Han Xin
2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
2022-06-24 16:03           ` Junio C Hamano
2022-06-25  1:35             ` Han Xin
2022-06-27 12:22               ` Junio C Hamano
2022-06-24  5:27         ` [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-24 16:56           ` Junio C Hamano
2022-06-25  2:25             ` Han Xin
2022-06-25  2:31               ` Han Xin
2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
2022-06-28  2:02           ` [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib Han Xin
2022-06-28  2:02           ` [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
2022-06-28 17:36               ` Junio C Hamano
2022-06-30 12:21                 ` Johannes Schindelin
2022-06-30 13:43                   ` Ævar Arnfjörð Bjarmason
2022-06-30 15:40                     ` Junio C Hamano
2022-06-30 18:47                       ` Ævar Arnfjörð Bjarmason
2022-07-01 19:31                       ` Johannes Schindelin
2022-07-01 20:47                         ` Junio C Hamano
2022-06-29  2:08               ` Han Xin
2022-06-30 17:37           ` test name conflict + js/ci-github-workflow-markup regression (was: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()) Ævar Arnfjörð Bjarmason
2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
2022-07-09 12:23               ` Michael J Gruber
2022-07-11 15:09                 ` Jeff King
2022-07-11 20:17                   ` Junio C Hamano
2022-07-12  1:52                     ` [External] " Han Xin
2022-07-12  5:23                       ` Junio C Hamano
2022-07-12  5:32                         ` Han Xin
2022-07-12  6:37                         ` [External] " Jeff King
2022-07-12 14:19                           ` Junio C Hamano
2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
2022-07-12  6:50               ` [PATCH v5 1/1] commit-graph.c: " Han Xin
2022-07-12  9:50                 ` Ævar Arnfjörð Bjarmason
2022-07-13  1:26                   ` Han Xin
2022-07-12  6:58               ` [PATCH v5 0/1] " Jeff King
2022-07-12  8:01             ` [PATCH v1] t5330: remove run_with_limited_processses() Han Xin
  -- strict thread matches above, loose matches on Subject: below --
2022-06-12 16:17 An endless loop fetching issue with partial clone, alternates and commit graph Han Xin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).