git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Re: An endless loop fetching issue with partial clone, alternates and commit graph
@ 2022-06-14  7:25 Haiyng Tan
  2022-06-15  2:18 ` Taylor Blau
  0 siblings, 1 reply; 50+ messages in thread
From: Haiyng Tan @ 2022-06-14  7:25 UTC (permalink / raw)
  To: git, chiyutianyi; +Cc: ps, jonathantanmy

On Mon, 13 Jun 2022 00:17:07 +0800, Han Xin wrote:
> We found an issue that could create an endless loop where alternates
> objects are used improperly.
>
> While do fetching in a partial cloned repository with a commit graph,
> deref_without_lazy_fetch_extended() will call lookup_commit_in_graph()
> to find the commit object. We can found the code in commit-graph.c:
>
>      struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
>      {
>           …
>           if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>                return NULL;
>           if (!repo_has_object_file(repo, id))
>                return NULL;

> If we found the object in the commit graph, but missing it in the repository,
> we will go into an endless loop:
>      git fetch -> deref_without_lazy_fetch_extended() ->
>           lookup_commit_in_graph() -> repo_has_object_file() ->
>                promisor_remote_get_direct() -> fetch_objects() ->
>                     git fetch
>
> I know that the reason for this issue is due to improper use of
> alternates, we can ensure that objects will not be lost by maintaining
> all the references. But shouldn't we do something about this unusual
> usage, it will cause a fetch bombardment of the remote git service.
>
> We can reproduce this issue with the following test case, it will
> generate a lot of git processes, please be careful to stop it.
> ———————————————————————————
> #!/bin/sh
>
> test_description='test for an endless loop fetching’
>
> GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> . ./test-lib.sh
>
> test_expect_success 'setup’ ‘
>     git init --bare dest.git &&
>     test_commit one &&
>    git checkout -b testbranch &&
>    test_commit two &&
>    git push dest.git --all
> '
>
> test_expect_success 'prepare a alternates repository without testbranch' '
>    git clone -b $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME dest.git alternates &&
>    oid=$(git -C alternates rev-parse refs/remotes/origin/testbranch) &&
>    git -C alternates update-ref -d refs/remotes/origin/testbranch &&
>    git -C alternates gc --prune=now
> '
>
> test_expect_success 'prepare a repository with commit-graph' '
>    git init source &&
>    echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
>    git -C source remote add origin "$(pwd)/dest.git" &&
>    git -C source config remote.origin.promisor true &&
>    git -C source config remote.origin.partialclonefilter blob:none &&
>    git -C source fetch origin &&
>    (
>        cd source &&
>        test_commit three &&
>        git -c gc.writeCommitGraph=true gc
>    )
> '
>
> test_expect_success 'change alternates' '
>    echo "$(pwd)/alternates/.git/objects" >source/.git/objects/info/alternates &&
>    # this will bring an endless loop fetching
>    git -C source fetch origin $oid
> '
>
> test_done
>
> ------------------------------------------------------
>
> Thanks
> -Han Xin

I think it's caused by using lazy-fetch in deref_without_lazy_fetch_extended().
In lookup_commit_in_graph(), lazy-fetch is initiated by
repo_has_object_file() used.
has_object() should be used, it's no-lazy-fetch.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: An endless loop fetching issue with partial clone, alternates and commit graph
  2022-06-14  7:25 An endless loop fetching issue with partial clone, alternates and commit graph Haiyng Tan
@ 2022-06-15  2:18 ` Taylor Blau
  2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Taylor Blau @ 2022-06-15  2:18 UTC (permalink / raw)
  To: Haiyng Tan; +Cc: git, chiyutianyi, ps, jonathantanmy, Derrick Stolee

[+cc Stolee]

On Tue, Jun 14, 2022 at 03:25:13PM +0800, Haiyng Tan wrote:
> I think it's caused by using lazy-fetch in
> deref_without_lazy_fetch_extended().  In lookup_commit_in_graph(),
> lazy-fetch is initiated by repo_has_object_file() used.  has_object()
> should be used, it's no-lazy-fetch.

Hmm. Are there cases where lookup_commit_in_graph() is expected to
lazily fetch missing objects from promisor remotes? If so, then this
wouldn't quite work. If not, then this seems like an appropriate fix to
me.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph
  2022-06-15  2:18 ` Taylor Blau
@ 2022-06-16  3:38   ` Han Xin
  2022-06-16  3:38     ` [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph() Han Xin
                       ` (3 more replies)
  0 siblings, 4 replies; 50+ messages in thread
From: Han Xin @ 2022-06-16  3:38 UTC (permalink / raw)
  To: me; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, ps,
	Han Xin

On Wed, Jun 15, 2022 at 10:18 AM Taylor Blau <me@ttaylorr.com> wrote:
>
> [+cc Stolee]
>
> On Tue, Jun 14, 2022 at 03:25:13PM +0800, Haiyng Tan wrote:
> > I think it's caused by using lazy-fetch in
> > deref_without_lazy_fetch_extended().  In lookup_commit_in_graph(),
> > lazy-fetch is initiated by repo_has_object_file() used.  has_object()
> > should be used, it's no-lazy-fetch.
>
> Hmm. Are there cases where lookup_commit_in_graph() is expected to
> lazily fetch missing objects from promisor remotes? If so, then this
> wouldn't quite work. If not, then this seems like an appropriate fix to
> me.
>
> Thanks,
> Taylor

We can see the use of has_object() in RelNotes/2.29.0.txt[1]:
   * A new helper function has_object() has been introduced to make it
     easier to mark object existence checks that do and don't want to
     trigger lazy fetches, and a few such checks are converted using it.

Let's see the difference between has_object() and repo_has_object_file():
    int has_object(struct repository *r, const struct object_id *oid,
            unsigned flags)
    {
        int quick = !(flags & HAS_OBJECT_RECHECK_PACKED);
        unsigned object_info_flags = OBJECT_INFO_SKIP_FETCH_OBJECT |
            (quick ? OBJECT_INFO_QUICK : 0);

        if (!startup_info->have_repository)
            return 0;
        return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
    }

    int repo_has_object_file_with_flags(struct repository *r,
                        const struct object_id *oid, int flags)
    {
        if (!startup_info->have_repository)
            return 0;
        return oid_object_info_extended(r, oid, NULL, flags) >= 0;
    }

    int repo_has_object_file(struct repository *r,
                const struct object_id *oid)
    {
        return repo_has_object_file_with_flags(r, oid, 0);
    }

Now we kown that has_object() add OBJECT_INFO_SKIP_FETCH_OBJECT to skip
fetch object.

I found that Ævar Arnfjörð Bjarmason added deref_without_lazy_fetch()
4 weeks ago[2]:
    static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
                            int mark_tags_complete)
    {
        enum object_type type;
        unsigned flags = OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK;
        return deref_without_lazy_fetch_extended(oid, mark_tags_complete,
                            &type, flags);
    }

But oi_flags is only used by oid_object_info_extended() and is missed by
lookup_commit_in_graph():
    static struct commit *deref_without_lazy_fetch_extended(const struct object_id *oid,
                                int mark_tags_complete,
                                enum object_type *type,
                                unsigned int oi_flags)
    {
        struct object_info info = { .typep = type };
        struct commit *commit;

        commit = lookup_commit_in_graph(the_repository, oid);
        if (commit)
            return commit;

        while (1) {
            if (oid_object_info_extended(the_repository, oid, &info,
                            oi_flags))

So, an appropriate fix can be that let lookup_commit_in_graph() pickup
oi_flags and pass it to oid_object_info_extended(), then the fetching
loop will be prevent by the given flag OBJECT_INFO_SKIP_FETCH_OBJECT.

1. https://github.com/git/git/blob/master/Documentation/RelNotes/2.29.0.txt
2. https://lore.kernel.org/git/2a563b5f18cc9c42cb71a9547344a5435f6bc058.1652731865.git.gitgitgadget@gmail.com/

Thanks
-Han Xin

Han Xin (2):
  commit-graph.c: add "flags" to lookup_commit_in_graph()
  fetch-pack.c: pass "oi_flags" to lookup_commit_in_graph()

 builtin/fetch.c                    |  4 ++-
 commit-graph.c                     |  5 ++--
 commit-graph.h                     |  3 +-
 fetch-pack.c                       | 10 +++----
 revision.c                         |  2 +-
 t/t5583-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++++++++++
 upload-pack.c                      |  5 ++--
 7 files changed, 64 insertions(+), 12 deletions(-)
 create mode 100644 t/t5583-fetch-with-commit-graph.sh

-- 
2.36.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph()
  2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
@ 2022-06-16  3:38     ` Han Xin
  2022-06-16  3:38     ` [RFC PATCH 2/2] fetch-pack.c: pass "oi_flags" " Han Xin
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-16  3:38 UTC (permalink / raw)
  To: me; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, ps,
	Han Xin

When try to do deref_without_lazy_fetch_extended(), "oi_flags" will
be missed by lookup_commit_in_graph(), then repo_has_object_file()
may start a new round objects fetching.
So let's add "flags" to lookup_commit_in_graph() and use
repo_has_object_file_with_flags() to pass the flags.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 builtin/fetch.c | 4 +++-
 commit-graph.c  | 5 +++--
 commit-graph.h  | 3 ++-
 fetch-pack.c    | 4 ++--
 revision.c      | 2 +-
 upload-pack.c   | 5 +++--
 6 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ac29c2b1ae..44285d5318 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1179,7 +1179,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 				 * annotated tags.
 				 */
 				if (!starts_with(rm->name, "refs/tags/"))
-					commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
+					commit = lookup_commit_in_graph(
+						the_repository, &rm->old_oid,
+						0);
 				if (!commit) {
 					commit = lookup_commit_reference_gently(the_repository,
 										&rm->old_oid,
diff --git a/commit-graph.c b/commit-graph.c
index 92d4503336..b09f454bb5 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -889,7 +889,8 @@ static int find_commit_pos_in_graph(struct commit *item, struct commit_graph *g,
 	}
 }
 
-struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
+struct commit *lookup_commit_in_graph(struct repository *repo,
+				      const struct object_id *id, int flags)
 {
 	struct commit *commit;
 	uint32_t pos;
@@ -898,7 +899,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!repo_has_object_file_with_flags(repo, id, flags))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/commit-graph.h b/commit-graph.h
index 2e3ac35237..747a67c0ee 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -46,7 +46,8 @@ int parse_commit_in_graph(struct repository *r, struct commit *item);
  * that we don't return commits whose object has been pruned. Otherwise, this
  * function returns `NULL`.
  */
-struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id);
+struct commit *lookup_commit_in_graph(struct repository *repo,
+				      const struct object_id *id, int flags);
 
 /*
  * It is possible that we loaded commit contents from the commit buffer,
diff --git a/fetch-pack.c b/fetch-pack.c
index cb6647d657..4a62fb182e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -123,7 +123,7 @@ static struct commit *deref_without_lazy_fetch_extended(const struct object_id *
 	struct object_info info = { .typep = type };
 	struct commit *commit;
 
-	commit = lookup_commit_in_graph(the_repository, oid);
+	commit = lookup_commit_in_graph(the_repository, oid, 0);
 	if (commit)
 		return commit;
 
@@ -714,7 +714,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	for (ref = *refs; ref; ref = ref->next) {
 		struct commit *commit;
 
-		commit = lookup_commit_in_graph(the_repository, &ref->old_oid);
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid, 0);
 		if (!commit) {
 			struct object *o;
 
diff --git a/revision.c b/revision.c
index 211352795c..df5db51f98 100644
--- a/revision.c
+++ b/revision.c
@@ -379,7 +379,7 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 	 * look up the object ID in those graphs. Like this, we can avoid
 	 * parsing commit data from disk.
 	 */
-	commit = lookup_commit_in_graph(revs->repo, oid);
+	commit = lookup_commit_in_graph(revs->repo, oid, 0);
 	if (commit)
 		object = &commit->object;
 	else
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b3606..0fa9c3cf3f 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1407,7 +1407,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 			die("git upload-pack: protocol error, "
 			    "expected to get oid, not '%s'", line);
 
-		commit = lookup_commit_in_graph(the_repository, &oid);
+		commit = lookup_commit_in_graph(the_repository, &oid, 0);
 		if (commit)
 			o = &commit->object;
 		else
@@ -1455,7 +1455,8 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		item->util = oiddup(&oid);
 
 		if (!starts_with(refname_nons, "refs/tags/")) {
-			struct commit *commit = lookup_commit_in_graph(the_repository, &oid);
+			struct commit *commit =
+				lookup_commit_in_graph(the_repository, &oid, 0);
 			if (commit)
 				o = &commit->object;
 		}
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [RFC PATCH 2/2] fetch-pack.c: pass "oi_flags" to lookup_commit_in_graph()
  2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
  2022-06-16  3:38     ` [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph() Han Xin
@ 2022-06-16  3:38     ` Han Xin
  2022-06-17 21:47     ` [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph Jonathan Tan
  2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
  3 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-16  3:38 UTC (permalink / raw)
  To: me; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, ps,
	Han Xin

As the custom "oi_flags" is missed by lookup_commit_in_graph(), we will
get another lazy fetch round if we found the commit in commit graph but
miss it in the local object repository.

We can see the issue via[1].

1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 fetch-pack.c                       | 10 +++----
 t/t5583-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 5 deletions(-)
 create mode 100644 t/t5583-fetch-with-commit-graph.sh

diff --git a/fetch-pack.c b/fetch-pack.c
index 4a62fb182e..ca1234e456 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -123,7 +123,7 @@ static struct commit *deref_without_lazy_fetch_extended(const struct object_id *
 	struct object_info info = { .typep = type };
 	struct commit *commit;
 
-	commit = lookup_commit_in_graph(the_repository, oid, 0);
+	commit = lookup_commit_in_graph(the_repository, oid, oi_flags);
 	if (commit)
 		return commit;
 
@@ -704,6 +704,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	struct ref *ref;
 	int old_save_commit_buffer = save_commit_buffer;
 	timestamp_t cutoff = 0;
+	int oi_flags = OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK;
 
 	if (args->refetch)
 		return;
@@ -714,13 +715,12 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	for (ref = *refs; ref; ref = ref->next) {
 		struct commit *commit;
 
-		commit = lookup_commit_in_graph(the_repository, &ref->old_oid, 0);
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid,
+						oi_flags);
 		if (!commit) {
 			struct object *o;
 
-			if (!has_object_file_with_flags(&ref->old_oid,
-						OBJECT_INFO_QUICK |
-						OBJECT_INFO_SKIP_FETCH_OBJECT))
+			if (!has_object_file_with_flags(&ref->old_oid, oi_flags))
 				continue;
 			o = parse_object(the_repository, &ref->old_oid);
 			if (!o || o->type != OBJ_COMMIT)
diff --git a/t/t5583-fetch-with-commit-graph.sh b/t/t5583-fetch-with-commit-graph.sh
new file mode 100644
index 0000000000..cb2beafa8d
--- /dev/null
+++ b/t/t5583-fetch-with-commit-graph.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='test for fetching missing object with a full commit-graph'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	git init --bare dest.git &&
+	test_commit one &&
+	git checkout -b testbranch &&
+	test_commit two &&
+	git push dest.git --all
+'
+
+test_expect_success 'prepare a alternates repository without testbranch' '
+	git clone -b $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME dest.git alternates &&
+	oid=$(git -C alternates rev-parse refs/remotes/origin/testbranch) &&
+	git -C alternates update-ref -d refs/remotes/origin/testbranch &&
+	git -C alternates gc --prune=now
+'
+
+test_expect_success 'prepare a repository with a full commit-graph' '
+	git init source &&
+	echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
+	git -C source remote add origin "$(pwd)/dest.git" &&
+	git -C source config remote.origin.promisor true &&
+	git -C source config remote.origin.partialclonefilter blob:none &&
+	git -C source fetch origin &&
+	(
+		cd source &&
+		test_commit three &&
+		git -c gc.writeCommitGraph=true gc
+	)
+'
+
+test_expect_success 'change the alternates to that without commit two' '
+	echo "$(pwd)/alternates/.git/objects" >source/.git/objects/info/alternates
+'
+
+test_expect_success 'fetch the missing object' '
+	git -C source fetch origin $oid
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph
  2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
  2022-06-16  3:38     ` [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph() Han Xin
  2022-06-16  3:38     ` [RFC PATCH 2/2] fetch-pack.c: pass "oi_flags" " Han Xin
@ 2022-06-17 21:47     ` Jonathan Tan
  2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
  3 siblings, 0 replies; 50+ messages in thread
From: Jonathan Tan @ 2022-06-17 21:47 UTC (permalink / raw)
  To: Han Xin; +Cc: Jonathan Tan, me, chiyutianyi, derrickstolee, git, haiyangtand,
	ps

Han Xin <hanxin.hx@bytedance.com> writes:
> On Wed, Jun 15, 2022 at 10:18 AM Taylor Blau <me@ttaylorr.com> wrote:
> >
> > [+cc Stolee]
> >
> > On Tue, Jun 14, 2022 at 03:25:13PM +0800, Haiyng Tan wrote:
> > > I think it's caused by using lazy-fetch in
> > > deref_without_lazy_fetch_extended().  In lookup_commit_in_graph(),
> > > lazy-fetch is initiated by repo_has_object_file() used.  has_object()
> > > should be used, it's no-lazy-fetch.
> >
> > Hmm. Are there cases where lookup_commit_in_graph() is expected to
> > lazily fetch missing objects from promisor remotes? If so, then this
> > wouldn't quite work. If not, then this seems like an appropriate fix to
> > me.
> >
> > Thanks,
> > Taylor

I think that if a commit is in the commit graph, we would expect the
commit to also be present. So changing to has_object() makes sense.

> We can see the use of has_object() in RelNotes/2.29.0.txt[1]:
>    * A new helper function has_object() has been introduced to make it
>      easier to mark object existence checks that do and don't want to
>      trigger lazy fetches, and a few such checks are converted using it.

Also relevant is the comment on repo_has_object_file() in
object-store.h.

> So, an appropriate fix can be that let lookup_commit_in_graph() pickup
> oi_flags and pass it to oid_object_info_extended(), then the fetching
> loop will be prevent by the given flag OBJECT_INFO_SKIP_FETCH_OBJECT.

Hmm...why not change it to has_object_file() instead, as Haiyng Tan
mentioned?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
                       ` (2 preceding siblings ...)
  2022-06-17 21:47     ` [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph Jonathan Tan
@ 2022-06-18  3:01     ` Han Xin
  2022-06-20  7:07       ` Patrick Steinhardt
                         ` (2 more replies)
  3 siblings, 3 replies; 50+ messages in thread
From: Han Xin @ 2022-06-18  3:01 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

If a commit is in the commit graph, we would expect the commit to also
be present. So we should use has_object() instead of
repo_has_object_file(), which will help us avoid getting into an endless
loop of lazy fetch.

We can see the endless loop issue via this[1].

1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 50 ++++++++++++++++++++++
 2 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

diff --git a/commit-graph.c b/commit-graph.c
index 2b52818731..2dd9bcc7ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..ea5940b9f1
--- /dev/null
+++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,50 @@
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	git init --bare dest.git &&
+	test_commit one &&
+	git checkout -b tmp &&
+	test_commit two &&
+	git push dest.git --all
+'
+
+test_expect_success 'prepare a alternates repository without commit two' '
+	git clone --bare dest.git alternates &&
+	oid=$(git -C alternates rev-parse refs/heads/tmp) &&
+	git -C alternates update-ref -d refs/heads/tmp &&
+	git -C alternates gc --prune=now &&
+	pack=$(echo alternates/objects/pack/*.pack) &&
+	git verify-pack -v "$pack" >have &&
+	! grep "$oid" have
+'
+
+test_expect_success 'prepare a repository with a commit-graph contains commit two' '
+	git init source &&
+	echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
+	git -C source remote add origin "$(pwd)/dest.git" &&
+	git -C source config remote.origin.promisor true &&
+	git -C source config remote.origin.partialclonefilter blob:none &&
+	# the source repository has the whole refs contains refs/heads/tmp
+	git -C source fetch origin &&
+	(
+		cd source &&
+		test_commit three &&
+		git -c gc.writeCommitGraph=true gc
+	)
+'
+
+test_expect_success 'change the alternates of source to that without commit two' '
+	# now we have a commit-graph in the source repository but without the commit two
+	echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
+'
+
+test_expect_success 'fetch the missing commit' '
+	git -C source fetch origin $oid 2>fetch.out &&
+	grep "$oid" fetch.out
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-06-20  7:07       ` Patrick Steinhardt
  2022-06-20  8:53         ` [External] " 欣韩
  2022-06-21 18:23       ` Jonathan Tan
  2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
  2 siblings, 1 reply; 50+ messages in thread
From: Patrick Steinhardt @ 2022-06-20  7:07 UTC (permalink / raw)
  To: Han Xin; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me

[-- Attachment #1: Type: text/plain, Size: 3810 bytes --]

On Sat, Jun 18, 2022 at 11:01:30AM +0800, Han Xin wrote:
> If a commit is in the commit graph, we would expect the commit to also
> be present. So we should use has_object() instead of
> repo_has_object_file(), which will help us avoid getting into an endless
> loop of lazy fetch.
> 
> We can see the endless loop issue via this[1].
> 
> 1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/
> 
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>

Thanks a lot for working on this issue!

> ---
>  commit-graph.c                             |  2 +-
>  t/t5329-no-lazy-fetch-with-commit-graph.sh | 50 ++++++++++++++++++++++
>  2 files changed, 51 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 2b52818731..2dd9bcc7ea 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
>  		return NULL;
>  	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>  		return NULL;
> -	if (!repo_has_object_file(repo, id))
> +	if (!has_object(repo, id, 0))
>  		return NULL;

Agreed, this change makes sense to me.

>  	commit = lookup_commit(repo, id);
> diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..ea5940b9f1
> --- /dev/null
> +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> @@ -0,0 +1,50 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '

Nit: I find it a bit confusing that you only call the first test case
`setup`, while all the other tests except for the last one are also only
setting up the necessary state.

> +	git init --bare dest.git &&
> +	test_commit one &&
> +	git checkout -b tmp &&
> +	test_commit two &&
> +	git push dest.git --all
> +'
> +
> +test_expect_success 'prepare a alternates repository without commit two' '
> 
> +	git clone --bare dest.git alternates &&
> 
> +	oid=$(git -C alternates rev-parse refs/heads/tmp) &&
> +	git -C alternates update-ref -d refs/heads/tmp &&
> +	git -C alternates gc --prune=now &&
> +	pack=$(echo alternates/objects/pack/*.pack) &&
> +	git verify-pack -v "$pack" >have &&
> +	! grep "$oid" have
> +'

Instead of going into the low-level details you could just verify that
`git cat-file -e $oid` returns `1`.

> +
> +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> +	git init source &&
> +	echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> +	git -C source remote add origin "$(pwd)/dest.git" &&
> +	git -C source config remote.origin.promisor true &&
> +	git -C source config remote.origin.partialclonefilter blob:none &&
> +	# the source repository has the whole refs contains refs/heads/tmp
> +	git -C source fetch origin &&
> +	(
> +		cd source &&
> +		test_commit three &&
> +		git -c gc.writeCommitGraph=true gc
> +	)
> +'
> +
> +test_expect_success 'change the alternates of source to that without commit two' '
> +	# now we have a commit-graph in the source repository but without the commit two
> +	echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> +'
> +
> +test_expect_success 'fetch the missing commit' '
> +	git -C source fetch origin $oid 2>fetch.out &&
> +	grep "$oid" fetch.out
> +'

This test passes even without your fix, albeit a lot slower compared
to with it. Can we somehow cause it to fail reliably so that the test
becomes effective in catching a regression here?

> +test_done
> -- 
> 2.36.1
> 

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-20  7:07       ` Patrick Steinhardt
@ 2022-06-20  8:53         ` 欣韩
  2022-06-20  9:05           ` Patrick Steinhardt
  0 siblings, 1 reply; 50+ messages in thread
From: 欣韩 @ 2022-06-20  8:53 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me

On Mon, Jun 20, 2022 at 3:34 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Sat, Jun 18, 2022 at 11:01:30AM +0800, Han Xin wrote:
> > If a commit is in the commit graph, we would expect the commit to also
> > be present. So we should use has_object() instead of
> > repo_has_object_file(), which will help us avoid getting into an endless
> > loop of lazy fetch.
> >
> > We can see the endless loop issue via this[1].
> >
> > 1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/
> >
> > Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
>
> Thanks a lot for working on this issue!
>
> > ---
> >  commit-graph.c                             |  2 +-
> >  t/t5329-no-lazy-fetch-with-commit-graph.sh | 50 ++++++++++++++++++++++
> >  2 files changed, 51 insertions(+), 1 deletion(-)
> >  create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 2b52818731..2dd9bcc7ea 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
> >               return NULL;
> >       if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
> >               return NULL;
> > -     if (!repo_has_object_file(repo, id))
> > +     if (!has_object(repo, id, 0))
> >               return NULL;
>
> Agreed, this change makes sense to me.
>
> >       commit = lookup_commit(repo, id);
> > diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> > new file mode 100755
> > index 0000000000..ea5940b9f1
> > --- /dev/null
> > +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> > @@ -0,0 +1,50 @@
> > +#!/bin/sh
> > +
> > +test_description='test for no lazy fetch with the commit-graph'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
>
> Nit: I find it a bit confusing that you only call the first test case
> `setup`, while all the other tests except for the last one are also only
> setting up the necessary state.

Nod.
There is indeed some misleading information here.
I will adjust it in the next patch.

>
> > +     git init --bare dest.git &&
> > +     test_commit one &&
> > +     git checkout -b tmp &&
> > +     test_commit two &&
> > +     git push dest.git --all
> > +'
> > +
> > +test_expect_success 'prepare a alternates repository without commit two' '
> >
> > +     git clone --bare dest.git alternates &&
> >
> > +     oid=$(git -C alternates rev-parse refs/heads/tmp) &&
> > +     git -C alternates update-ref -d refs/heads/tmp &&
> > +     git -C alternates gc --prune=now &&
> > +     pack=$(echo alternates/objects/pack/*.pack) &&
> > +     git verify-pack -v "$pack" >have &&
> > +     ! grep "$oid" have
> > +'
>
> Instead of going into the low-level details you could just verify that
> `git cat-file -e $oid` returns `1`.
>

Nod.

> > +
> > +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> > +     git init source &&
> > +     echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> > +     git -C source remote add origin "$(pwd)/dest.git" &&
> > +     git -C source config remote.origin.promisor true &&
> > +     git -C source config remote.origin.partialclonefilter blob:none &&
> > +     # the source repository has the whole refs contains refs/heads/tmp
> > +     git -C source fetch origin &&
> > +     (
> > +             cd source &&
> > +             test_commit three &&
> > +             git -c gc.writeCommitGraph=true gc
> > +     )
> > +'
> > +
> > +test_expect_success 'change the alternates of source to that without commit two' '
> > +     # now we have a commit-graph in the source repository but without the commit two
> > +     echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> > +'
> > +
> > +test_expect_success 'fetch the missing commit' '
> > +     git -C source fetch origin $oid 2>fetch.out &&
> > +     grep "$oid" fetch.out
> > +'
>
> This test passes even without your fix, albeit a lot slower compared
> to with it. Can we somehow cause it to fail reliably so that the test
> becomes effective in catching a regression here?
>

Could you help me find the reason why this testcase passes even
without the fix.

From the execution of Github Action, it seems that the problem always exist:
https://github.com/chiyutianyi/git/actions/runs/2527421443.

Thanks.
-Han Xin

> > +test_done
> > --
> > 2.36.1
> >
>
> Patrick

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-20  8:53         ` [External] " 欣韩
@ 2022-06-20  9:05           ` Patrick Steinhardt
  0 siblings, 0 replies; 50+ messages in thread
From: Patrick Steinhardt @ 2022-06-20  9:05 UTC (permalink / raw)
  To: 欣韩
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me

[-- Attachment #1: Type: text/plain, Size: 2157 bytes --]

On Mon, Jun 20, 2022 at 04:53:47PM +0800, 欣韩 wrote:
> On Mon, Jun 20, 2022 at 3:34 PM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Sat, Jun 18, 2022 at 11:01:30AM +0800, Han Xin wrote:
[snip]
> > > +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> > > +     git init source &&
> > > +     echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> > > +     git -C source remote add origin "$(pwd)/dest.git" &&
> > > +     git -C source config remote.origin.promisor true &&
> > > +     git -C source config remote.origin.partialclonefilter blob:none &&
> > > +     # the source repository has the whole refs contains refs/heads/tmp
> > > +     git -C source fetch origin &&
> > > +     (
> > > +             cd source &&
> > > +             test_commit three &&
> > > +             git -c gc.writeCommitGraph=true gc
> > > +     )
> > > +'
> > > +
> > > +test_expect_success 'change the alternates of source to that without commit two' '
> > > +     # now we have a commit-graph in the source repository but without the commit two
> > > +     echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> > > +'
> > > +
> > > +test_expect_success 'fetch the missing commit' '
> > > +     git -C source fetch origin $oid 2>fetch.out &&
> > > +     grep "$oid" fetch.out
> > > +'
> >
> > This test passes even without your fix, albeit a lot slower compared
> > to with it. Can we somehow cause it to fail reliably so that the test
> > becomes effective in catching a regression here?
> >
> 
> Could you help me find the reason why this testcase passes even
> without the fix.
> 
> From the execution of Github Action, it seems that the problem always exist:
> https://github.com/chiyutianyi/git/actions/runs/2527421443.
> 
> Thanks.
> -Han Xin

Hard to say, I'm not sure either. One thing I noticed though is that in
your CI run there's failure in e.g. linux-gcc, but the test run for
linux-musl succeeds. Personally I'm using musl libc on my system, as
well, so maybe it's a discrepancy between musl- and glibc-based systems?

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
  2022-06-20  7:07       ` Patrick Steinhardt
@ 2022-06-21 18:23       ` Jonathan Tan
  2022-06-22  3:17         ` Han Xin
  2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
  2 siblings, 1 reply; 50+ messages in thread
From: Jonathan Tan @ 2022-06-21 18:23 UTC (permalink / raw)
  To: Han Xin; +Cc: Jonathan Tan, chiyutianyi, derrickstolee, git, haiyangtand, me,
	ps

Han Xin <hanxin.hx@bytedance.com> writes:
> If a commit is in the commit graph, we would expect the commit to also
> be present. So we should use has_object() instead of
> repo_has_object_file(), which will help us avoid getting into an endless
> loop of lazy fetch.
> 
> We can see the endless loop issue via this[1].
> 
> 1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/

As described in SubmittingPatches:

  Try to make sure your explanation can be understood   
  without external resources. Instead of giving a URL to a mailing list
  archive, summarize the relevant points of the discussion.

> +test_expect_success 'setup' '
> +	git init --bare dest.git &&
> +	test_commit one &&
> +	git checkout -b tmp &&
> +	test_commit two &&
> +	git push dest.git --all
> +'

You can commit directly to the repo by using "test_commit -C dest.git".
Also, can the repositories be better named? I see a "dest.git" (which
seems to contain all the objects), "alternates" (which seems to contain
everything except refs/heads/tmp), "source" (which only contains the
commit graph), and the current directory. It would probably be better to
name them e.g. "with-commit", "without-commit", "only-commit-graph", and
omit using the current directory altogether.

> +test_expect_success 'prepare a alternates repository without commit two' '
> +	git clone --bare dest.git alternates &&
> +	oid=$(git -C alternates rev-parse refs/heads/tmp) &&
> +	git -C alternates update-ref -d refs/heads/tmp &&
> +	git -C alternates gc --prune=now &&
> +	pack=$(echo alternates/objects/pack/*.pack) &&
> +	git verify-pack -v "$pack" >have &&
> +	! grep "$oid" have
> +'

OK, except refs/heads/tmp could probably have a better name.

> +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> +	git init source &&
> +	echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> +	git -C source remote add origin "$(pwd)/dest.git" &&
> +	git -C source config remote.origin.promisor true &&
> +	git -C source config remote.origin.partialclonefilter blob:none &&
> +	# the source repository has the whole refs contains refs/heads/tmp
> +	git -C source fetch origin &&
> +	(
> +		cd source &&
> +		test_commit three &&
> +		git -c gc.writeCommitGraph=true gc
> +	)
> +'

Is the purpose of the fetch only to add a ref? If yes, it's clearer just
to create that branch instead of fetching.

> +test_expect_success 'change the alternates of source to that without commit two' '
> +	# now we have a commit-graph in the source repository but without the commit two
> +	echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> +'

OK.

> +test_expect_success 'fetch the missing commit' '
> +	git -C source fetch origin $oid 2>fetch.out &&
> +	grep "$oid" fetch.out
> +'

Is the bug triggered by fetching the missing commit or by fetching any
commit (which triggers the usage of the commit graph)? If any commit,
then it's clearer to create an arbitrary commit and then fetch it.

Also, I thought that the issue was an infinite loop, and the thing being
tested here looks different from that. If you want to ensure that
nothing is being fetched, you can use GIT_TRACE="$(pwd)/trace" to
observe a fetch-pack command being invoked or
GIT_TRACE_PACKET="$(pwd)/trace" to observe the packet being sent. If
you're worried about an infinite loop, you can set origin to a directory
that does not exist (so that the fetch immediately fails).

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-21 18:23       ` Jonathan Tan
@ 2022-06-22  3:17         ` Han Xin
  0 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-22  3:17 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Taylor Blau,
	Patrick Steinhardt

On Wed, Jun 22, 2022 at 2:23 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Han Xin <hanxin.hx@bytedance.com> writes:
> > If a commit is in the commit graph, we would expect the commit to also
> > be present. So we should use has_object() instead of
> > repo_has_object_file(), which will help us avoid getting into an endless
> > loop of lazy fetch.
> >
> > We can see the endless loop issue via this[1].
> >
> > 1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@gmail.com/
>
> As described in SubmittingPatches:
>
>   Try to make sure your explanation can be understood
>   without external resources. Instead of giving a URL to a mailing list
>   archive, summarize the relevant points of the discussion.
>

Nod.

> > +test_expect_success 'setup' '
> > +     git init --bare dest.git &&
> > +     test_commit one &&
> > +     git checkout -b tmp &&
> > +     test_commit two &&
> > +     git push dest.git --all
> > +'
>
> You can commit directly to the repo by using "test_commit -C dest.git".
> Also, can the repositories be better named? I see a "dest.git" (which
> seems to contain all the objects), "alternates" (which seems to contain
> everything except refs/heads/tmp), "source" (which only contains the
> commit graph), and the current directory. It would probably be better to
> name them e.g. "with-commit", "without-commit", "only-commit-graph", and
> omit using the current directory altogether.

Yes, it makes sense to me.

>
> > +test_expect_success 'prepare a alternates repository without commit two' '
> > +     git clone --bare dest.git alternates &&
> > +     oid=$(git -C alternates rev-parse refs/heads/tmp) &&
> > +     git -C alternates update-ref -d refs/heads/tmp &&
> > +     git -C alternates gc --prune=now &&
> > +     pack=$(echo alternates/objects/pack/*.pack) &&
> > +     git verify-pack -v "$pack" >have &&
> > +     ! grep "$oid" have
> > +'
>
> OK, except refs/heads/tmp could probably have a better name.

I'll rethink the naming here.

>
> > +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> > +     git init source &&
> > +     echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> > +     git -C source remote add origin "$(pwd)/dest.git" &&
> > +     git -C source config remote.origin.promisor true &&
> > +     git -C source config remote.origin.partialclonefilter blob:none &&
> > +     # the source repository has the whole refs contains refs/heads/tmp
> > +     git -C source fetch origin &&
> > +     (
> > +             cd source &&
> > +             test_commit three &&
> > +             git -c gc.writeCommitGraph=true gc
> > +     )
> > +'
>
> Is the purpose of the fetch only to add a ref? If yes, it's clearer just
> to create that branch instead of fetching.

Nod.

>
> > +test_expect_success 'change the alternates of source to that without commit two' '
> > +     # now we have a commit-graph in the source repository but without the commit two
> > +     echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> > +'
>
> OK.
>
> > +test_expect_success 'fetch the missing commit' '
> > +     git -C source fetch origin $oid 2>fetch.out &&
> > +     grep "$oid" fetch.out
> > +'
>
> Is the bug triggered by fetching the missing commit or by fetching any
> commit (which triggers the usage of the commit graph)? If any commit,
> then it's clearer to create an arbitrary commit and then fetch it.
>

Yes, using the missing object in the commit-graph seems to be a little
misleading.

> Also, I thought that the issue was an infinite loop, and the thing being
> tested here looks different from that. If you want to ensure that
> nothing is being fetched, you can use GIT_TRACE="$(pwd)/trace" to
> observe a fetch-pack command being invoked or
> GIT_TRACE_PACKET="$(pwd)/trace" to observe the packet being sent. If
> you're worried about an infinite loop, you can set origin to a directory
> that does not exist (so that the fetch immediately fails).

Maybe we can use "ulimit" ?

Then the test case can be:

    test_expect_success 'fetch the missing commit once' '
        ulimit -u 512 &&
        GIT_TRACE="$(pwd)/trace" git -C source fetch origin $oid 2>err &&
        ! grep "error: cannot fork" err &&
        test $(grep "fetch origin" trace | wc -l) -eq 1
     '

Without this fix, "git fetch" would finally succeed because we didn't
check the return value of promise_remote_get_direct(). We can find
the err output like this:

    error: cannot fork() for -c: Resource temporarily unavailable
    fatal: promisor-remote: unable to fork off fetch subprocess

And we can see a lot of trace logs as follows:

    trace: run_command: git -c fetch.negotiationAlgorithm=noop fetch
origin --no-tags --no-write-fetch-head --recurse-submodules=no
--filter=blob:none --stdin
    trace: built-in: git fetch origin --no-tags --no-write-fetch-head
--recurse-submodules=no --filter=blob:none --stdin

Thanks.
-Han Xin

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v2 0/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
  2022-06-20  7:07       ` Patrick Steinhardt
  2022-06-21 18:23       ` Jonathan Tan
@ 2022-06-24  5:27       ` Han Xin
  2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
                           ` (2 more replies)
  2 siblings, 3 replies; 50+ messages in thread
From: Han Xin @ 2022-06-24  5:27 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

This patch fixes the following issue:
When we found the commit in the graph in lookup_commit_in_graph(), but
the commit is missing from the repository, we will try
promisor_remote_get_direct() and then enter another loop.

Then we will go into an endless loop:
  git fetch -> deref_without_lazy_fetch() ->
    lookup_commit_in_graph() -> repo_has_object_file() ->
      promisor_remote_get_direct() -> fetch_objects() ->
        git fetch (a new loop round)

Changes since v1:

* add run_with_limited_processses() to test-lib so that we can use it to
  limit forking subprocesses. As we didn't check the return value of
  promise_remote_get_direct(), "git fetch" would finally succeed due to:

    error: cannot fork() for -c: Resource temporarily unavailable
    fatal: promisor-remote: unable to fork off fetch subprocess

* Rename test repositories, reference name and use GIT_TRACE to observe
  the fetch process.

Han Xin (2):
  test-lib.sh: add limited processes to test-lib
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()

 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
 t/test-lib.sh                              |  9 +++++
 3 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

Range-diff against v1:
1:  ebc14bfd5e < -:  ---------- commit-graph.c: no lazy fetch in lookup_commit_in_graph()
-:  ---------- > 1:  442a4c351d test-lib.sh: add limited processes to test-lib
-:  ---------- > 2:  d3a99a5c5a commit-graph.c: no lazy fetch in lookup_commit_in_graph()
-- 
2.36.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib
  2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
@ 2022-06-24  5:27         ` Han Xin
  2022-06-24 16:03           ` Junio C Hamano
  2022-06-24  5:27         ` [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
  2 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-06-24  5:27 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

We will use the lazy prerequisite ULIMIT_PROCESSES in a follow-up
commit.

With run_with_limited_processses() we can limit forking subprocesses and
fail reliably in some test cases.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 t/test-lib.sh | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 8ba5ca1534..f920e3b0ae 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1816,6 +1816,15 @@ test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
 	run_with_limited_open_files true
 '
 
+run_with_limited_processses () {
+	(ulimit -u 512 && "$@")
+}
+
+test_lazy_prereq ULIMIT_PROCESSES '
+	test_have_prereq !HPPA,!MINGW,!CYGWIN &&
+	run_with_limited_processses true
+'
+
 build_option () {
 	git version --build-options |
 	sed -ne "s/^$1: //p"
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
  2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
@ 2022-06-24  5:27         ` Han Xin
  2022-06-24 16:56           ` Junio C Hamano
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
  2 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-06-24  5:27 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

If a commit is in the commit graph, we would expect the commit to also
be present. So we should use has_object() instead of
repo_has_object_file(), which will help us avoid getting into an endless
loop of lazy fetch.

When we found the commit in the graph in lookup_commit_in_graph(), but
the commit is missing from the repository, we will try
promisor_remote_get_direct() and then enter another loop. While
sometimes it will finally succeed because it cannot fork subprocess,
it has exhausted the local process resources and can be harmful to the
remote service.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

diff --git a/commit-graph.c b/commit-graph.c
index 2b52818731..2dd9bcc7ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..4d25d2c950
--- /dev/null
+++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+test_expect_success 'setup: prepare a repository with a commit' '
+	git init with-commit &&
+	test_commit -C with-commit the-commit &&
+	oid=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
+	git init with-commit-graph &&
+	echo "$(pwd)/with-commit/.git/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	# create a ref that points to the commit in alternates
+	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
+	# prepare some other objects to commit-graph
+	test_commit -C with-commit-graph somthing &&
+	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
+	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
+'
+
+test_expect_success 'setup: change the alternates to what without the commit' '
+	git init --bare without-commit &&
+	echo "$(pwd)/without-commit/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	test_must_fail git -C with-commit-graph cat-file -e $oid
+'
+
+test_expect_success 'setup: prepare another commit to fetch' '
+	test_commit -C with-commit another-commit &&
+	anycommit=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
+	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
+	git -C with-commit-graph config remote.origin.promisor true &&
+	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
+	GIT_TRACE="$(pwd)/trace" run_with_limited_processses \
+		git -C with-commit-graph fetch origin $anycommit 2>err &&
+	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+	test $(grep "fetch origin" trace | wc -l) -eq 1
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib
  2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
@ 2022-06-24 16:03           ` Junio C Hamano
  2022-06-25  1:35             ` Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-06-24 16:03 UTC (permalink / raw)
  To: Han Xin; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

Han Xin <hanxin.hx@bytedance.com> writes:

> We will use the lazy prerequisite ULIMIT_PROCESSES in a follow-up
> commit.
>
> With run_with_limited_processses() we can limit forking subprocesses and
> fail reliably in some test cases.
>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  t/test-lib.sh | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 8ba5ca1534..f920e3b0ae 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1816,6 +1816,15 @@ test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
>  	run_with_limited_open_files true
>  '
>  
> +run_with_limited_processses () {
> +	(ulimit -u 512 && "$@")

The "-u" presumably is a way to say that the current user can have
only 512 processes at once that is supported by bash and ksh?  dash
seems to use "-p" for this but "-p" of course means something
completely different to other shells (and is read-only), which is a
mess X-<.

I suspect that it is OK to make it practically bash-only, but then ...

> +}
> +
> +test_lazy_prereq ULIMIT_PROCESSES '
> +	test_have_prereq !HPPA,!MINGW,!CYGWIN &&
> +	run_with_limited_processses true

... as this lazy-prereq makes a trial run that would fail when the
system does not allow "ulimit -u 512", do we need the platform
specific prereq check?  I am wondering if the second line alone is
sufficient.

Also, 512 is not a number I would exactly call "limit forking".
Does it have to be so high, I wonder.  Of course it cannot be so low
like 3 or 8 or even 32, as per-user limitation counts your window
manager and shells running in other windows.

What you ideally want is an option that lets you limit the number of
processes the shell that issued the ulimit call can spawn
simultaneously, but I didn't find it in "man bash/dash/ksh".

> +'
> +
>  build_option () {
>  	git version --build-options |
>  	sed -ne "s/^$1: //p"

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-24  5:27         ` [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-06-24 16:56           ` Junio C Hamano
  2022-06-25  2:25             ` Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-06-24 16:56 UTC (permalink / raw)
  To: Han Xin; +Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	ps

Han Xin <hanxin.hx@bytedance.com> writes:

> If a commit is in the commit graph, we would expect the commit to also
> be present.

Hmph, is that a fundamental requirement, or is that a limitation of
the current implementation?  Naïvely, I do not quite see why we
cannot first partially clone from a remote, access objects that
locally do not exist and lazily fetch them from the promissor, and
then build a reachability graph.  I expect that the resulting commit
graph records the lazily fetched objects at that point.  And then we
should be able to "lose" the lazily fetched objects that we know we
can fetch from the promissor again when we need them in the future.
And we would be in a situation where commits are pruned away, not
locally available in our object store, but can be (re)fetched from
the promisor, no?

> So we should use has_object() instead of
> repo_has_object_file(), which will help us avoid getting into an endless
> loop of lazy fetch.

It all depends on the reason we call lookup_commit_in_graph(), I
think.  Is there an easy way to remember the fact that we are
checking if object X is here with repo_has_object_file(X), so that
an on-demand fetch that happens when X does not locally exist would
not bother checking with lookup_commit_in_graph()?  IOW, temporarily
disable the use of commit-graph when we are lazily fetching?

> When we found the commit in the graph in lookup_commit_in_graph(),
> but the commit is missing from the repository, we will try
> promisor_remote_get_direct() and then enter another loop.  While
> sometimes it will finally succeed because it cannot fork
> subprocess,

Is that a mode of "succeed"-ing?  Or merely a way to exit an endless
loop that does not make any progress with a failure?

> it has exhausted the local process resources and can be harmful to the
> remote service.
>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---

I think the single-liner change in the patch is a good one, but I am
having a hard time to agree with the reasoning above that explains
why it is a good change.

Here is an attempt to express a reasoning I can understand, can
agree with, and (I think) better describes why the change is a good
one.  Does my understanding of the problem and the solution totally
misses the mark?

	The commit-graph is used to opportunistically optimize
	accesses to certain pieces of information on commit objects,
	and lookup_commit_in_graph() tries to say "no" when the
	requested commit does not locally exist by returning NULL,
	in which case the caller can ask for (which may result in
	on-demand fetching from a promisor remote) and parse the
	commit object itself.

	However, it uses a wrong helper, repo_has_object_file(), to
	do so.  This helper not only checks if an object is
	immediately available in the local object store, but also
	tries to fetch from a promisor remote.  But the fetch
	machinery calls lookup_commit_in_graph(), thus causing an
	infinite loop.

	We should make lookup_commit_in_graph() expect that a commit
	given to it can be legitimately missing from the local
	object store, by using the has_object_file() helper instead.
	
> diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..4d25d2c950
> --- /dev/null
> +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh

Hmph, does this short-test need a completely new file?

> @@ -0,0 +1,47 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +	git init with-commit &&
> +	test_commit -C with-commit the-commit &&
> +	oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +	git init with-commit-graph &&
> +	echo "$(pwd)/with-commit/.git/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	# create a ref that points to the commit in alternates
> +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +	# prepare some other objects to commit-graph
> +	test_commit -C with-commit-graph somthing &&

somthing? something?

> +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +	git init --bare without-commit &&
> +	echo "$(pwd)/without-commit/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&

Doesn't this deliberately _corrupt_ the with-commit-graph repository
that depended on the object whose name is $oid in with-commit
repository?  Do we require a corrupt repository to trigger the "bug"?

> +	test_must_fail git -C with-commit-graph cat-file -e $oid
> +'
> +
> +test_expect_success 'setup: prepare another commit to fetch' '
> +	test_commit -C with-commit another-commit &&
> +	anycommit=$(git -C with-commit rev-parse HEAD)

anycommit?  another_commit?  Be consistent in naming.

> +'
> +
> +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '

So we did all of the above set-up sequences only to skip the most
interesting test, if we were testing with "dash"?  I suspect that it
may be cleaner to put the prerequisite to the whole file with the
"early test_done" trick like t0051 and t3008.

> +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +	git -C with-commit-graph config remote.origin.promisor true &&
> +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +	GIT_TRACE="$(pwd)/trace" run_with_limited_processses \
> +		git -C with-commit-graph fetch origin $anycommit 2>err &&
> +	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> +	test $(grep "fetch origin" trace | wc -l) -eq 1
> +'
> +
> +test_done

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib
  2022-06-24 16:03           ` Junio C Hamano
@ 2022-06-25  1:35             ` Han Xin
  2022-06-27 12:22               ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-06-25  1:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Taylor Blau, Patrick Steinhardt

On Sat, Jun 25, 2022 at 12:03 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Han Xin <hanxin.hx@bytedance.com> writes:
>
> > We will use the lazy prerequisite ULIMIT_PROCESSES in a follow-up
> > commit.
> >
> > With run_with_limited_processses() we can limit forking subprocesses and
> > fail reliably in some test cases.
> >
> > Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> > ---
> >  t/test-lib.sh | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > index 8ba5ca1534..f920e3b0ae 100644
> > --- a/t/test-lib.sh
> > +++ b/t/test-lib.sh
> > @@ -1816,6 +1816,15 @@ test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
> >       run_with_limited_open_files true
> >  '
> >
> > +run_with_limited_processses () {
> > +     (ulimit -u 512 && "$@")
>
> The "-u" presumably is a way to say that the current user can have
> only 512 processes at once that is supported by bash and ksh?  dash
> seems to use "-p" for this but "-p" of course means something
> completely different to other shells (and is read-only), which is a
> mess X-<.
>
> I suspect that it is OK to make it practically bash-only, but then ...
>
> > +}
> > +
> > +test_lazy_prereq ULIMIT_PROCESSES '
> > +     test_have_prereq !HPPA,!MINGW,!CYGWIN &&
> > +     run_with_limited_processses true
>
> ... as this lazy-prereq makes a trial run that would fail when the
> system does not allow "ulimit -u 512", do we need the platform
> specific prereq check?  I am wondering if the second line alone is
> sufficient.
>

Yes,the second line alone is sufficient.

> Also, 512 is not a number I would exactly call "limit forking".
> Does it have to be so high, I wonder.  Of course it cannot be so low
> like 3 or 8 or even 32, as per-user limitation counts your window
> manager and shells running in other windows.
>

It's hard to say.
I've tried adjusting it to 256, but the test cases in next patch will always
fail with the following "err":

    ./test-lib.sh: fork: Resource temporarily unavailable

> What you ideally want is an option that lets you limit the number of
> processes the shell that issued the ulimit call can spawn
> simultaneously, but I didn't find it in "man bash/dash/ksh".
>

Maybe I should use "lib-bash.sh" instead of "test-lib.sh" just like t9902
and t9903?
The different meanings of "-p" in bash and dash really make this tricky.

Thanks.
-Han Xin

> > +'
> > +
> >  build_option () {
> >       git version --build-options |
> >       sed -ne "s/^$1: //p"

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-24 16:56           ` Junio C Hamano
@ 2022-06-25  2:25             ` Han Xin
  2022-06-25  2:31               ` Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-06-25  2:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Taylor Blau, Patrick Steinhardt

On Sat, Jun 25, 2022 at 12:56 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Han Xin <hanxin.hx@bytedance.com> writes:
>
> > If a commit is in the commit graph, we would expect the commit to also
> > be present.
>
> > When we found the commit in the graph in lookup_commit_in_graph(),
> > but the commit is missing from the repository, we will try
> > promisor_remote_get_direct() and then enter another loop.  While
> > sometimes it will finally succeed because it cannot fork
> > subprocess,
>
> Is that a mode of "succeed"-ing?  Or merely a way to exit an endless
> loop that does not make any progress with a failure?

For the user, "fetch-pack" does succeed, because in
deref_without_lazy_fetch(), even if lookup_commit_in_graph() fails to
lazy fetch the lost commit, the following oid_object_info_extended()
will help us complete the previous work.

In a sense, this infinite loop is based on the fact that infinite processes
can be created.

However, your attempt to express the reasoning bellow is clearer.

>
> > it has exhausted the local process resources and can be harmful to the
> > remote service.
> >
> > Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> > ---
>
> I think the single-liner change in the patch is a good one, but I am
> having a hard time to agree with the reasoning above that explains
> why it is a good change.
>
> Here is an attempt to express a reasoning I can understand, can
> agree with, and (I think) better describes why the change is a good
> one.  Does my understanding of the problem and the solution totally
> misses the mark?
>
>         The commit-graph is used to opportunistically optimize
>         accesses to certain pieces of information on commit objects,
>         and lookup_commit_in_graph() tries to say "no" when the
>         requested commit does not locally exist by returning NULL,
>         in which case the caller can ask for (which may result in
>         on-demand fetching from a promisor remote) and parse the
>         commit object itself.
>
>         However, it uses a wrong helper, repo_has_object_file(), to
>         do so.  This helper not only checks if an object is
>         immediately available in the local object store, but also
>         tries to fetch from a promisor remote.  But the fetch
>         machinery calls lookup_commit_in_graph(), thus causing an
>         infinite loop.
>
>         We should make lookup_commit_in_graph() expect that a commit
>         given to it can be legitimately missing from the local
>         object store, by using the has_object_file() helper instead.
>
> > diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> > new file mode 100755
> > index 0000000000..4d25d2c950
> > --- /dev/null
> > +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
>
> Hmph, does this short-test need a completely new file?
>
> > @@ -0,0 +1,47 @@
> > +#!/bin/sh
> > +
> > +test_description='test for no lazy fetch with the commit-graph'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup: prepare a repository with a commit' '
> > +     git init with-commit &&
> > +     test_commit -C with-commit the-commit &&
> > +     oid=$(git -C with-commit rev-parse HEAD)
> > +'
> > +
> > +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> > +     git init with-commit-graph &&
> > +     echo "$(pwd)/with-commit/.git/objects" \
> > +             >with-commit-graph/.git/objects/info/alternates &&
> > +     # create a ref that points to the commit in alternates
> > +     git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> > +     # prepare some other objects to commit-graph
> > +     test_commit -C with-commit-graph somthing &&
>
> somthing? something?

Nod.

>
> > +     git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> > +     test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> > +'
> > +
> > +test_expect_success 'setup: change the alternates to what without the commit' '
> > +     git init --bare without-commit &&
> > +     echo "$(pwd)/without-commit/objects" \
> > +             >with-commit-graph/.git/objects/info/alternates &&
>
> Doesn't this deliberately _corrupt_ the with-commit-graph repository
> that depended on the object whose name is $oid in with-commit
> repository?  Do we require a corrupt repository to trigger the "bug"?
>

The "bug" depends on the commit exist in the commit-graph but
missing in the repository.

I didn't find a better way to make this kind of scene.

This bug was first found when alternates and commit-graph were
both used. Since the promise did not maintain all the references,
I suspect that the "auto gc" during the update process of the promise
caused the loss of the unreachable commits in the promise.

> > +     test_must_fail git -C with-commit-graph cat-file -e $oid
> > +'
> > +
> > +test_expect_success 'setup: prepare another commit to fetch' '
> > +     test_commit -C with-commit another-commit &&
> > +     anycommit=$(git -C with-commit rev-parse HEAD)
>
> anycommit?  another_commit?  Be consistent in naming.
>

Nod.

> > +'
> > +
> > +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
>
> So we did all of the above set-up sequences only to skip the most
> interesting test, if we were testing with "dash"?  I suspect that it
> may be cleaner to put the prerequisite to the whole file with the
> "early test_done" trick like t0051 and t3008.
>

It make sense to me.

Thanks.
-Han Xin

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-25  2:25             ` Han Xin
@ 2022-06-25  2:31               ` Han Xin
  0 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-25  2:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Taylor Blau, Patrick Steinhardt

On Sat, Jun 25, 2022 at 10:25 AM Han Xin <hanxin.hx@bytedance.com> wrote:
>
> The "bug" depends on the commit exist in the commit-graph but
> missing in the repository.
>
> I didn't find a better way to make this kind of scene.
>
> This bug was first found when alternates and commit-graph were
> both used. Since the promise did not maintain all the references,
> I suspect that the "auto gc" during the update process of the promise
> caused the loss of the unreachable commits in the promise.
>

Some mistakes here:
This bug was first found when alternates and commit-graph were
both used. Since the promise did not maintain all the references,
I suspect that the "auto gc" during the update of the alternates
caused the loss of the unreachable commits.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib
  2022-06-25  1:35             ` Han Xin
@ 2022-06-27 12:22               ` Junio C Hamano
  0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-06-27 12:22 UTC (permalink / raw)
  To: Han Xin
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Taylor Blau, Patrick Steinhardt

Han Xin <hanxin.hx@bytedance.com> writes:

> Maybe I should use "lib-bash.sh" instead of "test-lib.sh" just like t9902
> and t9903?
> The different meanings of "-p" in bash and dash really make this tricky.

I do not think lib-bash.sh is appropriate for this.

t9902/9903 are about command line prompt and completion support
_FOR_ bash users.  By including lib-bash.sh, even if your "usual"
shell is not bash, you can run these two scripts under bash, as long
as you have it installed.

For the purpose of testing these bash specific features, that
framework makes quite a lot of sense.  Those who are happy to have
dash on their system without having to install bash would have no
reason to see these two tests to pass, as they do not care about
bash at all.

What the test under discussion is doing is quite different.  Instead
of forcing to re-spawn bash when the user's shell is not bash, you'd
want to adjust how you invoke "ulimit" if it is not bash, something
like

run_with_limited_processes () {
	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
}

test_lazy_prereq ULIMIT_PROCESSES '
	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
	if test -n "$BASH_VERSION"
	then
		ulimit_max_process="-u"
	elif test -n "$KSH_VERSION"
	then
		ulimit_max_process="-u"
	fi
        run_with_limited_processes true
'

perhaps?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()
  2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
  2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
  2022-06-24  5:27         ` [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-06-28  2:02         ` Han Xin
  2022-06-28  2:02           ` [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib Han Xin
                             ` (3 more replies)
  2 siblings, 4 replies; 50+ messages in thread
From: Han Xin @ 2022-06-28  2:02 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

This patch fixes the following issue:
When we found the commit in the graph in lookup_commit_in_graph(), but
the commit is missing from the repository, we will try
promisor_remote_get_direct() and then enter another loop.

Then we will go into an endless loop:
  git fetch -> deref_without_lazy_fetch() ->
    lookup_commit_in_graph() -> repo_has_object_file() ->
      promisor_remote_get_direct() -> fetch_objects() ->
        git fetch (a new loop round)

Changes since v2:

* Remove test_have_prereq() from ULIMIT_PROCESSES as
  "run_with_limited_processses true" is enough.

* Teach run_with_limited_processses() to support dash and zsh.

* Skip the whole test file if ulimit is not avaliable.

* Minor grammar/comment etc. fixes throughout.

Han Xin (2):
  test-lib.sh: add limited processes to test-lib
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()

 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++
 t/test-lib.sh                              | 16 +++++++
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

Range-diff against v2:
1:  442a4c351d ! 1:  ad0a539759 test-lib.sh: add limited processes to test-lib
    @@ t/test-lib.sh: test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
      '
      
     +run_with_limited_processses () {
    -+	(ulimit -u 512 && "$@")
    ++	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
    ++	if test -n "$BASH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	elif test -n "$KSH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	fi
    ++	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
     +}
     +
     +test_lazy_prereq ULIMIT_PROCESSES '
    -+	test_have_prereq !HPPA,!MINGW,!CYGWIN &&
     +	run_with_limited_processses true
     +'
     +
2:  a7d456db9b ! 2:  3cdb1abd43 commit-graph.c: no lazy fetch in lookup_commit_in_graph()
    @@ Metadata
      ## Commit message ##
         commit-graph.c: no lazy fetch in lookup_commit_in_graph()
     
    -    If a commit is in the commit graph, we would expect the commit to also
    -    be present. So we should use has_object() instead of
    -    repo_has_object_file(), which will help us avoid getting into an endless
    -    loop of lazy fetch.
    +    The commit-graph is used to opportunistically optimize accesses to
    +    certain pieces of information on commit objects, and
    +    lookup_commit_in_graph() tries to say "no" when the requested commit
    +    does not locally exist by returning NULL, in which case the caller
    +    can ask for (which may result in on-demand fetching from a promisor
    +    remote) and parse the commit object itself.
     
    -    When we found the commit in the graph in lookup_commit_in_graph(), but
    -    the commit is missing from the repository, we will try
    -    promisor_remote_get_direct() and then enter another loop. While
    -    sometimes it will finally succeed because it cannot fork subprocess,
    -    it has exhausted the local process resources and can be harmful to the
    -    remote service.
    +    However, it uses a wrong helper, repo_has_object_file(), to do so.
    +    This helper not only checks if an object is mmediately available in
    +    the local object store, but also tries to fetch from a promisor remote.
    +    But the fetch machinery calls lookup_commit_in_graph(), thus causing an
    +    infinite loop.
    +
    +    We should make lookup_commit_in_graph() expect that a commit given to it
    +    can be legitimately missing from the local object store, by using the
    +    has_object_file() helper instead.
     
         Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
     
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +
     +. ./test-lib.sh
     +
    ++if ! test_have_prereq ULIMIT_PROCESSES
    ++then
    ++	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
    ++	test_done
    ++fi
    ++
     +test_expect_success 'setup: prepare a repository with a commit' '
     +	git init with-commit &&
     +	test_commit -C with-commit the-commit &&
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +	# create a ref that points to the commit in alternates
     +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
     +	# prepare some other objects to commit-graph
    -+	test_commit -C with-commit-graph somthing &&
    ++	test_commit -C with-commit-graph something &&
     +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
     +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
     +'
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +	test_must_fail git -C with-commit-graph cat-file -e $oid
     +'
     +
    -+test_expect_success 'setup: prepare another commit to fetch' '
    -+	test_commit -C with-commit another-commit &&
    ++test_expect_success 'setup: prepare any commit to fetch' '
    ++	test_commit -C with-commit any-commit &&
     +	anycommit=$(git -C with-commit rev-parse HEAD)
     +'
     +
    -+test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
    ++test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
     +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
     +	git -C with-commit-graph config remote.origin.promisor true &&
     +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
-- 
2.36.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
@ 2022-06-28  2:02           ` Han Xin
  2022-06-28  2:02           ` [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-28  2:02 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

We will use the lazy prerequisite ULIMIT_PROCESSES in a follow-up
commit.

With run_with_limited_processses() we can limit forking subprocesses and
fail reliably in some test cases.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 t/test-lib.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 8ba5ca1534..655d6d543f 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1816,6 +1816,22 @@ test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
 	run_with_limited_open_files true
 '
 
+run_with_limited_processses () {
+	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
+	if test -n "$BASH_VERSION"
+	then
+		ulimit_max_process="-u"
+	elif test -n "$KSH_VERSION"
+	then
+		ulimit_max_process="-u"
+	fi
+	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
+}
+
+test_lazy_prereq ULIMIT_PROCESSES '
+	run_with_limited_processses true
+'
+
 build_option () {
 	git version --build-options |
 	sed -ne "s/^$1: //p"
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
  2022-06-28  2:02           ` [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib Han Xin
@ 2022-06-28  2:02           ` Han Xin
  2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
  2022-06-30 17:37           ` test name conflict + js/ci-github-workflow-markup regression (was: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()) Ævar Arnfjörð Bjarmason
  2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
  3 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-06-28  2:02 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

The commit-graph is used to opportunistically optimize accesses to
certain pieces of information on commit objects, and
lookup_commit_in_graph() tries to say "no" when the requested commit
does not locally exist by returning NULL, in which case the caller
can ask for (which may result in on-demand fetching from a promisor
remote) and parse the commit object itself.

However, it uses a wrong helper, repo_has_object_file(), to do so.
This helper not only checks if an object is mmediately available in
the local object store, but also tries to fetch from a promisor remote.
But the fetch machinery calls lookup_commit_in_graph(), thus causing an
infinite loop.

We should make lookup_commit_in_graph() expect that a commit given to it
can be legitimately missing from the local object store, by using the
has_object_file() helper instead.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

diff --git a/commit-graph.c b/commit-graph.c
index 2b52818731..2dd9bcc7ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..d7877a5758
--- /dev/null
+++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+if ! test_have_prereq ULIMIT_PROCESSES
+then
+	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
+	test_done
+fi
+
+test_expect_success 'setup: prepare a repository with a commit' '
+	git init with-commit &&
+	test_commit -C with-commit the-commit &&
+	oid=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
+	git init with-commit-graph &&
+	echo "$(pwd)/with-commit/.git/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	# create a ref that points to the commit in alternates
+	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
+	# prepare some other objects to commit-graph
+	test_commit -C with-commit-graph something &&
+	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
+	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
+'
+
+test_expect_success 'setup: change the alternates to what without the commit' '
+	git init --bare without-commit &&
+	echo "$(pwd)/without-commit/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	test_must_fail git -C with-commit-graph cat-file -e $oid
+'
+
+test_expect_success 'setup: prepare any commit to fetch' '
+	test_commit -C with-commit any-commit &&
+	anycommit=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
+	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
+	git -C with-commit-graph config remote.origin.promisor true &&
+	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
+	run_with_limited_processses env GIT_TRACE="$(pwd)/trace" \
+		git -C with-commit-graph fetch origin $anycommit 2>err &&
+	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+	test $(grep "fetch origin" trace | wc -l) -eq 1
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-28  2:02           ` [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
  2022-06-28 17:36               ` Junio C Hamano
  2022-06-29  2:08               ` Han Xin
  0 siblings, 2 replies; 50+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-28  7:49 UTC (permalink / raw)
  To: Han Xin
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps


On Tue, Jun 28 2022, Han Xin wrote:

> The commit-graph is used to opportunistically optimize accesses to
> certain pieces of information on commit objects, and
> lookup_commit_in_graph() tries to say "no" when the requested commit
> does not locally exist by returning NULL, in which case the caller
> can ask for (which may result in on-demand fetching from a promisor
> remote) and parse the commit object itself.
>
> However, it uses a wrong helper, repo_has_object_file(), to do so.
> This helper not only checks if an object is mmediately available in
> the local object store, but also tries to fetch from a promisor remote.
> But the fetch machinery calls lookup_commit_in_graph(), thus causing an
> infinite loop.
>
> We should make lookup_commit_in_graph() expect that a commit given to it
> can be legitimately missing from the local object store, by using the
> has_object_file() helper instead.
>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  commit-graph.c                             |  2 +-
>  t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++
>  2 files changed, 54 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 2b52818731..2dd9bcc7ea 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -907,7 +907,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
>  		return NULL;
>  	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>  		return NULL;
> -	if (!repo_has_object_file(repo, id))
> +	if (!has_object(repo, id, 0))
>  		return NULL;
>  
>  	commit = lookup_commit(repo, id);
> diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..d7877a5758
> --- /dev/null
> +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> @@ -0,0 +1,53 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +if ! test_have_prereq ULIMIT_PROCESSES

I think the prereq in 1/2 would be better off squashed into this commit.

> +then
> +	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
> +	test_done
> +fi
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +	git init with-commit &&
> +	test_commit -C with-commit the-commit &&
> +	oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +	git init with-commit-graph &&
> +	echo "$(pwd)/with-commit/.git/objects" \

nit: you can use $PWD instead of $(pwd).

> +		>with-commit-graph/.git/objects/info/alternates &&
> +	# create a ref that points to the commit in alternates
> +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +	# prepare some other objects to commit-graph
> +	test_commit -C with-commit-graph something &&
> +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +	git init --bare without-commit &&

Maybe run a successful cat-file here...
> +	echo "$(pwd)/without-commit/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	test_must_fail git -C with-commit-graph cat-file -e $oid
...to show that it fails after the "echo" above?
> +'
> +
> +test_expect_success 'setup: prepare any commit to fetch' '
> +	test_commit -C with-commit any-commit &&
> +	anycommit=$(git -C with-commit rev-parse HEAD)

I think this would be better just added before a \n\n in the next test.
> +'
> +
> +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
> +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +	git -C with-commit-graph config remote.origin.promisor true &&
> +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +	run_with_limited_processses env GIT_TRACE="$(pwd)/trace" \
> +		git -C with-commit-graph fetch origin $anycommit 2>err &&


> +	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> +	test $(grep "fetch origin" trace | wc -l) -eq 1


Use "grep", not "test_i18ngrep", and this should use "test_line_count".

But actually better yet: this whole thing looks like it could use
"test_subcommand" instead, couldn't it?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
@ 2022-06-28 17:36               ` Junio C Hamano
  2022-06-30 12:21                 ` Johannes Schindelin
  2022-06-29  2:08               ` Han Xin
  1 sibling, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-06-28 17:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han Xin, chiyutianyi, derrickstolee, git, haiyangtand,
	jonathantanmy, me, ps

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> +test_description='test for no lazy fetch with the commit-graph'
>> +
>> +. ./test-lib.sh
>> +
>> +if ! test_have_prereq ULIMIT_PROCESSES
>
> I think the prereq in 1/2 would be better off squashed into this commit.

Good thinking.  It also may make sense to implement it in this file,
without touching test-lib.sh at all.

>> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
>> +	git init with-commit-graph &&
>> +	echo "$(pwd)/with-commit/.git/objects" \
>> +		>with-commit-graph/.git/objects/info/alternates &&
>
> nit: you can use $PWD instead of $(pwd).

We can, and it would not make any difference on non-Windows.  

But which one should we use to cater to Windows?  $(pwd) is a full
path in Windows notation "C:\Program Files\Git\..." while $PWD is
MSYS style "/C/Program Files/Git/..." or something like that, IIRC?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
  2022-06-28 17:36               ` Junio C Hamano
@ 2022-06-29  2:08               ` Han Xin
  1 sibling, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-06-29  2:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Taylor Blau, Junio C Hamano, Patrick Steinhardt

On Tue, Jun 28, 2022 at 3:53 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> > +     test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> > +     test $(grep "fetch origin" trace | wc -l) -eq 1
>
>
> Use "grep", not "test_i18ngrep", and this should use "test_line_count".
>
> But actually better yet: this whole thing looks like it could use
> "test_subcommand" instead, couldn't it?

When using test_subcommand() we should give all the args,
if we remove or add any args later, this test case will always
pass even without this fix. So, is this test case still strict?

    run_with_limited_processses env GIT_TRACE2_EVENT="$(PWD)/trace.txt" \
        git -C with-commit-graph fetch origin $anycommit &&
    test_subcommand ! git -c fetch.negotiationAlgorithm=noop \
        fetch origin --no-tags --no-write-fetch-head \
        --recurse-submodules=no --filter=blob:none \
        --stdin <trace.txt

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-28 17:36               ` Junio C Hamano
@ 2022-06-30 12:21                 ` Johannes Schindelin
  2022-06-30 13:43                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 50+ messages in thread
From: Johannes Schindelin @ 2022-06-30 12:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Han Xin, chiyutianyi,
	derrickstolee, git, haiyangtand, jonathantanmy, me, ps

[-- Attachment #1: Type: text/plain, Size: 927 bytes --]

Hi Junio,

On Tue, 28 Jun 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> >> +	git init with-commit-graph &&
> >> +	echo "$(pwd)/with-commit/.git/objects" \
> >> +		>with-commit-graph/.git/objects/info/alternates &&
> >
> > nit: you can use $PWD instead of $(pwd).
>
> We can, and it would not make any difference on non-Windows.
>
> But which one should we use to cater to Windows?  $(pwd) is a full
> path in Windows notation "C:\Program Files\Git\..." while $PWD is
> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?

Indeed, and since the `alternates` file is supposed to be read by
`git.exe`, a non-MSYS program, the original was good, and the nit
suggested the incorrect form.

Thank you for catching this before it was worsimproved,
Dscho

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-30 12:21                 ` Johannes Schindelin
@ 2022-06-30 13:43                   ` Ævar Arnfjörð Bjarmason
  2022-06-30 15:40                     ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-30 13:43 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Han Xin, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps


On Thu, Jun 30 2022, Johannes Schindelin wrote:

> Hi Junio,
>
> On Tue, 28 Jun 2022, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
>> >> +	git init with-commit-graph &&
>> >> +	echo "$(pwd)/with-commit/.git/objects" \
>> >> +		>with-commit-graph/.git/objects/info/alternates &&
>> >
>> > nit: you can use $PWD instead of $(pwd).
>>
>> We can, and it would not make any difference on non-Windows.
>>
>> But which one should we use to cater to Windows?  $(pwd) is a full
>> path in Windows notation "C:\Program Files\Git\..." while $PWD is
>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
>
> Indeed, and since the `alternates` file is supposed to be read by
> `git.exe`, a non-MSYS program, the original was good, and the nit
> suggested the incorrect form.

I looked at t5615-alternate-env.sh which does the equivalent of:

	GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \
        	git cat-file [...]

We run that test on all our platforms, does the $PWD form work in the
environment variable, but not when we write it to the "alternates" file?
Or is there some other subtlety there that I'm missing?


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-30 13:43                   ` Ævar Arnfjörð Bjarmason
@ 2022-06-30 15:40                     ` Junio C Hamano
  2022-06-30 18:47                       ` Ævar Arnfjörð Bjarmason
  2022-07-01 19:31                       ` Johannes Schindelin
  0 siblings, 2 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-06-30 15:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Han Xin, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Jun 30 2022, Johannes Schindelin wrote:
>
>> Hi Junio,
>>
>> On Tue, 28 Jun 2022, Junio C Hamano wrote:
>>
>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>
>>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
>>> >> +	git init with-commit-graph &&
>>> >> +	echo "$(pwd)/with-commit/.git/objects" \
>>> >> +		>with-commit-graph/.git/objects/info/alternates &&
>>> >
>>> > nit: you can use $PWD instead of $(pwd).
>>>
>>> We can, and it would not make any difference on non-Windows.
>>>
>>> But which one should we use to cater to Windows?  $(pwd) is a full
>>> path in Windows notation "C:\Program Files\Git\..." while $PWD is
>>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
>>
>> Indeed, and since the `alternates` file is supposed to be read by
>> `git.exe`, a non-MSYS program, the original was good, and the nit
>> suggested the incorrect form.
>
> I looked at t5615-alternate-env.sh which does the equivalent of:
>
> 	GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \
>         	git cat-file [...]
>
> We run that test on all our platforms, does the $PWD form work in the
> environment variable, but not when we write it to the "alternates" file?
> Or is there some other subtlety there that I'm missing?

I am also curious to see a clear and concise explanation so that we
do not have to repeat this discussion later.  We have

 - When a test checks for an absolute path that a git command generated,
   construct the expected value using $(pwd) rather than $PWD,
   $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on
   Windows, where the shell (MSYS bash) mangles absolute path names.
   For details, see the commit message of 4114156ae9.

in t/README, but even with the log mesasge of 4114156a (Tests on
Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*],
I have no idea what makes the thing you found in t5615 work and your
suggestion to use $PWD in the new one not work.

GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a
colon) separated list, and I think the way t5615 uses it is broken
on Windows where PATH_SEP is defined as semicolon without the $PWD
vs $(pwd) issue.  Is the test checking the right thing?


[Footnote]

*1*

    Tests on Windows: $(pwd) must return Windows-style paths

    Many tests pass $(pwd) in some form to git and later test that the output
    of git contains the correct value of $(pwd). For example, the test of
    'git remote show' sets up a remote that contains $(pwd) and then the
    expected result must contain $(pwd).

    Again, MSYS-bash's path mangling kicks in: Plain $(pwd) uses the MSYS style
    absolute path /c/path/to/git. The test case would write this name into
    the 'expect' file. But when git is invoked, MSYS-bash converts this name to
    the Windows style path c:/path/to/git, and git would produce this form in
    the result; the test would fail.

    We fix this by passing -W to bash's pwd that produces the Windows-style
    path.

    There are a two cases that need an accompanying change:

    - In t1504 the value of $(pwd) becomes part of a path list. In this case,
      the lone 'c' in something like /foo:c:/path/to/git:/bar inhibits
      MSYS-bashes path mangling; IOW in this case we want the /c/path/to/git
      form to allow path mangling. We use $PWD instead of $(pwd), which always
      has the latter form.

    - In t6200, $(pwd) - the Windows style path - must be used to construct the
      expected result because that is the path form that git sees. (The change
      in the test itself is just for consistency: 'git fetch' always sees the
      Windows-style path, with or without the change.)

    Signed-off-by: Johannes Sixt <j6t@kdbg.org>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* test name conflict + js/ci-github-workflow-markup regression (was: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph())
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
  2022-06-28  2:02           ` [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib Han Xin
  2022-06-28  2:02           ` [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-06-30 17:37           ` Ævar Arnfjörð Bjarmason
  2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
  3 siblings, 0 replies; 50+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-30 17:37 UTC (permalink / raw)
  To: Han Xin
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps, Johannes Schindelin


On Tue, Jun 28 2022, Han Xin wrote:

>  t/t5329-no-lazy-fetch-with-commit-graph.sh | 53 ++++++++++++++++++++++

This fails "make test" since 5b92477f896 (builtin/gc.c: conditionally
avoid pruning objects via loose, 2022-05-20), i.e. we have another
t/t5329* test now.

Per $subject the CI output for this is now a bit cryptic, it lands you
on a failed run-build-and-test.sh step with:
	
	[...]
	=> Run tests
	cat: 't/test-results/*.exit': No such file or directory
	=== Failed test: * ===
	The full logs are in the 'print test failures' step below.
	See also the 'failed-tests-*' artifacts attached to this run.
	cat: 't/test-results/*.markup': No such file or directory
	Error: Process completed with exit code 1.

The last line of the suggested next step is then:

	Build job failed before the tests could have been run

Going earlier and expanding the "Run tests" step we can see the issue:
	
	duplicate test numbers: t5329
	make[1]: *** [Makefile:86: test-lint-duplicates] Error 1
	make[1]: *** Waiting for unfinished jobs....
	make[1]: Leaving directory '/home/runner/work/git/git/t'
	make: *** [Makefile:3065: test] Error 2
	+ res=2
	+ rm exit.status
	+ end_group
	+ test -n t
	+ set +x

Since the CI topic in $subject we've ran the "print failures" step
separately from the "make" invocation, and therefore have to guess at
why we fail, whereas before we'd get that output from "make" itself.

Johannes, is this something you can fix?

In any case, for this topic the fix is simple: The test needs to be
renamed for a re-roll,

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-30 15:40                     ` Junio C Hamano
@ 2022-06-30 18:47                       ` Ævar Arnfjörð Bjarmason
  2022-07-01 19:31                       ` Johannes Schindelin
  1 sibling, 0 replies; 50+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-30 18:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Han Xin, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps


On Thu, Jun 30 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Jun 30 2022, Johannes Schindelin wrote:
>>
>>> Hi Junio,
>>>
>>> On Tue, 28 Jun 2022, Junio C Hamano wrote:
>>>
>>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>>
>>>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
>>>> >> +	git init with-commit-graph &&
>>>> >> +	echo "$(pwd)/with-commit/.git/objects" \
>>>> >> +		>with-commit-graph/.git/objects/info/alternates &&
>>>> >
>>>> > nit: you can use $PWD instead of $(pwd).
>>>>
>>>> We can, and it would not make any difference on non-Windows.
>>>>
>>>> But which one should we use to cater to Windows?  $(pwd) is a full
>>>> path in Windows notation "C:\Program Files\Git\..." while $PWD is
>>>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
>>>
>>> Indeed, and since the `alternates` file is supposed to be read by
>>> `git.exe`, a non-MSYS program, the original was good, and the nit
>>> suggested the incorrect form.
>>
>> I looked at t5615-alternate-env.sh which does the equivalent of:
>>
>> 	GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \
>>         	git cat-file [...]
>>
>> We run that test on all our platforms, does the $PWD form work in the
>> environment variable, but not when we write it to the "alternates" file?
>> Or is there some other subtlety there that I'm missing?
>
> I am also curious to see a clear and concise explanation so that we
> do not have to repeat this discussion later.  We have
>
>  - When a test checks for an absolute path that a git command generated,
>    construct the expected value using $(pwd) rather than $PWD,
>    $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on
>    Windows, where the shell (MSYS bash) mangles absolute path names.
>    For details, see the commit message of 4114156ae9.
>
> in t/README, but even with the log mesasge of 4114156a (Tests on
> Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*],
> I have no idea what makes the thing you found in t5615 work and your
> suggestion to use $PWD in the new one not work.
>
> GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a
> colon) separated list, and I think the way t5615 uses it is broken
> on Windows where PATH_SEP is defined as semicolon without the $PWD
> vs $(pwd) issue.  Is the test checking the right thing?

Whatever th explanation is CI passed with a $(pwd) -> $PWD repacement in
the test being introduced here:
https://github.com/avar/git/runs/7136686130?check_suite_focus=true

Diff here:
https://github.com/avar/git/commit/606d0060a57b7021396919044c7696489b7835cd

So either $PWD is fine there, or our Windows CI doesn't reflect this
particular caveat on some Windows systems, or the test is erroneously
passing with an invalid value. Knowing which of those it is would be
very useful...

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph()
  2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
                             ` (2 preceding siblings ...)
  2022-06-30 17:37           ` test name conflict + js/ci-github-workflow-markup regression (was: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()) Ævar Arnfjörð Bjarmason
@ 2022-07-01  1:34           ` Han Xin
  2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
                               ` (2 more replies)
  3 siblings, 3 replies; 50+ messages in thread
From: Han Xin @ 2022-07-01  1:34 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

Changes since v3:

* Move run_with_limited_processses() into 
  t5330-no-lazy-fetch-with-commit-graph.sh without touching test-lib.sh.

* Squash "setup: prepare any commit to fetch" into the main body of the
  test.

* Minor grammar/comment etc. fixes throughout.

Han Xin (1):
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()

 commit-graph.c                             |  2 +-
 t/t5330-no-lazy-fetch-with-commit-graph.sh | 70 ++++++++++++++++++++++
 2 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh

Range-diff against v3:
1:  ad0a539759 < -:  ---------- test-lib.sh: add limited processes to test-lib
2:  3cdb1abd43 ! 1:  96d4bb7150 commit-graph.c: no lazy fetch in lookup_commit_in_graph()
    @@ commit-graph.c: struct commit *lookup_commit_in_graph(struct repository *repo, c
      
      	commit = lookup_commit(repo, id);
     
    - ## t/t5329-no-lazy-fetch-with-commit-graph.sh (new) ##
    + ## t/t5330-no-lazy-fetch-with-commit-graph.sh (new) ##
     @@
     +#!/bin/sh
     +
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +
     +. ./test-lib.sh
     +
    ++run_with_limited_processses () {
    ++	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
    ++	if test -n "$BASH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	elif test -n "$KSH_VERSION"
    ++	then
    ++		ulimit_max_process="-u"
    ++	fi
    ++	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
    ++}
    ++
    ++test_lazy_prereq ULIMIT_PROCESSES '
    ++	run_with_limited_processses true
    ++'
    ++
     +if ! test_have_prereq ULIMIT_PROCESSES
     +then
     +	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
    @@ t/t5329-no-lazy-fetch-with-commit-graph.sh (new)
     +
     +test_expect_success 'setup: change the alternates to what without the commit' '
     +	git init --bare without-commit &&
    ++	git -C with-commit-graph cat-file -e $oid &&
     +	echo "$(pwd)/without-commit/objects" \
     +		>with-commit-graph/.git/objects/info/alternates &&
     +	test_must_fail git -C with-commit-graph cat-file -e $oid
     +'
     +
    -+test_expect_success 'setup: prepare any commit to fetch' '
    -+	test_commit -C with-commit any-commit &&
    -+	anycommit=$(git -C with-commit rev-parse HEAD)
    -+'
    -+
     +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
    ++	# setup promisor and prepare any commit to fetch
     +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
     +	git -C with-commit-graph config remote.origin.promisor true &&
     +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
    -+	run_with_limited_processses env GIT_TRACE="$(pwd)/trace" \
    ++	test_commit -C with-commit any-commit &&
    ++	anycommit=$(git -C with-commit rev-parse HEAD) &&
    ++
    ++	run_with_limited_processses env GIT_TRACE="$(pwd)/trace.txt" \
     +		git -C with-commit-graph fetch origin $anycommit 2>err &&
    -+	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
    -+	test $(grep "fetch origin" trace | wc -l) -eq 1
    ++	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
    ++	grep "git fetch origin" trace.txt >actual &&
    ++	test_line_count = 1 actual
     +'
     +
     +test_done
-- 
2.36.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
@ 2022-07-01  1:34             ` Han Xin
  2022-07-09 12:23               ` Michael J Gruber
  2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
  2022-07-12  8:01             ` [PATCH v1] t5330: remove run_with_limited_processses() Han Xin
  2 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-07-01  1:34 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

The commit-graph is used to opportunistically optimize accesses to
certain pieces of information on commit objects, and
lookup_commit_in_graph() tries to say "no" when the requested commit
does not locally exist by returning NULL, in which case the caller
can ask for (which may result in on-demand fetching from a promisor
remote) and parse the commit object itself.

However, it uses a wrong helper, repo_has_object_file(), to do so.
This helper not only checks if an object is mmediately available in
the local object store, but also tries to fetch from a promisor remote.
But the fetch machinery calls lookup_commit_in_graph(), thus causing an
infinite loop.

We should make lookup_commit_in_graph() expect that a commit given to it
can be legitimately missing from the local object store, by using the
has_object_file() helper instead.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5330-no-lazy-fetch-with-commit-graph.sh | 70 ++++++++++++++++++++++
 2 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh

diff --git a/commit-graph.c b/commit-graph.c
index 92d4503336..2b04ef072d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -898,7 +898,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5330-no-lazy-fetch-with-commit-graph.sh b/t/t5330-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..be33334229
--- /dev/null
+++ b/t/t5330-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,70 @@
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+run_with_limited_processses () {
+	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
+	if test -n "$BASH_VERSION"
+	then
+		ulimit_max_process="-u"
+	elif test -n "$KSH_VERSION"
+	then
+		ulimit_max_process="-u"
+	fi
+	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
+}
+
+test_lazy_prereq ULIMIT_PROCESSES '
+	run_with_limited_processses true
+'
+
+if ! test_have_prereq ULIMIT_PROCESSES
+then
+	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
+	test_done
+fi
+
+test_expect_success 'setup: prepare a repository with a commit' '
+	git init with-commit &&
+	test_commit -C with-commit the-commit &&
+	oid=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
+	git init with-commit-graph &&
+	echo "$(pwd)/with-commit/.git/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	# create a ref that points to the commit in alternates
+	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
+	# prepare some other objects to commit-graph
+	test_commit -C with-commit-graph something &&
+	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
+	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
+'
+
+test_expect_success 'setup: change the alternates to what without the commit' '
+	git init --bare without-commit &&
+	git -C with-commit-graph cat-file -e $oid &&
+	echo "$(pwd)/without-commit/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	test_must_fail git -C with-commit-graph cat-file -e $oid
+'
+
+test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
+	# setup promisor and prepare any commit to fetch
+	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
+	git -C with-commit-graph config remote.origin.promisor true &&
+	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
+	test_commit -C with-commit any-commit &&
+	anycommit=$(git -C with-commit rev-parse HEAD) &&
+
+	run_with_limited_processses env GIT_TRACE="$(pwd)/trace.txt" \
+		git -C with-commit-graph fetch origin $anycommit 2>err &&
+	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+	grep "git fetch origin" trace.txt >actual &&
+	test_line_count = 1 actual
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-06-30 15:40                     ` Junio C Hamano
  2022-06-30 18:47                       ` Ævar Arnfjörð Bjarmason
@ 2022-07-01 19:31                       ` Johannes Schindelin
  2022-07-01 20:47                         ` Junio C Hamano
  1 sibling, 1 reply; 50+ messages in thread
From: Johannes Schindelin @ 2022-07-01 19:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Han Xin, chiyutianyi,
	derrickstolee, git, haiyangtand, jonathantanmy, me, ps

[-- Attachment #1: Type: text/plain, Size: 7105 bytes --]

Hi Junio,

On Thu, 30 Jun 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > On Thu, Jun 30 2022, Johannes Schindelin wrote:
> >
> >> On Tue, 28 Jun 2022, Junio C Hamano wrote:
> >>
> >>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> >>>
> >>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> >>> >> +	git init with-commit-graph &&
> >>> >> +	echo "$(pwd)/with-commit/.git/objects" \
> >>> >> +		>with-commit-graph/.git/objects/info/alternates &&
> >>> >
> >>> > nit: you can use $PWD instead of $(pwd).
> >>>
> >>> We can, and it would not make any difference on non-Windows.
> >>>
> >>> But which one should we use to cater to Windows?  $(pwd) is a full
> >>> path in Windows notation "C:\Program Files\Git\..." while $PWD is
> >>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
> >>
> >> Indeed, and since the `alternates` file is supposed to be read by
> >> `git.exe`, a non-MSYS program, the original was good, and the nit
> >> suggested the incorrect form.
> >
> > I looked at t5615-alternate-env.sh which does the equivalent of:
> >
> > 	GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \
> >         	git cat-file [...]
> >
> > We run that test on all our platforms, does the $PWD form work in the
> > environment variable, but not when we write it to the "alternates" file?
> > Or is there some other subtlety there that I'm missing?
>
> I am also curious to see a clear and concise explanation so that we
> do not have to repeat this discussion later.

Unfortunately, I do not see any way to explain this concisely: MSYS2 does
hard-to-explain things here, in the hopes to Do The Right Thing (most of
the time, anyways).

Whenever you call a non-MSYS program from an MSYS program (and remember,
an MSYS program is a program that uses the MSYS2 runtime that acts as a
POSIX emulation layer), "magic" things are done. In our context,
`bash.exe` is an MSYS program, and the non-MSYS program that is called is
`git.exe`.

So what are those "magic" things? The command-line arguments and the
environment variables are auto-converted: everything that looks like a
Unix-style path (or path list, like the `PATH` environment variable) is
converted to a Windows-style path or path list (on Windows, the colon
cannot be the separator in `PATH`, therefore the semicolon is used).

And this is where it gets _really_ tricky to explain what is going on:
what _does_ look like a Unix-style path? The exact rules are convoluted
and hard to explain, but they work _most of the time_. For example,
`/usr/bin:/hello` is converted to `C:\Program Files\Git\usr\bin;C:\Program
Files\Git\hello` or something like that. But `kernel.org:/home/gitster` is
not, because it looks more like an SSH path. Similarly, `C:/Program Files`
is interpreted as a Windows-style path, even if it could technically be a
Unix-style path list.

Now, if you call `git.exe -C /blabla <command>`, it works, because
`git.exe` is a non-MSYS program, therefore that `/blabla` is converted to
a Windows-style path before executing `git.exe`. However, when you write a
file via `echo /blabla >file`, that `echo` is either the Bash built-in, or
it is an MSYS program, and no argument conversion takes place. If you
_then_ ask `git.exe` to read and interpret the file as a path, it won't
know what to do with that Unix-style path.

You can substitute `$PWD` for `/blabla` in all of this, and it will hold
true just the same.

So what makes `pwd` special?

Well, `pwd.exe` itself is an MSYS program, so it would still report a path
that `git.exe` cannot understand. But in Git's test suite, we specifically
override `pwd` to be a shell function that calls `pwd.exe -W`, which does
output Windows-style paths.

The thing that makes that `GIT_*=$PWD git ...` call work is that the
environment is automagically converted because `git` is a non-MSYS
program. The thing that makes `echo $PWD >.git/objects/info/alternates`
not work is that `echo` _is_ an MSYS program (or Bash built-in, which is
the same thing here, for all practical purposes), so it writes the path
verbatim into that file, but then we expect `git.exe` to read this file
and interpret it as a list of paths.

Hopefully that clarifies the scenario a bit, even if it is far from a
concise explanation (I did edit this mail multiple times for clarity and
brevity, though, as I do with pretty much all of my mails).

Ciao,
Dscho

> We have
>
>  - When a test checks for an absolute path that a git command generated,
>    construct the expected value using $(pwd) rather than $PWD,
>    $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on
>    Windows, where the shell (MSYS bash) mangles absolute path names.
>    For details, see the commit message of 4114156ae9.
>
> in t/README, but even with the log mesasge of 4114156a (Tests on
> Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*],
> I have no idea what makes the thing you found in t5615 work and your
> suggestion to use $PWD in the new one not work.
>
> GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a
> colon) separated list, and I think the way t5615 uses it is broken
> on Windows where PATH_SEP is defined as semicolon without the $PWD
> vs $(pwd) issue.  Is the test checking the right thing?
>
>
> [Footnote]
>
> *1*
>
>     Tests on Windows: $(pwd) must return Windows-style paths
>
>     Many tests pass $(pwd) in some form to git and later test that the output
>     of git contains the correct value of $(pwd). For example, the test of
>     'git remote show' sets up a remote that contains $(pwd) and then the
>     expected result must contain $(pwd).
>
>     Again, MSYS-bash's path mangling kicks in: Plain $(pwd) uses the MSYS style
>     absolute path /c/path/to/git. The test case would write this name into
>     the 'expect' file. But when git is invoked, MSYS-bash converts this name to
>     the Windows style path c:/path/to/git, and git would produce this form in
>     the result; the test would fail.
>
>     We fix this by passing -W to bash's pwd that produces the Windows-style
>     path.
>
>     There are a two cases that need an accompanying change:
>
>     - In t1504 the value of $(pwd) becomes part of a path list. In this case,
>       the lone 'c' in something like /foo:c:/path/to/git:/bar inhibits
>       MSYS-bashes path mangling; IOW in this case we want the /c/path/to/git
>       form to allow path mangling. We use $PWD instead of $(pwd), which always
>       has the latter form.
>
>     - In t6200, $(pwd) - the Windows style path - must be used to construct the
>       expected result because that is the path form that git sees. (The change
>       in the test itself is just for consistency: 'git fetch' always sees the
>       Windows-style path, with or without the change.)
>
>     Signed-off-by: Johannes Sixt <j6t@kdbg.org>
>
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-01 19:31                       ` Johannes Schindelin
@ 2022-07-01 20:47                         ` Junio C Hamano
  0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-07-01 20:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Han Xin, chiyutianyi,
	derrickstolee, git, haiyangtand, jonathantanmy, me, ps

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Whenever you call a non-MSYS program from an MSYS program (and remember,
> an MSYS program is a program that uses the MSYS2 runtime that acts as a
> POSIX emulation layer), "magic" things are done. In our context,
> `bash.exe` is an MSYS program, and the non-MSYS program that is called is
> `git.exe`.
>
> So what are those "magic" things? The command-line arguments and the
> environment variables are auto-converted: everything that looks like a
> Unix-style path (or path list, like the `PATH` environment variable) is
> converted to a Windows-style path or path list (on Windows, the colon
> cannot be the separator in `PATH`, therefore the semicolon is used).
>
> And this is where it gets _really_ tricky to explain what is going on:
> what _does_ look like a Unix-style path? The exact rules are convoluted
> and hard to explain, but they work _most of the time_. For example,
> `/usr/bin:/hello` is converted to `C:\Program Files\Git\usr\bin;C:\Program
> Files\Git\hello` or something like that. But `kernel.org:/home/gitster` is
> not, because it looks more like an SSH path. Similarly, `C:/Program Files`
> is interpreted as a Windows-style path, even if it could technically be a
> Unix-style path list.
>
> Now, if you call `git.exe -C /blabla <command>`, it works, because
> `git.exe` is a non-MSYS program, therefore that `/blabla` is converted to
> a Windows-style path before executing `git.exe`. However, when you write a
> file via `echo /blabla >file`, that `echo` is either the Bash built-in, or
> it is an MSYS program, and no argument conversion takes place. If you
> _then_ ask `git.exe` to read and interpret the file as a path, it won't
> know what to do with that Unix-style path.
>
> You can substitute `$PWD` for `/blabla` in all of this, and it will hold
> true just the same.
>
> So what makes `pwd` special?
>
> Well, `pwd.exe` itself is an MSYS program, so it would still report a path
> that `git.exe` cannot understand. But in Git's test suite, we specifically
> override `pwd` to be a shell function that calls `pwd.exe -W`, which does
> output Windows-style paths.
>
> The thing that makes that `GIT_*=$PWD git ...` call work is that the
> environment is automagically converted because `git` is a non-MSYS
> program. The thing that makes `echo $PWD >.git/objects/info/alternates`
> not work is that `echo` _is_ an MSYS program (or Bash built-in, which is
> the same thing here, for all practical purposes), so it writes the path
> verbatim into that file, but then we expect `git.exe` to read this file
> and interpret it as a list of paths.

----- 8< --------- 8< --------- 8< --------- 8< --------- 8< -----

> Hopefully that clarifies the scenario a bit, even if it is far from a
> concise explanation (I did edit this mail multiple times for clarity and
> brevity, though, as I do with pretty much all of my mails).

Certainly it does help.  Thanks.

I wonder if it makes sense to keep a copy of the bulk of your
response in t/ somewhere, and refer to it from t/README, to help
fellow non-Windows developers to avoid breaking tests on Windows
without knowing.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
@ 2022-07-09 12:23               ` Michael J Gruber
  2022-07-11 15:09                 ` Jeff King
  0 siblings, 1 reply; 50+ messages in thread
From: Michael J Gruber @ 2022-07-09 12:23 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, me,
	Junio C Hamano, ps

Han Xin venit, vidit, dixit 2022-07-01 03:34:30:
> The commit-graph is used to opportunistically optimize accesses to
> certain pieces of information on commit objects, and
> lookup_commit_in_graph() tries to say "no" when the requested commit
> does not locally exist by returning NULL, in which case the caller
> can ask for (which may result in on-demand fetching from a promisor
> remote) and parse the commit object itself.
> 
> However, it uses a wrong helper, repo_has_object_file(), to do so.
> This helper not only checks if an object is mmediately available in
> the local object store, but also tries to fetch from a promisor remote.
> But the fetch machinery calls lookup_commit_in_graph(), thus causing an
> infinite loop.
> 
> We should make lookup_commit_in_graph() expect that a commit given to it
> can be legitimately missing from the local object store, by using the
> has_object_file() helper instead.
> 
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  commit-graph.c                             |  2 +-
>  t/t5330-no-lazy-fetch-with-commit-graph.sh | 70 ++++++++++++++++++++++
>  2 files changed, 71 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 92d4503336..2b04ef072d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -898,7 +898,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
>                 return NULL;
>         if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>                 return NULL;
> -       if (!repo_has_object_file(repo, id))
> +       if (!has_object(repo, id, 0))
>                 return NULL;
>  
>         commit = lookup_commit(repo, id);
> diff --git a/t/t5330-no-lazy-fetch-with-commit-graph.sh b/t/t5330-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..be33334229
> --- /dev/null
> +++ b/t/t5330-no-lazy-fetch-with-commit-graph.sh
> @@ -0,0 +1,70 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +run_with_limited_processses () {
> +       # bash and ksh use "ulimit -u", dash uses "ulimit -p"
> +       if test -n "$BASH_VERSION"
> +       then
> +               ulimit_max_process="-u"
> +       elif test -n "$KSH_VERSION"
> +       then
> +               ulimit_max_process="-u"
> +       fi
> +       (ulimit ${ulimit_max_process-"-p"} 512 && "$@")
> +}

This new test fails for me unless I increase max_processes. 1024 works.

I haven't bisected the number of prcesses ... This is higly system
dependent. I even run a slim environment (i3wm) but having chrome or
such running probably makes quite a difference.

512 is probably OK in CI in an isolated environment but is too low on a
typical "What you mean I'm not working? I'm waiting for the test run!"
developper workstation.

Conversely, which number would be too high to catch what the test is
supposed to catch? Does it incur a big performance penalty to go as high
as possible?

> +
> +test_lazy_prereq ULIMIT_PROCESSES '
> +       run_with_limited_processses true
> +'
> +
> +if ! test_have_prereq ULIMIT_PROCESSES
> +then
> +       skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
> +       test_done
> +fi
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +       git init with-commit &&
> +       test_commit -C with-commit the-commit &&
> +       oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +       git init with-commit-graph &&
> +       echo "$(pwd)/with-commit/.git/objects" \
> +               >with-commit-graph/.git/objects/info/alternates &&
> +       # create a ref that points to the commit in alternates
> +       git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +       # prepare some other objects to commit-graph
> +       test_commit -C with-commit-graph something &&
> +       git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +       test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +       git init --bare without-commit &&
> +       git -C with-commit-graph cat-file -e $oid &&
> +       echo "$(pwd)/without-commit/objects" \
> +               >with-commit-graph/.git/objects/info/alternates &&
> +       test_must_fail git -C with-commit-graph cat-file -e $oid
> +'
> +
> +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
> +       # setup promisor and prepare any commit to fetch
> +       git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +       git -C with-commit-graph config remote.origin.promisor true &&
> +       git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +       test_commit -C with-commit any-commit &&
> +       anycommit=$(git -C with-commit rev-parse HEAD) &&
> +
> +       run_with_limited_processses env GIT_TRACE="$(pwd)/trace.txt" \
> +               git -C with-commit-graph fetch origin $anycommit 2>err &&

That empty line abobe makes me nervous, especially when a test fails for
very unclear reasons like here. Is it necessary?

If the answer is "to separate setup and test" then the solution is to
separate setup and test ...

> +       ! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> +       grep "git fetch origin" trace.txt >actual &&
> +       test_line_count = 1 actual
> +'
> +
> +test_done
> -- 
> 2.36.1
> 
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-09 12:23               ` Michael J Gruber
@ 2022-07-11 15:09                 ` Jeff King
  2022-07-11 20:17                   ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2022-07-11 15:09 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: hanxin.hx, chiyutianyi, derrickstolee, git, haiyangtand,
	jonathantanmy, me, Junio C Hamano, ps

On Sat, Jul 09, 2022 at 02:23:36PM +0200, Michael J Gruber wrote:

> > +run_with_limited_processses () {
> > +       # bash and ksh use "ulimit -u", dash uses "ulimit -p"
> > +       if test -n "$BASH_VERSION"
> > +       then
> > +               ulimit_max_process="-u"
> > +       elif test -n "$KSH_VERSION"
> > +       then
> > +               ulimit_max_process="-u"
> > +       fi
> > +       (ulimit ${ulimit_max_process-"-p"} 512 && "$@")
> > +}
> 
> This new test fails for me unless I increase max_processes. 1024 works.
> 
> I haven't bisected the number of prcesses ... This is higly system
> dependent. I even run a slim environment (i3wm) but having chrome or
> such running probably makes quite a difference.
> 
> 512 is probably OK in CI in an isolated environment but is too low on a
> typical "What you mean I'm not working? I'm waiting for the test run!"
> developper workstation.
> 
> Conversely, which number would be too high to catch what the test is
> supposed to catch? Does it incur a big performance penalty to go as high
> as possible?

This bit me, too. It works if I run it standalone:

  $ ./t5330-no-lazy-fetch-with-commit-graph.sh 
  ok 1 - setup: prepare a repository with a commit
  ok 2 - setup: prepare a repository with commit-graph contains the commit
  ok 3 - setup: change the alternates to what without the commit
  ok 4 - fetch any commit from promisor with the usage of the commit graph
  # passed all 4 test(s)

but it fails when I run the whole test suite with "prove -j32". Or even
easier, just run it under "--stress":

  $ ./t5330-no-lazy-fetch-with-commit-graph.sh  --stress
  [...]
  + run_with_limited_processses env GIT_TRACE=/home/peff/compile/git/t/trash directory.t5330-no-lazy-fetch-with-commit-graph.stress-30/trace.txt git -C with-commit-graph fetch origin ba19607defe740988a69e98bced331083e02bdd6
error: last command exited with $?=128
not ok 4 - fetch any commit from promisor with the usage of the commit graph

  $ cat trash*.stress-failed/err
  [...]
  error: cannot fork() for index-pack: Resource temporarily unavailable
  fatal: fetch-pack: unable to fork off index-pack

-Peff

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-11 15:09                 ` Jeff King
@ 2022-07-11 20:17                   ` Junio C Hamano
  2022-07-12  1:52                     ` [External] " Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-07-11 20:17 UTC (permalink / raw)
  To: Jeff King
  Cc: Michael J Gruber, hanxin.hx, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

Jeff King <peff@peff.net> writes:

>> 512 is probably OK in CI in an isolated environment but is too low on a
>> typical "What you mean I'm not working? I'm waiting for the test run!"
>> developper workstation.
>> 
>> Conversely, which number would be too high to catch what the test is
>> supposed to catch? Does it incur a big performance penalty to go as high
>> as possible?
>
> This bit me, too. It works if I run it standalone:
>
>   $ ./t5330-no-lazy-fetch-with-commit-graph.sh 
>   ok 1 - setup: prepare a repository with a commit
>   ok 2 - setup: prepare a repository with commit-graph contains the commit
>   ok 3 - setup: change the alternates to what without the commit
>   ok 4 - fetch any commit from promisor with the usage of the commit graph
>   # passed all 4 test(s)
>
> but it fails when I run the whole test suite with "prove -j32". Or even
> easier, just run it under "--stress":

Understandable.  I am usually on a datacentre VM without graphical
UI so the process count there is much lower than on a typical
developer workstation.

I wonder if we can just run the test without any limit?  If in an
unattended CI situation, hopefully they will kick the job out due to
quota, and on a developer workstation, there may be processes killed
left and right, but that is only when the "infinite respawning" bug
reappears.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-11 20:17                   ` Junio C Hamano
@ 2022-07-12  1:52                     ` Han Xin
  2022-07-12  5:23                       ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-07-12  1:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Michael J Gruber, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

On Tue, Jul 12, 2022 at 4:17 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> >> 512 is probably OK in CI in an isolated environment but is too low on a
> >> typical "What you mean I'm not working? I'm waiting for the test run!"
> >> developper workstation.
> >>
> >> Conversely, which number would be too high to catch what the test is
> >> supposed to catch? Does it incur a big performance penalty to go as high
> >> as possible?
> >
> > This bit me, too. It works if I run it standalone:
> >
> >   $ ./t5330-no-lazy-fetch-with-commit-graph.sh
> >   ok 1 - setup: prepare a repository with a commit
> >   ok 2 - setup: prepare a repository with commit-graph contains the commit
> >   ok 3 - setup: change the alternates to what without the commit
> >   ok 4 - fetch any commit from promisor with the usage of the commit graph
> >   # passed all 4 test(s)
> >
> > but it fails when I run the whole test suite with "prove -j32". Or even
> > easier, just run it under "--stress":
>
> Understandable.  I am usually on a datacentre VM without graphical
> UI so the process count there is much lower than on a typical
> developer workstation.
>
> I wonder if we can just run the test without any limit?  If in an
> unattended CI situation, hopefully they will kick the job out due to
> quota, and on a developer workstation, there may be processes killed
> left and right, but that is only when the "infinite respawning" bug
> reappears.
>

The tricky thing about using ulimit is that it's tied to the entire development
station. I have tried to run the test without any limit [1], it did finally be
canceled after 6 hours.

Remove the "ulimit", once the infinite loop reappears, this test seems like a
bomb to developers that quickly consumes all resources? While "1024"
works fine with "--stress" , there are reasons to wonder if it's enough, or
maybe we can take a value like 10240 that we wouldn't normally reach?

1. https://github.com/chiyutianyi/git/runs/6962635320

Thanks
-Han Xin

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  1:52                     ` [External] " Han Xin
@ 2022-07-12  5:23                       ` Junio C Hamano
  2022-07-12  5:32                         ` Han Xin
  2022-07-12  6:37                         ` [External] " Jeff King
  0 siblings, 2 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-07-12  5:23 UTC (permalink / raw)
  To: Han Xin
  Cc: Jeff King, Michael J Gruber, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

Han Xin <hanxin.hx@bytedance.com> writes:

>> I wonder if we can just run the test without any limit?  If in an
>> unattended CI situation, hopefully they will kick the job out due to
>> quota, and on a developer workstation, there may be processes killed
>> left and right, but that is only when the "infinite respawning" bug
>> reappears.
>>
>
> The tricky thing about using ulimit is that it's tied to the entire development
> station. I have tried to run the test without any limit [1], it did finally be
> canceled after 6 hours.

I am not worried so much about developer workstation, which people
are sitting in front of.  They can ^C any runaway test way before 6
hours just fine.

I am assuming that we do not have to be worried about CI settings
too much, either, as they should already be prepared to catch
run-away processes.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  5:23                       ` Junio C Hamano
@ 2022-07-12  5:32                         ` Han Xin
  2022-07-12  6:37                         ` [External] " Jeff King
  1 sibling, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-07-12  5:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Michael J Gruber, chiyutianyi, derrickstolee, git,
	haiyangtand, Jonathan Tan, Taylor Blau, Patrick Steinhardt

On Tue, Jul 12, 2022 at 1:23 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Han Xin <hanxin.hx@bytedance.com> writes:
>
> >> I wonder if we can just run the test without any limit?  If in an
> >> unattended CI situation, hopefully they will kick the job out due to
> >> quota, and on a developer workstation, there may be processes killed
> >> left and right, but that is only when the "infinite respawning" bug
> >> reappears.
> >>
> >
> > The tricky thing about using ulimit is that it's tied to the entire development
> > station. I have tried to run the test without any limit [1], it did finally be
> > canceled after 6 hours.
>
> I am not worried so much about developer workstation, which people
> are sitting in front of.  They can ^C any runaway test way before 6
> hours just fine.
>
> I am assuming that we do not have to be worried about CI settings
> too much, either, as they should already be prepared to catch
> run-away processes.

OK, let's remove the "ulimit" and leave it to the followup checks.

Thanks.
-Han Xin

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  5:23                       ` Junio C Hamano
  2022-07-12  5:32                         ` Han Xin
@ 2022-07-12  6:37                         ` Jeff King
  2022-07-12 14:19                           ` Junio C Hamano
  1 sibling, 1 reply; 50+ messages in thread
From: Jeff King @ 2022-07-12  6:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han Xin, Michael J Gruber, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

On Mon, Jul 11, 2022 at 10:23:01PM -0700, Junio C Hamano wrote:

> > The tricky thing about using ulimit is that it's tied to the entire development
> > station. I have tried to run the test without any limit [1], it did finally be
> > canceled after 6 hours.
> 
> I am not worried so much about developer workstation, which people
> are sitting in front of.  They can ^C any runaway test way before 6
> hours just fine.
> 
> I am assuming that we do not have to be worried about CI settings
> too much, either, as they should already be prepared to catch
> run-away processes.

Agreed. Also, I think that although it's natural to worry about a bug we
know about causing an infinite loop, it's much more likely that a _new_
bug will cause one. I.e., every test we already carry is a candidate to
accidentally loop forever in this way. This is just the one we happen to
have seen. Once fixed, I don't know that it's at any more risk of
reocurring than any other problem.

-Peff

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v5 0/1] no lazy fetch in lookup_commit_in_graph()
  2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
  2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
@ 2022-07-12  6:50             ` Han Xin
  2022-07-12  6:50               ` [PATCH v5 1/1] commit-graph.c: " Han Xin
  2022-07-12  6:58               ` [PATCH v5 0/1] " Jeff King
  2022-07-12  8:01             ` [PATCH v1] t5330: remove run_with_limited_processses() Han Xin
  2 siblings, 2 replies; 50+ messages in thread
From: Han Xin @ 2022-07-12  6:50 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, peff,
	git, me, Junio C Hamano, ps

This patch fixes the following issue:
When we found the commit in the graph in lookup_commit_in_graph(), 
but the commit is missing from the repository, we will try
promisor_remote_get_direct() and then enter another loop.

Then we will go into an endless loop:
  git fetch -> deref_without_lazy_fetch() ->
    lookup_commit_in_graph() -> repo_has_object_file() ->
      promisor_remote_get_direct() -> fetch_objects() ->
        git fetch (a new loop round)

Changes since v4:

* Remove run_with_limited_processses() as it can be catched by CI
  settings and developer workstation. Keeping it will make a trouble
  when there are too many prcesses or stress is used.

Han Xin (1):
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()

 commit-graph.c                             |  2 +-
 t/t5330-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh

Range-diff against v4:
1:  96d4bb7150 ! 1:  3ffeed50de commit-graph.c: no lazy fetch in lookup_commit_in_graph()
    @@ t/t5330-no-lazy-fetch-with-commit-graph.sh (new)
     +
     +. ./test-lib.sh
     +
    -+run_with_limited_processses () {
    -+	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
    -+	if test -n "$BASH_VERSION"
    -+	then
    -+		ulimit_max_process="-u"
    -+	elif test -n "$KSH_VERSION"
    -+	then
    -+		ulimit_max_process="-u"
    -+	fi
    -+	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
    -+}
    -+
    -+test_lazy_prereq ULIMIT_PROCESSES '
    -+	run_with_limited_processses true
    -+'
    -+
    -+if ! test_have_prereq ULIMIT_PROCESSES
    -+then
    -+	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
    -+	test_done
    -+fi
    -+
     +test_expect_success 'setup: prepare a repository with a commit' '
     +	git init with-commit &&
     +	test_commit -C with-commit the-commit &&
    @@ t/t5330-no-lazy-fetch-with-commit-graph.sh (new)
     +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
     +	test_commit -C with-commit any-commit &&
     +	anycommit=$(git -C with-commit rev-parse HEAD) &&
    -+
    -+	run_with_limited_processses env GIT_TRACE="$(pwd)/trace.txt" \
    ++	GIT_TRACE="$(pwd)/trace.txt" \
     +		git -C with-commit-graph fetch origin $anycommit 2>err &&
     +	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
     +	grep "git fetch origin" trace.txt >actual &&
-- 
2.36.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v5 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
@ 2022-07-12  6:50               ` Han Xin
  2022-07-12  9:50                 ` Ævar Arnfjörð Bjarmason
  2022-07-12  6:58               ` [PATCH v5 0/1] " Jeff King
  1 sibling, 1 reply; 50+ messages in thread
From: Han Xin @ 2022-07-12  6:50 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, peff,
	git, me, Junio C Hamano, ps

The commit-graph is used to opportunistically optimize accesses to
certain pieces of information on commit objects, and
lookup_commit_in_graph() tries to say "no" when the requested commit
does not locally exist by returning NULL, in which case the caller
can ask for (which may result in on-demand fetching from a promisor
remote) and parse the commit object itself.

However, it uses a wrong helper, repo_has_object_file(), to do so.
This helper not only checks if an object is mmediately available in
the local object store, but also tries to fetch from a promisor remote.
But the fetch machinery calls lookup_commit_in_graph(), thus causing an
infinite loop.

We should make lookup_commit_in_graph() expect that a commit given to it
can be legitimately missing from the local object store, by using the
has_object_file() helper instead.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5330-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh

diff --git a/commit-graph.c b/commit-graph.c
index 92d4503336..2b04ef072d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -898,7 +898,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5330-no-lazy-fetch-with-commit-graph.sh b/t/t5330-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..2cc7fd7a47
--- /dev/null
+++ b/t/t5330-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+test_expect_success 'setup: prepare a repository with a commit' '
+	git init with-commit &&
+	test_commit -C with-commit the-commit &&
+	oid=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
+	git init with-commit-graph &&
+	echo "$(pwd)/with-commit/.git/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	# create a ref that points to the commit in alternates
+	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
+	# prepare some other objects to commit-graph
+	test_commit -C with-commit-graph something &&
+	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
+	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
+'
+
+test_expect_success 'setup: change the alternates to what without the commit' '
+	git init --bare without-commit &&
+	git -C with-commit-graph cat-file -e $oid &&
+	echo "$(pwd)/without-commit/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	test_must_fail git -C with-commit-graph cat-file -e $oid
+'
+
+test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
+	# setup promisor and prepare any commit to fetch
+	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
+	git -C with-commit-graph config remote.origin.promisor true &&
+	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
+	test_commit -C with-commit any-commit &&
+	anycommit=$(git -C with-commit rev-parse HEAD) &&
+	GIT_TRACE="$(pwd)/trace.txt" \
+		git -C with-commit-graph fetch origin $anycommit 2>err &&
+	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+	grep "git fetch origin" trace.txt >actual &&
+	test_line_count = 1 actual
+'
+
+test_done
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 0/1] no lazy fetch in lookup_commit_in_graph()
  2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
  2022-07-12  6:50               ` [PATCH v5 1/1] commit-graph.c: " Han Xin
@ 2022-07-12  6:58               ` Jeff King
  1 sibling, 0 replies; 50+ messages in thread
From: Jeff King @ 2022-07-12  6:58 UTC (permalink / raw)
  To: Han Xin
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, git,
	me, Junio C Hamano, ps

On Tue, Jul 12, 2022 at 02:50:46PM +0800, Han Xin wrote:

> Changes since v4:
> 
> * Remove run_with_limited_processses() as it can be catched by CI
>   settings and developer workstation. Keeping it will make a trouble
>   when there are too many prcesses or stress is used.

Your v4 is already in 'next', so I think rather than replacing the
patch, we'd want a commit on top to remove run_with_limited_processes().

-Peff

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v1] t5330: remove run_with_limited_processses()
  2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
  2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
  2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
@ 2022-07-12  8:01             ` Han Xin
  2 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-07-12  8:01 UTC (permalink / raw)
  To: hanxin.hx
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, peff,
	git, me, Junio C Hamano, ps

run_with_limited_processses() is used to end the loop faster when an
infinite loop happen. But "ulimit" is tied to the entire development
station, and the test will fail due to too many other processes or using
"--stress".

Without run_with_limited_processses() the infinite loop can also be
stopped due to global configrations or quotas, and the verification
still works fine. So let's remove run_with_limited_processses().

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 t/t5330-no-lazy-fetch-with-commit-graph.sh | 25 +---------------------
 1 file changed, 1 insertion(+), 24 deletions(-)

diff --git a/t/t5330-no-lazy-fetch-with-commit-graph.sh b/t/t5330-no-lazy-fetch-with-commit-graph.sh
index be33334229..2cc7fd7a47 100755
--- a/t/t5330-no-lazy-fetch-with-commit-graph.sh
+++ b/t/t5330-no-lazy-fetch-with-commit-graph.sh
@@ -4,28 +4,6 @@ test_description='test for no lazy fetch with the commit-graph'
 
 . ./test-lib.sh
 
-run_with_limited_processses () {
-	# bash and ksh use "ulimit -u", dash uses "ulimit -p"
-	if test -n "$BASH_VERSION"
-	then
-		ulimit_max_process="-u"
-	elif test -n "$KSH_VERSION"
-	then
-		ulimit_max_process="-u"
-	fi
-	(ulimit ${ulimit_max_process-"-p"} 512 && "$@")
-}
-
-test_lazy_prereq ULIMIT_PROCESSES '
-	run_with_limited_processses true
-'
-
-if ! test_have_prereq ULIMIT_PROCESSES
-then
-	skip_all='skipping tests for no lazy fetch with the commit-graph, ulimit processes not available'
-	test_done
-fi
-
 test_expect_success 'setup: prepare a repository with a commit' '
 	git init with-commit &&
 	test_commit -C with-commit the-commit &&
@@ -59,8 +37,7 @@ test_expect_success 'fetch any commit from promisor with the usage of the commit
 	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
 	test_commit -C with-commit any-commit &&
 	anycommit=$(git -C with-commit rev-parse HEAD) &&
-
-	run_with_limited_processses env GIT_TRACE="$(pwd)/trace.txt" \
+	GIT_TRACE="$(pwd)/trace.txt" \
 		git -C with-commit-graph fetch origin $anycommit 2>err &&
 	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
 	grep "git fetch origin" trace.txt >actual &&
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  6:50               ` [PATCH v5 1/1] commit-graph.c: " Han Xin
@ 2022-07-12  9:50                 ` Ævar Arnfjörð Bjarmason
  2022-07-13  1:26                   ` Han Xin
  0 siblings, 1 reply; 50+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-12  9:50 UTC (permalink / raw)
  To: Han Xin
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, jonathantanmy, peff,
	git, me, Junio C Hamano, ps


On Tue, Jul 12 2022, Han Xin wrote:

> The commit-graph is used to opportunistically optimize accesses to
> certain pieces of information on commit objects, and
> lookup_commit_in_graph() tries to say "no" when the requested commit
> does not locally exist by returning NULL, in which case the caller
> can ask for (which may result in on-demand fetching from a promisor
> remote) and parse the commit object itself.
>
> However, it uses a wrong helper, repo_has_object_file(), to do so.
> This helper not only checks if an object is mmediately available in
> the local object store, but also tries to fetch from a promisor remote.
> But the fetch machinery calls lookup_commit_in_graph(), thus causing an
> infinite loop.
>
> We should make lookup_commit_in_graph() expect that a commit given to it
> can be legitimately missing from the local object store, by using the
> has_object_file() helper instead.
>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  commit-graph.c                             |  2 +-
>  t/t5330-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
>  2 files changed, 48 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5330-no-lazy-fetch-with-commit-graph.sh
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 92d4503336..2b04ef072d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -898,7 +898,7 @@ struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
>  		return NULL;
>  	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
>  		return NULL;
> -	if (!repo_has_object_file(repo, id))
> +	if (!has_object(repo, id, 0))
>  		return NULL;
>  
>  	commit = lookup_commit(repo, id);
> diff --git a/t/t5330-no-lazy-fetch-with-commit-graph.sh b/t/t5330-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..2cc7fd7a47
> --- /dev/null
> +++ b/t/t5330-no-lazy-fetch-with-commit-graph.sh
> @@ -0,0 +1,47 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +	git init with-commit &&
> +	test_commit -C with-commit the-commit &&
> +	oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +	git init with-commit-graph &&
> +	echo "$(pwd)/with-commit/.git/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	# create a ref that points to the commit in alternates
> +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +	# prepare some other objects to commit-graph
> +	test_commit -C with-commit-graph something &&
> +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +	git init --bare without-commit &&
> +	git -C with-commit-graph cat-file -e $oid &&
> +	echo "$(pwd)/without-commit/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	test_must_fail git -C with-commit-graph cat-file -e $oid
> +'
> +
> +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
> +	# setup promisor and prepare any commit to fetch
> +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +	git -C with-commit-graph config remote.origin.promisor true &&
> +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +	test_commit -C with-commit any-commit &&
> +	anycommit=$(git -C with-commit rev-parse HEAD) &&
> +	GIT_TRACE="$(pwd)/trace.txt" \
> +		git -C with-commit-graph fetch origin $anycommit 2>err &&
> +	! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&

This part seems quite odd, we tested the exit code, so here we're being
paranoid about not getting a specific "fatal" error message.

It seems more worthwhile to test the warnings we emit, which in this
case seem to be duplicated (but that's probably an existing issue...).


> +	grep "git fetch origin" trace.txt >actual &&
> +	test_line_count = 1 actual
> +'

I wondered if "test_subcomand" here would be better, i.e. fewer things
scraping GIT_TRACE, and using the machine-readable GIT_TRACE2_EVENT
instead...

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [External] Re: [PATCH v4 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  6:37                         ` [External] " Jeff King
@ 2022-07-12 14:19                           ` Junio C Hamano
  0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-07-12 14:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Han Xin, Michael J Gruber, chiyutianyi, derrickstolee, git,
	haiyangtand, jonathantanmy, me, ps

Jeff King <peff@peff.net> writes:

> ... it's much more likely that a _new_
> bug will cause one. I.e., every test we already carry is a candidate to
> accidentally loop forever in this way. This is just the one we happen to
> have seen. Once fixed, I don't know that it's at any more risk of
> reocurring than any other problem.

Thanks.  

This kind of perspective is why I love to have you on the list.
Once said, it is so obvious but somehow I (or other people) failed
to phrase it so clearly.

Very much appreciated.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re: [PATCH v5 1/1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
  2022-07-12  9:50                 ` Ævar Arnfjörð Bjarmason
@ 2022-07-13  1:26                   ` Han Xin
  0 siblings, 0 replies; 50+ messages in thread
From: Han Xin @ 2022-07-13  1:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: chiyutianyi, derrickstolee, git, haiyangtand, Jonathan Tan,
	Jeff King, Michael J Gruber, Taylor Blau, Junio C Hamano,
	Patrick Steinhardt

On Tue, Jul 12, 2022 at 5:52 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> > +test_expect_success 'fetch any commit from promisor with the usage of the commit graph' '
> > +     # setup promisor and prepare any commit to fetch
> > +     git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> > +     git -C with-commit-graph config remote.origin.promisor true &&
> > +     git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> > +     test_commit -C with-commit any-commit &&
> > +     anycommit=$(git -C with-commit rev-parse HEAD) &&
> > +     GIT_TRACE="$(pwd)/trace.txt" \
> > +             git -C with-commit-graph fetch origin $anycommit 2>err &&
> > +     ! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
>
> This part seems quite odd, we tested the exit code, so here we're being
> paranoid about not getting a specific "fatal" error message.
>
> It seems more worthwhile to test the warnings we emit, which in this
> case seem to be duplicated (but that's probably an existing issue...).
>

So you mean the grep here is redundant?

>
> > +     grep "git fetch origin" trace.txt >actual &&
> > +     test_line_count = 1 actual
> > +'
>
> I wondered if "test_subcomand" here would be better, i.e. fewer things
> scraping GIT_TRACE, and using the machine-readable GIT_TRACE2_EVENT
> instead...

I threw this question up front but got no response.

When using test_subcommand() we should give all the args,
if we remove or add any args later, this test case will always
pass even without this fix. So, is this test case still strict?

    GIT_TRACE2_EVENT="$(PWD)/trace.txt" \
        git -C with-commit-graph fetch origin $anycommit &&
    test_subcommand ! git -c fetch.negotiationAlgorithm=noop \
        fetch origin --no-tags --no-write-fetch-head \
        --recurse-submodules=no --filter=blob:none \
        --stdin <trace.txt

Existing usages are as follows, and they all have fewer parameters:

     test_subcommand ! git gc --quiet <run-config.txt &&

     test_subcommand ! git multi-pack-index write --no-progress <trace-A

     test_subcommand ! git pack-refs --all --prune \
          <incremental-daily.txt &&

Thanks
-Han Xin

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-07-13  1:27 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-14  7:25 An endless loop fetching issue with partial clone, alternates and commit graph Haiyng Tan
2022-06-15  2:18 ` Taylor Blau
2022-06-16  3:38   ` [RFC PATCH 0/2] " Han Xin
2022-06-16  3:38     ` [RFC PATCH 1/2] commit-graph.c: add "flags" to lookup_commit_in_graph() Han Xin
2022-06-16  3:38     ` [RFC PATCH 2/2] fetch-pack.c: pass "oi_flags" " Han Xin
2022-06-17 21:47     ` [RFC PATCH 0/2] Re: An endless loop fetching issue with partial clone, alternates and commit graph Jonathan Tan
2022-06-18  3:01     ` [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-20  7:07       ` Patrick Steinhardt
2022-06-20  8:53         ` [External] " 欣韩
2022-06-20  9:05           ` Patrick Steinhardt
2022-06-21 18:23       ` Jonathan Tan
2022-06-22  3:17         ` Han Xin
2022-06-24  5:27       ` [PATCH v2 0/2] " Han Xin
2022-06-24  5:27         ` [PATCH v2 1/2] test-lib.sh: add limited processes to test-lib Han Xin
2022-06-24 16:03           ` Junio C Hamano
2022-06-25  1:35             ` Han Xin
2022-06-27 12:22               ` Junio C Hamano
2022-06-24  5:27         ` [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-24 16:56           ` Junio C Hamano
2022-06-25  2:25             ` Han Xin
2022-06-25  2:31               ` Han Xin
2022-06-28  2:02         ` [PATCH v3 0/2] " Han Xin
2022-06-28  2:02           ` [PATCH v3 1/2] test-lib.sh: add limited processes to test-lib Han Xin
2022-06-28  2:02           ` [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph() Han Xin
2022-06-28  7:49             ` Ævar Arnfjörð Bjarmason
2022-06-28 17:36               ` Junio C Hamano
2022-06-30 12:21                 ` Johannes Schindelin
2022-06-30 13:43                   ` Ævar Arnfjörð Bjarmason
2022-06-30 15:40                     ` Junio C Hamano
2022-06-30 18:47                       ` Ævar Arnfjörð Bjarmason
2022-07-01 19:31                       ` Johannes Schindelin
2022-07-01 20:47                         ` Junio C Hamano
2022-06-29  2:08               ` Han Xin
2022-06-30 17:37           ` test name conflict + js/ci-github-workflow-markup regression (was: [PATCH v3 0/2] no lazy fetch in lookup_commit_in_graph()) Ævar Arnfjörð Bjarmason
2022-07-01  1:34           ` [PATCH v4 0/1] no lazy fetch in lookup_commit_in_graph() Han Xin
2022-07-01  1:34             ` [PATCH v4 1/1] commit-graph.c: " Han Xin
2022-07-09 12:23               ` Michael J Gruber
2022-07-11 15:09                 ` Jeff King
2022-07-11 20:17                   ` Junio C Hamano
2022-07-12  1:52                     ` [External] " Han Xin
2022-07-12  5:23                       ` Junio C Hamano
2022-07-12  5:32                         ` Han Xin
2022-07-12  6:37                         ` [External] " Jeff King
2022-07-12 14:19                           ` Junio C Hamano
2022-07-12  6:50             ` [PATCH v5 0/1] " Han Xin
2022-07-12  6:50               ` [PATCH v5 1/1] commit-graph.c: " Han Xin
2022-07-12  9:50                 ` Ævar Arnfjörð Bjarmason
2022-07-13  1:26                   ` Han Xin
2022-07-12  6:58               ` [PATCH v5 0/1] " Jeff King
2022-07-12  8:01             ` [PATCH v1] t5330: remove run_with_limited_processses() Han Xin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).