From: ZheNing Hu <adlternative@gmail.com>
To: Derrick Stolee <derrickstolee@github.com>
Cc: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>,
	"Git List" <git@vger.kernel.org>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Jeff Hostetler" <jeffhost@microsoft.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>
Subject: Re: [PATCH 1/3] commit-graph: let commit graph respect commit graft
Date: Sun, 4 Sep 2022 13:57:18 +0800
Message-ID: <CAOLTT8SMQ9nPPs0OnxhH4n_A46WqW8Fv=priFs=NLuBycoBuug@mail.gmail.com>
In-Reply-To: <8b9e8c2d-7a64-2d66-83a8-2a7daff9a81c@github.com>

Derrick Stolee <derrickstolee@github.com> wrote on Fri, Sep 2, 2022 at 03:18:
>
> On 9/1/2022 5:41 AM, ZheNing Hu via GitGitGadget wrote:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In repo_parse_commit_internal(), if we want to use
> > commit graph, it will call parse_commit_in_graph() to
> > parse commit's content from commit graph, otherwise
> > call repo_read_object_file() to parse commit's content
> > from commit object.
> >
> > repo_read_object_file() will respect commit graft,
> > which can correctly amend commit's parents. But
> > parse_commit_in_graph() not. Inconsistencies here may
> > result in incorrect processing of shallow clone.
> >
> > So let parse_commit_in_graph() respect commit graft as
> > repo_read_object_file() does, which can solve this problem.
>
> If grafts or replace-objects exist, then the commit-graph
> is disabled and this code will never be called. I would
> expect a test case demonstrating the change in behavior
> here, but that is impossible.
>

Thanks for the clarification.
I don't really know what is going wrong here, so let me run a little test:

1. Revert this commit 19fd72c34dcd1332df638d76b0b028e9d9da3d41
$ git revert 19fd72

2. Clone the git repo
$ git clone --bare git@github.com:git/git.git

3. Write commit graph
$ git commit-graph write

4. Clone with --depth=<depth> (depth=1)
$ git clone --no-checkout --no-local --depth=1 git.git git1
Cloning into 'git1'...
remote: Enumerating objects: 4306, done.
remote: Counting objects: 100% (4306/4306), done.
remote: Compressing objects: 100% (3785/3785), done.

5. Clone with --depth=<depth> (depth=2)
$ git clone --no-checkout --no-local --depth=2 git.git git2
Cloning into 'git2'...
remote: Enumerating objects: 4311, done.
remote: Counting objects: 100% (4311/4311), done.
remote: Compressing objects: 100% (3788/3788), done.

6. Clone with --filter=depth:<depth> (depth=1)
$ git clone --no-checkout --no-local --filter=depth:1 git.git git3
Cloning into 'git3'...
remote: Enumerating objects: 4306, done.
remote: Counting objects: 100% (4306/4306), done.
remote: Compressing objects: 100% (3785/3785), done.

7. Clone with --filter=depth:<depth> (depth=2)
$ git clone --no-checkout --no-local --filter=depth:2 git.git git4
Cloning into 'git4'...
remote: Enumerating objects: 322987, done.
remote: Counting objects: 100% (322987/322987), done.
remote: Compressing objects: 100% (77441/77441), done.

As we can see, with --filter=depth:<depth> (depth >= 2) the clone seems
to enumerate far more objects than git clone --depth=<depth> does:
322987 versus 4311 objects for depth 2.
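
To compare runs like these side by side, the "Enumerating objects"
totals can be pulled out of saved clone output. This is only a sketch:
enumerated_objects is a hypothetical helper name, not a Git command, and
the two sample lines are copied from the depth=2 runs in this mail.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): read clone output on stdin and
# print N from each "remote: Enumerating objects: N, done." line.
enumerated_objects() {
	sed -n 's/^remote: Enumerating objects: \([0-9][0-9]*\), done\.$/\1/p'
}

# Sample lines copied from the two depth=2 runs.
depth2=$(printf '%s\n' 'remote: Enumerating objects: 4311, done.' | enumerated_objects)
filter2=$(printf '%s\n' 'remote: Enumerating objects: 322987, done.' | enumerated_objects)

# Integer ratio between the two object counts.
echo "--depth=2: $depth2 objects"
echo "--filter=depth:2: $filter2 objects"
echo "ratio: $((filter2 / depth2))x"
```

The helper only looks at text it is given, so it can be fed a log file
captured from any of the clone commands.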

So I debugged it by reproducing the git pack-objects process, and found
that --filter=depth:<depth> and --depth=<depth> behave differently.

With --filter=depth:<depth>, the commit's parents are resolved
successfully in parse_commit_in_graph():

Call stack (cmd_pack_objects -> get_object_list -> traverse_commit_list ->
traverse_commit_list_filtered -> do_traverse -> get_revision ->
get_revision_internal -> get_revision_1 -> process_parents ->
repo_parse_commit_gently -> repo_parse_commit_internal ->
parse_commit_in_graph)

With --depth=<depth>, parse_commit_in_graph() fails, and
repo_read_object_file() is called to resolve the commit's parents:

Call stack (cmd_pack_objects -> get_object_list -> traverse_commit_list ->
traverse_commit_list_filtered -> do_traverse -> get_revision ->
get_revision_internal -> get_revision_1 -> process_parents ->
repo_parse_commit_gently -> repo_parse_commit_internal ->
repo_read_object_file)

> The commit-graph parsing should not be bogged down with
> this logic.
>

So I tried to fix this problem by letting the commit-graph respect
commit grafts. I don't know if I have overlooked something...

> Thanks,
> -Stolee
>

Thanks,
ZheNing Hu

