From: Derrick Stolee <derrickstolee@github.com>
To: Jacob Keller <jacob.e.keller@intel.com>, git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCH] name-rev: use generation numbers if available
Date: Mon, 28 Feb 2022 14:50:55 -0500 [thread overview]
Message-ID: <e4096124-e566-0842-f17c-366645c3e37c@github.com> (raw)
In-Reply-To: <20220228190738.2112503-1-jacob.e.keller@intel.com>
On 2/28/2022 2:07 PM, Jacob Keller wrote:
> From: Jacob Keller <jacob.keller@gmail.com>
>
> If a commit in a sequence of linear history has a non-monotonically
> increasing commit timestamp, git name-rev might not properly name the
> commit.
>
> This occurs because name-rev uses a heuristic of the commit date to
> avoid searching down tags which lead to commits that are older than the
> named commit. This is intended to avoid work on larger repositories.
>
> This heuristic impacts git name-rev, and by extension git describe
> --contains which is built on top of name-rev.
>
> Further more, if --annotate-stdin is used, the heuristic is not enabled
> because the full history has to be analyzed anyways. This results in
> some confusion if a user sees that --annotate-stdin works but a normal
> name-rev does not.
>
> If the repository has a commit graph, we can use the generation numbers
> instead of using the commit dates. This is essentially the same check
> except that generation numbers make it exact, where the commit date
> heuristic could be incorrect due to clock errors.
>
> Add a test case which covers this behavior and shows how the commit
> graph makes the name-rev process work.
>
> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
> The initial implementation of this came from [1]. Should this have Stolee's
> sign-off?
>
> [1]: https://lore.kernel.org/git/42d2a9fe-a3f2-f841-dcd1-27a0440521b6@github.com/
I think your implementation is sufficiently different (and better)
that you don't need my co-authorship _or_ sign-off.
> +static void set_commit_cutoff(struct commit *commit)
> +{
> + timestamp_t generation;
> +
> + if (cutoff > commit->date)
> + cutoff = commit->date;
> +
> + generation = commit_graph_generation(commit);
> + if (generation_cutoff > generation)
> + generation_cutoff = generation;
> +}
I appreciate that you split this out into its own method to isolate
the logic.
> +/* Check if a commit is before the cutoff. Prioritize generation numbers
> + * first, but use the commit timestamp if we lack generation data.
> + */
> +static int commit_is_before_cutoff(struct commit *commit)
> +{
> + if (generation_cutoff < GENERATION_NUMBER_INFINITY)
> + return commit_graph_generation(commit) < generation_cutoff;
> +
> + return commit->date < cutoff;
> +}
There are two subtle things going on here when generation_cutoff is
zero:
1. In a commit-graph with topological levels _or_ generation numbers v2,
commit_graph_generation(commit) will always be positive, so we don't
need to do the lookup.
2. If the commit-graph was written by an older Git version before
topological levels were implemented, then the generation of commits
in the commit-graph are all zero(!). This means that the logic here
would be incorrect for the "all" case.
The fix for both cases is to return 1 if generation_cutoff is zero:
if (!generaton_cutoff)
return 1;
> - if (start_commit->date < cutoff)
> + if (commit_is_before_cutoff(start_commit))
> - if (parent->date < cutoff)
> + if (commit_is_before_cutoff(parent))
Nice replacements.
> - if (all || annotate_stdin)
> + if (all || annotate_stdin) {
> + generation_cutoff = 0;
> cutoff = 0;
> + }
Good.
> - if (commit) {
> - if (cutoff > commit->date)
> - cutoff = commit->date;
> - }
> + if (commit)
> + set_commit_cutoff(commit);
Another nice replacement.
> +# A-B-C-D-E-main
> +#
> +# Where C has a non-monotonically increasing commit timestamp w.r.t. other
> +# commits
> +test_expect_success 'non-monotonic commit dates setup' '
> + UNIX_EPOCH_ZERO="@0 +0000" &&
> + git init non-monotonic &&
> + test_commit -C non-monotonic A &&
> + test_commit -C non-monotonic --no-tag B &&
> + test_commit -C non-monotonic --no-tag --date "$UNIX_EPOCH_ZERO" C &&
> + test_commit -C non-monotonic D &&
> + test_commit -C non-monotonic E
> +'
> +
> +test_expect_success 'name-rev with commitGraph handles non-monotonic timestamps' '
> + test_config -C non-monotonic core.commitGraph true &&
> + (
> + cd non-monotonic &&
> +
> + # Ensure commit graph is up to date
> + git -c gc.writeCommitGraph=true gc &&
"git commit-graph write --reachable" would suffice here.
> +
> + echo "main~3 tags/D~2" >expect &&
> + git name-rev --tags main~3 >actual &&
> +
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'name-rev --all works with non-monotonic' '
This is working because of the commit-graph, right? We still have
it from the previous test, so we aren't testing that this works
when we only have the commit date as a cutoff.
> + (
> + cd non-monotonic &&
> +
> + cat >expect <<-\EOF &&
> + E
> + D
> + D~1
> + D~2
> + A
> + EOF
> +
> + git log --pretty=%H >revs &&
> + git name-rev --tags --annotate-stdin --name-only <revs >actual &&
> +
> + test_cmp expect actual
> + )
Do you want to include a test showing the "expected" behavior
when there isn't a commit-graph file? You might need to delete
an existing commit-graph (it will exist in the case of
GIT_TEST_COMMIT_GRAPH=1).
I also see that you intended to test the "--all" option, which
is not included in your test. That's probably the real key to
getting this test to work correctly. Deleting the graph will
probably cause a failure on this test unless "--all" is added.
Thanks,
-Stolee
next prev parent reply other threads:[~2022-02-28 19:57 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-28 19:07 [PATCH] name-rev: use generation numbers if available Jacob Keller
2022-02-28 19:50 ` Derrick Stolee [this message]
2022-02-28 20:20 ` Keller, Jacob E
2022-02-28 20:24 ` Derrick Stolee
2022-02-28 20:59 ` Keller, Jacob E
-- strict thread matches above, loose matches on Subject: below --
2022-02-28 21:50 [PATCH v2 0/1] " Jacob Keller
2022-02-28 21:50 ` [PATCH] " Jacob Keller
2022-03-01 2:36 ` Junio C Hamano
2022-03-01 7:08 ` Jacob Keller
2022-03-01 7:09 ` Jacob Keller
2022-03-01 7:33 ` Junio C Hamano
2022-03-01 15:09 ` Derrick Stolee
2022-03-01 19:52 ` Keller, Jacob E
2022-03-01 19:56 ` Derrick Stolee
2022-03-01 20:22 ` Junio C Hamano
2022-03-01 22:46 ` Keller, Jacob E
2022-03-03 1:10 ` Junio C Hamano
2022-03-07 20:22 ` Jacob Keller
2022-03-07 20:26 ` Derrick Stolee
2022-03-07 22:30 ` Keller, Jacob E
2022-03-07 22:43 ` Derrick Stolee
2022-03-07 22:52 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e4096124-e566-0842-f17c-366645c3e37c@github.com \
--to=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jacob.e.keller@intel.com \
--cc=jacob.keller@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).