git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <derrickstolee@github.com>
To: Jacob Keller <jacob.e.keller@intel.com>, git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCH] name-rev: use generation numbers if available
Date: Mon, 28 Feb 2022 14:50:55 -0500	[thread overview]
Message-ID: <e4096124-e566-0842-f17c-366645c3e37c@github.com> (raw)
In-Reply-To: <20220228190738.2112503-1-jacob.e.keller@intel.com>

On 2/28/2022 2:07 PM, Jacob Keller wrote:
> From: Jacob Keller <jacob.keller@gmail.com>
> 
> If a commit in a sequence of linear history has a non-monotonically
> increasing commit timestamp, git name-rev might not properly name the
> commit.
> 
> This occurs because name-rev uses a heuristic of the commit date to
> avoid searching down tags which lead to commits that are older than the
> named commit. This is intended to avoid work on larger repositories.
> 
> This heuristic impacts git name-rev, and by extension git describe
> --contains which is built on top of name-rev.
> 
> Further more, if --annotate-stdin is used, the heuristic is not enabled
> because the full history has to be analyzed anyways. This results in
> some confusion if a user sees that --annotate-stdin works but a normal
> name-rev does not.
> 
> If the repository has a commit graph, we can use the generation numbers
> instead of using the commit dates. This is essentially the same check
> except that generation numbers make it exact, where the commit date
> heuristic could be incorrect due to clock errors.
> 
> Add a test case which covers this behavior and shows how the commit
> graph makes the name-rev process work.
> 
> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
> The initial implementation of this came from [1]. Should this have Stolee's
> sign-off?
> 
> [1]: https://lore.kernel.org/git/42d2a9fe-a3f2-f841-dcd1-27a0440521b6@github.com/

I think your implementation is sufficiently different (and better)
that you don't need my co-authorship _or_ sign-off.

> +static void set_commit_cutoff(struct commit *commit)
> +{
> +	timestamp_t generation;
> +
> +	if (cutoff > commit->date)
> +		cutoff = commit->date;
> +
> +	generation = commit_graph_generation(commit);
> +	if (generation_cutoff > generation)
> +		generation_cutoff = generation;
> +}

I appreciate that you split this out into its own method to isolate
the logic.

> +/* Check if a commit is before the cutoff. Prioritize generation numbers
> + * first, but use the commit timestamp if we lack generation data.
> + */
> +static int commit_is_before_cutoff(struct commit *commit)
> +{
> +	if (generation_cutoff < GENERATION_NUMBER_INFINITY)
> +		return commit_graph_generation(commit) < generation_cutoff;
> +
> +	return commit->date < cutoff;
> +}

There are two subtle things going on here when generation_cutoff is
zero:

1. In a commit-graph with topological levels _or_ generation numbers v2,
   commit_graph_generation(commit) will always be positive, so we don't
   need to do the lookup.

2. If the commit-graph was written by an older Git version before
   topological levels were implemented, then the generation of commits
   in the commit-graph are all zero(!). This means that the logic here
   would be incorrect for the "all" case.

The fix for both cases is to return 1 if generation_cutoff is zero:

	if (!generaton_cutoff)
		return 1;
> -	if (start_commit->date < cutoff)
> +	if (commit_is_before_cutoff(start_commit))

> -			if (parent->date < cutoff)
> +			if (commit_is_before_cutoff(parent))

Nice replacements.

> -	if (all || annotate_stdin)
> +	if (all || annotate_stdin) {
> +		generation_cutoff = 0;
>  		cutoff = 0;
> +	}

Good.

> -		if (commit) {
> -			if (cutoff > commit->date)
> -				cutoff = commit->date;
> -		}
> +		if (commit)
> +			set_commit_cutoff(commit);

Another nice replacement.

> +# A-B-C-D-E-main
> +#
> +# Where C has a non-monotonically increasing commit timestamp w.r.t. other
> +# commits
> +test_expect_success 'non-monotonic commit dates setup' '
> +	UNIX_EPOCH_ZERO="@0 +0000" &&
> +	git init non-monotonic &&
> +	test_commit -C non-monotonic A &&
> +	test_commit -C non-monotonic --no-tag B &&
> +	test_commit -C non-monotonic --no-tag --date "$UNIX_EPOCH_ZERO" C &&
> +	test_commit -C non-monotonic D &&
> +	test_commit -C non-monotonic E
> +'
> +
> +test_expect_success 'name-rev with commitGraph handles non-monotonic timestamps' '
> +	test_config -C non-monotonic core.commitGraph true &&
> +	(
> +		cd non-monotonic &&
> +
> +		# Ensure commit graph is up to date
> +		git -c gc.writeCommitGraph=true gc &&

"git commit-graph write --reachable" would suffice here.


> +
> +		echo "main~3 tags/D~2" >expect &&
> +		git name-rev --tags main~3 >actual &&
> +
> +		test_cmp expect actual
> +	)
> +'
> +
> +test_expect_success 'name-rev --all works with non-monotonic' '

This is working because of the commit-graph, right? We still have
it from the previous test, so we aren't testing that this works
when we only have the commit date as a cutoff.

> +	(
> +		cd non-monotonic &&
> +
> +		cat >expect <<-\EOF &&
> +		E
> +		D
> +		D~1
> +		D~2
> +		A
> +		EOF
> +
> +		git log --pretty=%H >revs &&
> +		git name-rev --tags --annotate-stdin --name-only <revs >actual &&
> +
> +		test_cmp expect actual
> +	)

Do you want to include a test showing the "expected" behavior
when there isn't a commit-graph file? You might need to delete
an existing commit-graph (it will exist in the case of
GIT_TEST_COMMIT_GRAPH=1).

I also see that you intended to test the "--all" option, which
is not included in your test. That's probably the real key to
getting this test to work correctly. Deleting the graph will
probably cause a failure on this test unless "--all" is added.

Thanks,
-Stolee


  reply	other threads:[~2022-02-28 19:57 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-28 19:07 [PATCH] name-rev: use generation numbers if available Jacob Keller
2022-02-28 19:50 ` Derrick Stolee [this message]
2022-02-28 20:20   ` Keller, Jacob E
2022-02-28 20:24     ` Derrick Stolee
2022-02-28 20:59       ` Keller, Jacob E
  -- strict thread matches above, loose matches on Subject: below --
2022-02-28 21:50 [PATCH v2 0/1] " Jacob Keller
2022-02-28 21:50 ` [PATCH] " Jacob Keller
2022-03-01  2:36   ` Junio C Hamano
2022-03-01  7:08     ` Jacob Keller
2022-03-01  7:09       ` Jacob Keller
2022-03-01  7:33       ` Junio C Hamano
2022-03-01 15:09         ` Derrick Stolee
2022-03-01 19:52           ` Keller, Jacob E
2022-03-01 19:56             ` Derrick Stolee
2022-03-01 20:22               ` Junio C Hamano
2022-03-01 22:46                 ` Keller, Jacob E
2022-03-03  1:10                   ` Junio C Hamano
2022-03-07 20:22                     ` Jacob Keller
2022-03-07 20:26                       ` Derrick Stolee
2022-03-07 22:30                         ` Keller, Jacob E
2022-03-07 22:43                           ` Derrick Stolee
2022-03-07 22:52                           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e4096124-e566-0842-f17c-366645c3e37c@github.com \
    --to=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.e.keller@intel.com \
    --cc=jacob.keller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).