git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	git@vger.kernel.org, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
Date: Thu, 18 Aug 2022 15:24:16 -0700	[thread overview]
Message-ID: <20220818222416.3567602-1-jonathantanmy@google.com> (raw)
In-Reply-To: <f580ec6d06072ea6ed2ecc4f8142b94fccbe4c0f.1660803467.git.gitgitgadget@gmail.com>

Thanks - overall this looks good. I only have some minor textual
comments and one small code comment.

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
> 
> We have long allowed users to run e.g.
>     git log --ancestry-path master..seen
> which shows all commits which satisfy all three of these criteria:
>   * are an ancestor of seen
>   * are not an ancestor master

are not an ancestor *of* master

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 2f85726745a..001e49cee55 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -389,12 +389,15 @@ Default mode::
>  	merges from the resulting history, as there are no selected
>  	commits contributing to this merge.
>  
> ---ancestry-path::
> +--ancestry-path[=<commit>]::
>  	When given a range of commits to display (e.g. 'commit1..commit2'
> -	or 'commit2 {caret}commit1'), only display commits that exist
> -	directly on the ancestry chain between the 'commit1' and
> -	'commit2', i.e. commits that are both descendants of 'commit1',
> -	and ancestors of 'commit2'.
> +	or 'commit2 {caret}commit1'), only display commits in that
> +	range where <commit> is part of the ancestry chain.  By "part of
> +	the ancestry chain", we mean including <commit> itself and
> +	commits that are either ancestors or descendants of <commit>.
> +	If no commit is specified, use 'commit1' (the excluded part of
> +	the range) as <commit>.  Can be passed multiple times to look for
> +	commits in the ancestry range of multiple commits.

"Ancestry chain" seems to be used multiple times in the Git codebase,
apparently with slightly different definitions, so probably best to
clear up at least this part by not reusing the term. It's also probably
not worth introducing a new term "ancestry range". Maybe rewrite to
say:

  When given a range of commits to display (e.g. 'commit1..commit2'
  or 'commit2 {caret}commit1'), only display commits in that
  range that are ancestors of <commit>, descendants of <commit>, or
  <commit> itself. If no commit is specified, use 'commit1' (the
  excluded part of the range) as <commit>. Can be passed multiple times;
  if so, a commit is included if it is any of the commits given or if it
  is an ancestor or descendant of one of them.

> @@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
>  
>  There is another simplification mode available:
>  
> ---ancestry-path::
> -	Limit the displayed commits to those directly on the ancestry
> -	chain between the ``from'' and ``to'' commits in the given commit
> -	range. I.e. only display commits that are ancestor of the ``to''
> -	commit and descendants of the ``from'' commit.
> +--ancestry-path[=<commit>]::
> +	Limit the displayed commits to those containing <commit> in their
> +	ancestry path.  I.e. only display <commit> and commits which have
> +	<commit> as either a direct ancestor or descendant.

Can we refer back to the documentation of --ancestry-path instead?

> @@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
>  }
>  
>  /*
> - * "rev-list --ancestry-path A..B" computes commits that are ancestors
> + * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
>   * of B but not ancestors of A but further limits the result to those
> - * that are descendants of A.  This takes the list of bottom commits and
> - * the result of "A..B" without --ancestry-path, and limits the latter
> - * further to the ones that can reach one of the commits in "bottom".
> + * that have C in their ancestry path (i.e. are either ancestors of C,
> + * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
> + * arguments are supplied, we limit the result to those that have at
> + * least one of those COMMITTISH in their ancestry path. If
> + * --ancestry-path is specified with no commit, we use all bottom
> + * commits for C.
> + *
> + * Before this function is called, ancestors of C will have already been
> + * marked with ANCESTRY_PATH previously, so we just need to also mark
> + * the descendants here, then collect both sets of commits.
>   */
> -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
> +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)

I thought the original description of this function ("This takes the
list...") to be clear and it would be nice to retain it. So, e.g.:

  "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B" computes commits
  that are ancestors of B but not ancestors of A but further limits the
  result to those that have any of C in their ancestry path (i.e. are
  either ancestors of any of C, descendants of any of C, or are any of
  C). If --ancestry-path is specified with no commit, we use all bottom
  commits for C.
  
  Before this function is called, ancestors of C will have already been
  marked with ANCESTRY_PATH previously.

  This takes the list of bottom commits and the result of "A..B" without
  --ancestry-path, and limits the latter further to the ones that have
  any of C in their ancestry path. Since the ancestors of C have already
  been marked (a prerequisite of this function), we just need to mark
  the descendants, then exclude any commit that does not have any of
  these marks.

Optional: Besides that, from what I can tell, sometimes the C commits
themselves are marked with ANCESTRY_PATH (when they are explicitly
specified) and sometimes they are not (when they are not explicitly
specified). It's not a bug here, but it might be worth handling that in
the ancestry_path_need_bottoms codepath (instead of explicitly setting
TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
path codepaths, but I haven't tested it), at least to prevent possible
future bugs.

> @@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  			       const struct setup_revision_opt* opt)
>  {
>  	const char *arg = argv[0];
> -	const char *optarg;
> +	const char *optarg = NULL;
>  	int argcount;
>  	const unsigned hexsz = the_hash_algo->hexsz;

[snip]

> @@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  		revs->first_parent_only = 1;
>  	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
>  		revs->exclude_first_parent_only = 1;
> -	} else if (!strcmp(arg, "--ancestry-path")) {
> +	} else if (!strcmp(arg, "--ancestry-path") ||
> +		   skip_prefix(arg, "--ancestry-path=", &optarg)) {

This and the above hunk might cause bugs if --ancestry-path was first
specified with a commit and then specified without. Probably best to
break this up into separate "else if" branches, even though there is a
bit of code duplication (and also remove the "= NULL" addition in the
above hunk).

> @@ -164,6 +165,7 @@ struct rev_info {
>  			cherry_mark:1,
>  			bisect:1,
>  			ancestry_path:1,
> +			ancestry_path_need_bottoms:1,

Might be better named as ancestry_path_implicit_bottoms? And probably
worth documenting, e.g.

  True if --ancestry-path was specified without an argument. The bottom
  revisions are implicitly the arguments in this case.

  parent reply	other threads:[~2022-08-18 22:25 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17  2:48 [PATCH 0/2] Allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-17  2:48 ` [PATCH 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-17  2:48 ` [PATCH 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-17 22:42   ` Junio C Hamano
2022-08-18  4:01     ` Elijah Newren
2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
2022-08-18  6:17   ` [PATCH v2 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-18 15:30     ` Derrick Stolee
2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
2022-08-18 16:51         ` Derrick Stolee
2022-08-18 16:56         ` Eric Sunshine
2022-08-19  1:12         ` Elijah Newren
2022-08-19  2:45           ` Ævar Arnfjörð Bjarmason
2022-08-18 16:53       ` Eric Sunshine
2022-08-19  1:01       ` Elijah Newren
2022-08-18 22:24     ` Jonathan Tan [this message]
2022-08-19  1:23       ` Elijah Newren
2022-08-19 17:25         ` Jonathan Tan
2022-08-18 16:32   ` [PATCH v2 0/2] Allow " Junio C Hamano
2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 1/3] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 2/3] t6019: modernize tests with helper Derrick Stolee via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 3/3] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-19 17:54       ` Junio C Hamano
2022-08-20  0:10         ` Elijah Newren
2022-08-19 12:53     ` [PATCH v3 0/3] Allow " Derrick Stolee
2022-08-19 21:08       ` Junio C Hamano
2022-08-20  0:13         ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220818222416.3567602-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).