git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git log --since to not stop after first old commit?
@ 2022-04-01  8:21 Miklos Vajna
  2022-04-01  9:57 ` Ævar Arnfjörð Bjarmason
  2022-04-01 17:51 ` Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Miklos Vajna @ 2022-04-01  8:21 UTC (permalink / raw)
  To: git

Hi,

I wanted to look at commits of a contributor from the last year, and
noticed that I only see commits from this year, not last year when I use:

        git log --author="that person" --since="1 year ago"

Digging around in the history, one other contributor pushed a mistake on
1st Jan, where the author date was supposed to be 2022-01-01, but
happened to be 2021-01-01. Knowing that, it makes sense that 'git log'
stopped at that commit by default.

I wonder though, is there any option to "force" git log to walk all
reachable commits from HEAD, but just show the ones which match the
--since criteria?

Or is this need so special that the best is to parse the output of 'git
rev-list' and do my own filtering for author and date?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-01  8:21 git log --since to not stop after first old commit? Miklos Vajna
@ 2022-04-01  9:57 ` Ævar Arnfjörð Bjarmason
  2022-04-01 10:14   ` Miklos Vajna
  2022-04-01 17:51 ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-01  9:57 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git


On Fri, Apr 01 2022, Miklos Vajna wrote:

> Hi,
>
> I wanted to look at commits of a contributor from the last year, and
> noticed that I only see commits from this year, not last year when I use:
>
>         git log --author="that person" --since="1 year ago"
>
> Digging around in the history, one other contributor pushed a mistake on
> 1st Jan, where the author date was supposed to be 2022-01-01, but
> happened to be 2021-01-01. Knowing that, it makes sense that 'git log'
> stopped at that commit by default.

So (just making sure I understand this) in this case the --since option
is behaving as expected in the sense that the information in the commit
itself matches what it's finding, but what you'd really like for it to
consider some "adjusted" commit date?

I.e. to be smart enough to spot that it should include a commit from
2021 if all the preceding commits are from 2022, or some other similar
heuristic?

Or...

> I wonder though, is there any option to "force" git log to walk all
> reachable commits from HEAD, but just show the ones which match the
> --since criteria?

...did we stop the walk as soon as we saw that 2021 commit?

> Or is this need so special that the best is to parse the output of 'git
> rev-list' and do my own filtering for author and date?

I think this is somewhere between "we could grow a new feature to be
more helpful" (we adjust commit dates in other places, i.e. commit-graph
reachability), or "a bug" depending on the answersto the above, but I
obviously haven't dug much. Hope this helps!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-01  9:57 ` Ævar Arnfjörð Bjarmason
@ 2022-04-01 10:14   ` Miklos Vajna
  2022-04-01 13:51     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-01 10:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Hi Ævar,

On Fri, Apr 01, 2022 at 11:57:33AM +0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> So (just making sure I understand this) in this case the --since option
> is behaving as expected in the sense that the information in the commit
> itself matches what it's finding, but what you'd really like for it to
> consider some "adjusted" commit date?
> 
> I.e. to be smart enough to spot that it should include a commit from
> 2021 if all the preceding commits are from 2022, or some other similar
> heuristic?

No heuristics. Just a way to not stop at the first commit that doesn't
match the --since criteria. Here is an example:

Given:

rm -rf .git file
git init
echo a > file
git add file
git commit -m init
echo a >> file
git add file
GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second
echo a >> file
git add file
git commit -m third

When I do:

git log --pretty=oneline --since="2022-01-01"

Then current I get:

91a24b6ccba6b1d26c3bd5bcea7ff86e6997b599 (HEAD -> master) third

And I would like to have an opt-in way to instead get:

91a24b6ccba6b1d26c3bd5bcea7ff86e6997b599 (HEAD -> master) third
e259a40784d3d70f3878105adac380c8e8a8ae52 init

Arguing that both "init" and "third" was committed this year.

The question is if there is a way to do this already (perhaps I missed
something in the docs or didn't notice it while I briefly researched the
commit walk code), or in case I want to do this, then would it make
sense to have this feature in git or this is more a "run git rev-list
and do your own filtering" case?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-01 10:14   ` Miklos Vajna
@ 2022-04-01 13:51     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-01 13:51 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git


On Fri, Apr 01 2022, Miklos Vajna wrote:

> Hi Ævar,
>
> On Fri, Apr 01, 2022 at 11:57:33AM +0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> So (just making sure I understand this) in this case the --since option
>> is behaving as expected in the sense that the information in the commit
>> itself matches what it's finding, but what you'd really like for it to
>> consider some "adjusted" commit date?
>> 
>> I.e. to be smart enough to spot that it should include a commit from
>> 2021 if all the preceding commits are from 2022, or some other similar
>> heuristic?
>
> No heuristics. Just a way to not stop at the first commit that doesn't
> match the --since criteria. Here is an example:
>
> Given:
>
> rm -rf .git file
> git init
> echo a > file
> git add file
> git commit -m init
> echo a >> file
> git add file
> GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second
> echo a >> file
> git add file
> git commit -m third
>
> When I do:
>
> git log --pretty=oneline --since="2022-01-01"
>
> Then current I get:
>
> 91a24b6ccba6b1d26c3bd5bcea7ff86e6997b599 (HEAD -> master) third
>
> And I would like to have an opt-in way to instead get:
>
> 91a24b6ccba6b1d26c3bd5bcea7ff86e6997b599 (HEAD -> master) third
> e259a40784d3d70f3878105adac380c8e8a8ae52 init
>
> Arguing that both "init" and "third" was committed this year.

Indeed.

> The question is if there is a way to do this already (perhaps I missed
> something in the docs or didn't notice it while I briefly researched the
> commit walk code), or in case I want to do this, then would it make
> sense to have this feature in git or this is more a "run git rev-list
> and do your own filtering" case?

I think it might make sense to have it as feature, but hopefully we
could piggy-back on the date adjustment that the commit-graph needs to
do already, I'm not sure if we save that information anywhere though...

The assumption with --since was that this sort of timestamp drift
wouldn't be this bad, so mostly it works out. It could be made to work
like --grep, but then it needs to walk the whole history if no other
limit is provided.

So it'll be very slow if you just want --since=2.weeks.ago, but
accurate.

I think an alternate solution to this in the meantime is to use "git
replace" as a band-aid, I haven't tried, but you should be able to
replace the relevant commit with one that has adjusted dates.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-01  8:21 git log --since to not stop after first old commit? Miklos Vajna
  2022-04-01  9:57 ` Ævar Arnfjörð Bjarmason
@ 2022-04-01 17:51 ` Junio C Hamano
  2022-04-01 21:36   ` [PATCH] git-log: add a --since-as-filter option Miklos Vajna
  2022-04-07 15:43   ` git log --since to not stop after first old commit? Miklos Vajna
  1 sibling, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2022-04-01 17:51 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git

Miklos Vajna <vmiklos@vmiklos.hu> writes:

> I wanted to look at commits of a contributor from the last year, and
> noticed that I only see commits from this year, not last year when I use:
>
>         git log --author="that person" --since="1 year ago"
>
> Digging around in the history, one other contributor pushed a mistake on
> 1st Jan, where the author date was supposed to be 2022-01-01, but
> happened to be 2021-01-01. Knowing that, it makes sense that 'git log'
> stopped at that commit by default.
>
> I wonder though, is there any option to "force" git log to walk all
> reachable commits from HEAD, but just show the ones which match the
> --since criteria?
>
> Or is this need so special that the best is to parse the output of 'git
> rev-list' and do my own filtering for author and date?

Currently yes.  I am not sure if it is (or is not) worth changing,
though.

Many "git log" options make commits hidden from the output by
filtering each commit we find but keep digging the history further,
but some options make commits hidden by stopping the traversal.
"--until" is the former (there is no way to implement it as the
latter) but "--since" is the latter.

If we can implement more of these "commit hiding operations" as
traversal stoppers, it allows "log" to avoid doing unnecessary work,
and in a history without skewed timestamps, "--since" is a prime
candidate to take advantage of the fact that parent must be older
than any of its children (hence we can safely stop traversal once we
see an old enough commit) to be implemented as a "traversal stopper".
But in the presense of skewed timestamps, those commits behind one
commit with an incorrectly old timestamp will end up being hidden.

We could add a --since-as-filter= option or something, but then the
user needs to be careful when to stop (and digging down to the root
of the history, i.e. "never stop", may be an acceptable answer to
some projects).  We may be able to, when commit-graph (v2) with
adjusted timestamp data exist, stop before going down to the root,
but we would still need to add it as a different option because the
existing behaviour of "--since", to immediately stop upon seeing a
commit with a timestamp that is older than the given time, is so
well established and it would irritate users of existing scripts if
it changes all of a sudden.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] git-log: add a --since-as-filter option
  2022-04-01 17:51 ` Junio C Hamano
@ 2022-04-01 21:36   ` Miklos Vajna
  2022-04-02 10:09     ` [PATCH v2] " Miklos Vajna
  2022-04-07 15:43   ` git log --since to not stop after first old commit? Miklos Vajna
  1 sibling, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-01 21:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Ævar Arnfjörð Bjarmason

This is similar to --since, but it will filter out not matching commits,
rather than stopping at the first not matching commit.

This is useful if you e.g. want to list the commits from the last year,
but one odd commit has a bad commit date and that would hide lots of
earlier commits in that range.

The behavior of --since is left unchanged, since it's valid to depend on
its current behavior.
---

Hi,

On Fri, Apr 01, 2022 at 10:51:34AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> We could add a --since-as-filter= option or something, but then the
> user needs to be careful when to stop (and digging down to the root
> of the history, i.e. "never stop", may be an acceptable answer to
> some projects).

Here is a patch that does this. As a somewhat arbitrary testcase, the
LibreOffice core.git repo has 474064 commits and --since-as-filter
finishes in 688 ms for a sample query (and expected output), while I got
empty output with --since previously. I would argue this is an
acceptable trade-off.

What do you think?

Thanks,

Miklos

 Documentation/rev-list-options.txt |  5 +++++
 revision.c                         | 10 ++++++++++
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+)
 create mode 100755 t/t4217-log-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..ba01b1ba06 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
 --after=<date>::
 	Show commits more recent than a specific date.
 
+--since-as-filter=<date>::
+	Show all commits more recent than a specific date. This visits all
+	commits in the range, rather than stopping at the first commit which is older
+	than a specific date.
+
 --until=<date>::
 --before=<date>::
 	Show commits older than a specific date.
diff --git a/revision.c b/revision.c
index 2646b78990..ebc95319d6 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age_as_filter != -1 && (commit->date < revs->max_age_as_filter) &&
+		    !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->max_age_as_filter = -1;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("since-as-filter", argv, &optarg))) {
+		revs->max_age_as_filter = approxidate(optarg);
+		return argcount;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age_as_filter != -1 &&
+	    comparison_date(revs, commit) < revs->max_age_as_filter)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..e80c148b19 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	timestamp_t max_age_as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..5b7d30d5ad
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a > file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m init &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
+'
+
+test_expect_success 'git log --since-as-filter' '
+	git log --since-as-filter="2022-01-01" --pretty="format:%s" > actual &&
+	test_i18ngrep init actual &&
+	! test_i18ngrep second actual &&
+	test_i18ngrep third actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2] git-log: add a --since-as-filter option
  2022-04-01 21:36   ` [PATCH] git-log: add a --since-as-filter option Miklos Vajna
@ 2022-04-02 10:09     ` Miklos Vajna
  0 siblings, 0 replies; 24+ messages in thread
From: Miklos Vajna @ 2022-04-02 10:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Ævar Arnfjörð Bjarmason

This is similar to --since, but it will filter out not matching commits,
rather than stopping at the first not matching commit.

This is useful if you e.g. want to list the commits from the last year,
but one odd commit has a bad commit date and that would hide lots of
earlier commits in that range.

The behavior of --since is left unchanged, since it's valid to depend on
its current behavior.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi,

On Fri, Apr 01, 2022 at 11:36:48PM +0200, Miklos Vajna <vmiklos@vmiklos.hu> wrote:
> Here is a patch that does this.

Oops, forgot the sign-off, fixed now.

Regards,

Miklos

 Documentation/rev-list-options.txt |  5 +++++
 revision.c                         | 10 ++++++++++
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+)
 create mode 100755 t/t4217-log-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..ba01b1ba06 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
 --after=<date>::
 	Show commits more recent than a specific date.
 
+--since-as-filter=<date>::
+	Show all commits more recent than a specific date. This visits all
+	commits in the range, rather than stopping at the first commit which is older
+	than a specific date.
+
 --until=<date>::
 --before=<date>::
 	Show commits older than a specific date.
diff --git a/revision.c b/revision.c
index 2646b78990..ebc95319d6 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age_as_filter != -1 && (commit->date < revs->max_age_as_filter) &&
+		    !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->max_age_as_filter = -1;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("since-as-filter", argv, &optarg))) {
+		revs->max_age_as_filter = approxidate(optarg);
+		return argcount;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age_as_filter != -1 &&
+	    comparison_date(revs, commit) < revs->max_age_as_filter)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..e80c148b19 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	timestamp_t max_age_as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..5b7d30d5ad
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a > file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m init &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
+'
+
+test_expect_success 'git log --since-as-filter' '
+	git log --since-as-filter="2022-01-01" --pretty="format:%s" > actual &&
+	test_i18ngrep init actual &&
+	! test_i18ngrep second actual &&
+	test_i18ngrep third actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-01 17:51 ` Junio C Hamano
  2022-04-01 21:36   ` [PATCH] git-log: add a --since-as-filter option Miklos Vajna
@ 2022-04-07 15:43   ` Miklos Vajna
  2022-04-08  2:30     ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-07 15:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Fri, Apr 01, 2022 at 10:51:34AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> We could add a --since-as-filter= option or something, but then the
> user needs to be careful when to stop (and digging down to the root
> of the history, i.e. "never stop", may be an acceptable answer to
> some projects).

I sent a patch to add such an option (which picks the "never stop"
behavior) on 1st, did you see that?

If the idea is OK in principle, but the patch needs tweaking, please let
me know.

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-07 15:43   ` git log --since to not stop after first old commit? Miklos Vajna
@ 2022-04-08  2:30     ` Junio C Hamano
  2022-04-08 18:19       ` Junio C Hamano
  2022-04-08 21:01       ` [PATCH v3] git-log: add a --since=... --as-filter option Miklos Vajna
  0 siblings, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2022-04-08  2:30 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git

Miklos Vajna <vmiklos@vmiklos.hu> writes:

> On Fri, Apr 01, 2022 at 10:51:34AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
>> We could add a --since-as-filter= option or something, but then the
>> user needs to be careful when to stop (and digging down to the root
>> of the history, i.e. "never stop", may be an acceptable answer to
>> some projects).
>
> I sent a patch to add such an option (which picks the "never stop"
> behavior) on 1st, did you see that?
>
> If the idea is OK in principle, but the patch needs tweaking, please let
> me know.

As a single-shot change, "--since-as-filter" is certainly an easy to
explain approach of least resistance.

But when viewed from a higher level as a general design problem, I
am unsure if it is a good direction to go in.

Giving "--since" the "as-filter" variant sets a precedent, and
closes the door for a better UI that we can extend more generally
without having to add "--X-as-filter" for each and every conceivable
"--X" that is a traversal stopper into a filtering kind.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-08  2:30     ` Junio C Hamano
@ 2022-04-08 18:19       ` Junio C Hamano
       [not found]         ` <CANgJU+Wr+tKNPfeh4dst-E_LSnoYYmN1easqmkFUA9spp-rpKQ@mail.gmail.com>
  2022-04-08 21:01       ` [PATCH v3] git-log: add a --since=... --as-filter option Miklos Vajna
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2022-04-08 18:19 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Miklos Vajna <vmiklos@vmiklos.hu> writes:
>
>> On Fri, Apr 01, 2022 at 10:51:34AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
>>> We could add a --since-as-filter= option or something, but then the
>>> user needs to be careful when to stop (and digging down to the root
>>> of the history, i.e. "never stop", may be an acceptable answer to
>>> some projects).
>>
>> I sent a patch to add such an option (which picks the "never stop"
>> behavior) on 1st, did you see that?
>>
>> If the idea is OK in principle, but the patch needs tweaking, please let
>> me know.
>
> As a single-shot change, "--since-as-filter" is certainly an easy to
> explain approach of least resistance.
>
> But when viewed from a higher level as a general design problem, I
> am unsure if it is a good direction to go in.
>
> Giving "--since" the "as-filter" variant sets a precedent, and
> closes the door for a better UI that we can extend more generally
> without having to add "--X-as-filter" for each and every conceivable
> "--X" that is a traversal stopper into a filtering kind.

If we pursue the possibility further, perhaps we may realize that
there isn't much room for us to add too many "traversal stoppers" in
the future, in which case giving "as-filter" to a very limited few
traversal stoppers may not be too bad.  I just do not think we have
explored that enough to decide that "--since-as-filter" is a good UI
(and it is not a good timing for me to spend brain cycles on the
issue).

Thanks.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3] git-log: add a --since=... --as-filter option
  2022-04-08  2:30     ` Junio C Hamano
  2022-04-08 18:19       ` Junio C Hamano
@ 2022-04-08 21:01       ` Miklos Vajna
  2022-04-12  8:47         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-08 21:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This is similar to --since, but it will filter out not matching commits,
rather than stopping at the first not matching commit.

This is useful if you e.g. want to list the commits from the last year,
but one odd commit has a bad commit date and that would hide lots of
earlier commits in that range.

The behavior of --since is left unchanged, since it's valid to depend on
its current behavior.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi Junio,

On Thu, Apr 07, 2022 at 07:30:33PM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> As a single-shot change, "--since-as-filter" is certainly an easy to
> explain approach of least resistance.
> 
> But when viewed from a higher level as a general design problem, I
> am unsure if it is a good direction to go in.
> 
> Giving "--since" the "as-filter" variant sets a precedent, and
> closes the door for a better UI that we can extend more generally
> without having to add "--X-as-filter" for each and every conceivable
> "--X" that is a traversal stopper into a filtering kind.

I like the idea of doing this mode as "--since=... --as-filter". I can
still implement it just for --since=... initially, but it can be
extended for other flags as well in the future if there is a need.

> If we pursue the possibility further, perhaps we may realize that
> there isn't much room for us to add too many "traversal stoppers" in
> the future, in which case giving "as-filter" to a very limited few
> traversal stoppers may not be too bad.  I just do not think we have
> explored that enough to decide that "--since-as-filter" is a good UI

My understanding is that get_revision_1() has a special-case for the max
age case to be a "traversal stopper", and all other options are just 
filtering in limit_range(). But perhaps I missed something.

Here is an updated patch to do the new syntax.

Thanks,

Miklos

 Documentation/rev-list-options.txt |  5 +++++
 revision.c                         | 14 +++++++++++--
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100755 t/t4217-log-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..8565299264 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
 --after=<date>::
 	Show commits more recent than a specific date.
 
+--as-filter::
+	When combined with `--since=<date>`, show all commits more recent than
+	a specific date. This visits all commits in the range, rather than stopping at
+	the first commit which is older than a specific date.
+
 --until=<date>::
 --before=<date>::
 	Show commits older than a specific date.
diff --git a/revision.c b/revision.c
index 7d435f8048..41ea72e516 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age != -1 && revs->as_filter && (commit->date < revs->max_age) &&
+		    !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->as_filter = 0;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if (!strcmp(arg, "--as-filter")) {
+		revs->as_filter = 1;
+		return argcount;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3365,7 +3372,7 @@ static void explore_walk_step(struct rev_info *revs)
 	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
 		record_author_date(&info->author_date, c);
 
-	if (revs->max_age != -1 && (c->date < revs->max_age))
+	if (revs->max_age != -1 && !revs->as_filter && (c->date < revs->max_age))
 		c->object.flags |= UNINTERESTING;
 
 	if (process_parents(revs, c, NULL, NULL) < 0)
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age != -1 && revs->as_filter &&
+	    comparison_date(revs, commit) < revs->max_age)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
@@ -4019,7 +4029,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		 * that we'd otherwise have done in limit_list().
 		 */
 		if (!revs->limited) {
-			if (revs->max_age != -1 &&
+			if (revs->max_age != -1 && !revs->as_filter &&
 			    comparison_date(revs, commit) < revs->max_age)
 				continue;
 
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..fe37ebd83d 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	int as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..a66830e3d7
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a > file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m init &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
+'
+
+test_expect_success 'git log --since-as-filter' '
+	git log --since="2022-01-01" --as-filter --pretty="format:%s" > actual &&
+	test_i18ngrep init actual &&
+	! test_i18ngrep second actual &&
+	test_i18ngrep third actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
       [not found]         ` <CANgJU+Wr+tKNPfeh4dst-E_LSnoYYmN1easqmkFUA9spp-rpKQ@mail.gmail.com>
@ 2022-04-11  6:37           ` Miklos Vajna
  2022-04-11  9:18             ` demerphq
  2022-04-11 16:58           ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-11  6:37 UTC (permalink / raw)
  To: demerphq; +Cc: Junio C Hamano, Git

Hi Yves,

On Sat, Apr 09, 2022 at 06:02:44AM +0200, demerphq <demerphq@gmail.com> wrote:
> When you do have the cycles perhaps it is worth considering whether
> splitting it up, so that --as-filter is a modifier for traversal stoppers,
> would avoid the problem of proliferating options.   Eg, instead of saying
> --since-as-filter you would say --since ... --as-filter.

[PATCH v3] in this thread is meant to implement this. I hope that helps.

Regards,

Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-11  6:37           ` Miklos Vajna
@ 2022-04-11  9:18             ` demerphq
  0 siblings, 0 replies; 24+ messages in thread
From: demerphq @ 2022-04-11  9:18 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Junio C Hamano, Git

On Mon, 11 Apr 2022 at 08:37, Miklos Vajna <vmiklos@vmiklos.hu> wrote:
>
> Hi Yves,
>
> On Sat, Apr 09, 2022 at 06:02:44AM +0200, demerphq <demerphq@gmail.com> wrote:
> > When you do have the cycles perhaps it is worth considering whether
> > splitting it up, so that --as-filter is a modifier for traversal stoppers,
> > would avoid the problem of proliferating options.   Eg, instead of saying
> > --since-as-filter you would say --since ... --as-filter.
>
> [PATCH v3] in this thread is meant to implement this. I hope that helps.

That sounds good to me and IMO certainly preferable to a stack of
options with the suffix -as-filter. But please be aware that I am not
a person of significance to the git project and my opinion on this
probably isn't worth much. I just thought I would point out that it
might be a way to cleanly address Junio's concern about option
proliferation while still giving you what you want.

Cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
       [not found]         ` <CANgJU+Wr+tKNPfeh4dst-E_LSnoYYmN1easqmkFUA9spp-rpKQ@mail.gmail.com>
  2022-04-11  6:37           ` Miklos Vajna
@ 2022-04-11 16:58           ` Junio C Hamano
  2022-04-22 18:48             ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2022-04-11 16:58 UTC (permalink / raw)
  To: demerphq; +Cc: Miklos Vajna, Git

demerphq <demerphq@gmail.com> writes:

> On Sat, 9 Apr 2022 at 02:43, Junio C Hamano <gitster@pobox.com> wrote:
>
>> > Giving "--since" the "as-filter" variant sets a precedent, and
>> > closes the door for a better UI that we can extend more generally
>> > without having to add "--X-as-filter" for each and every conceivable
>> > "--X" that is a traversal stopper into a filtering kind.
>>
>> If we pursue the possibility further, perhaps we may realize that
>> there isn't much room for us to add too many "traversal stoppers" in
>> the future, in which case giving "as-filter" to a very limited few
>> traversal stoppers may not be too bad.  I just do not think we have
>> explored that enough to decide that "--since-as-filter" is a good UI
>> (and it is not a good timing for me to spend brain cycles on the
>> issue).
>
> When you do have the cycles perhaps it is worth considering whether
> splitting it up, so that --as-filter is a modifier for traversal stoppers,
> would avoid the problem of proliferating options.   Eg, instead of saying
> --since-as-filter you would say --since ... --as-filter. That way the
> stoppers where "filter like behavior" made sense could just check if the
> --as-filter flag was set.

Yes, that has exactly the opposite problem I wanted to warn us about
by sending an extra message (to which you are reponding to).  If we
have (or can have) very many traversal stopping option, it might
make sense to have --as-filter as a modifier and avoid doubling the
number of options, but if we only have very few (and fundamentally
cannot have more than very few), then giving each of these very few
--X its own --X-as-filter variant would probably make more sense.
Because end users would probably not know which ones are inherently
filters and will not be affected with --as-filter modifier, it would
help them understand if we give them independent --since-as-filter
option and document it separately, if there aren't many of them.

Besides, if we had very few but still multiple of them, --X and
--Y-as-filter can be combined to say "X stops as before, but Y is
applied as filter", which is strictly more expressive than a
separate --as-filter modifier.

So that is why I threw out the message for those interested in the
topic to first think about.  I know we agree that --since may be a
good candidate to have these two flavours of behaviour.  I do not
think anybody carefully thought about existing options to see if
there are many like --since that want two flavours, let alone
possible options we have said in the past that we may want to have
but not yet added.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3] git-log: add a --since=... --as-filter option
  2022-04-08 21:01       ` [PATCH v3] git-log: add a --since=... --as-filter option Miklos Vajna
@ 2022-04-12  8:47         ` Ævar Arnfjörð Bjarmason
  2022-04-15 20:39           ` [PATCH v4] " Miklos Vajna
  0 siblings, 1 reply; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-12  8:47 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Junio C Hamano, git


On Fri, Apr 08 2022, Miklos Vajna wrote:

> On Thu, Apr 07, 2022 at 07:30:33PM -0700, Junio C Hamano <gitster@pobox.com> wrote:
>> As a single-shot change, "--since-as-filter" is certainly an easy to
>> explain approach of least resistance.
>> 
>> But when viewed from a higher level as a general design problem, I
>> am unsure if it is a good direction to go in.
>> 
>> Giving "--since" the "as-filter" variant sets a precedent, and
>> closes the door for a better UI that we can extend more generally
>> without having to add "--X-as-filter" for each and every conceivable
>> "--X" that is a traversal stopper into a filtering kind.
>
> I like the idea of doing this mode as "--since=... --as-filter". I can
> still implement it just for --since=... initially, but it can be
> extended for other flags as well in the future if there is a need.

Yes, I think this is much better.

>> If we pursue the possibility further, perhaps we may realize that
>> there isn't much room for us to add too many "traversal stoppers" in
>> the future, in which case giving "as-filter" to a very limited few
>> traversal stoppers may not be too bad.  I just do not think we have
>> explored that enough to decide that "--since-as-filter" is a good UI
>
> My understanding is that get_revision_1() has a special-case for the max
> age case to be a "traversal stopper", and all other options are just 
> filtering in limit_range(). But perhaps I missed something.
> [...]
>  Documentation/rev-list-options.txt |  5 +++++
>  revision.c                         | 14 +++++++++++--
>  revision.h                         |  1 +
>  t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++++
>  4 files changed, 50 insertions(+), 2 deletions(-)
>  create mode 100755 t/t4217-log-limit.sh
>
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index fd4f4e26c9..8565299264 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
>  --after=<date>::
>  	Show commits more recent than a specific date.
>  
> +--as-filter::
> +	When combined with `--since=<date>`, show all commits more recent than
> +	a specific date. This visits all commits in the range, rather than stopping at
> +	the first commit which is older than a specific date.

I wonder if we should be more future-proof here and say that we'll run
anything as a filter, and that --since is the one option currently
affected.

But maybe there's no reason to do so...

In any case these docs are inaccurate because they cover --since, but if
you check revision.c we'll set "max_age" on other options too
(synonyms?).

All in all I wonder if this wouldn't be much more understandable if we
advertised is as another option to do "HISTORY SIMPLIFICATION", which
looking at e.g. get_commit_action() and "prune" is kind of what we're
doing with the existing --since behavior.

>  --until=<date>::
>  --before=<date>::
>  	Show commits older than a specific date.
> diff --git a/revision.c b/revision.c
> index 7d435f8048..41ea72e516 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
>  		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
>  		    !revs->line_level_traverse)
>  			continue;
> +		if (revs->max_age != -1 && revs->as_filter && (commit->date < revs->max_age) &&
> +		    !revs->line_level_traverse)
> +			continue;
>  		date = commit->date;
>  		p = &commit_list_insert(commit, p)->next;
>  
> @@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
>  	revs->dense = 1;
>  	revs->prefix = prefix;
>  	revs->max_age = -1;
> +	revs->as_filter = 0;
>  	revs->min_age = -1;
>  	revs->skip_count = -1;
>  	revs->max_count = -1;
> @@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
>  		revs->max_age = approxidate(optarg);
>  		return argcount;
> +	} else if (!strcmp(arg, "--as-filter")) {
> +		revs->as_filter = 1;
> +		return argcount;
>  	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
>  		revs->max_age = approxidate(optarg);
>  		return argcount;
> @@ -3365,7 +3372,7 @@ static void explore_walk_step(struct rev_info *revs)
>  	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
>  		record_author_date(&info->author_date, c);
>  
> -	if (revs->max_age != -1 && (c->date < revs->max_age))
> +	if (revs->max_age != -1 && !revs->as_filter && (c->date < revs->max_age))
>  		c->object.flags |= UNINTERESTING;
>  
>  	if (process_parents(revs, c, NULL, NULL) < 0)
> @@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
>  	if (revs->min_age != -1 &&
>  	    comparison_date(revs, commit) > revs->min_age)
>  			return commit_ignore;
> +	if (revs->max_age != -1 && revs->as_filter &&
> +	    comparison_date(revs, commit) < revs->max_age)
> +			return commit_ignore;
>  	if (revs->min_parents || (revs->max_parents >= 0)) {
>  		int n = commit_list_count(commit->parents);
>  		if ((n < revs->min_parents) ||
> @@ -4019,7 +4029,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
>  		 * that we'd otherwise have done in limit_list().
>  		 */
>  		if (!revs->limited) {
> -			if (revs->max_age != -1 &&
> +			if (revs->max_age != -1 && !revs->as_filter &&
>  			    comparison_date(revs, commit) < revs->max_age)
>  				continue;

I think it's good to do this as a general mechanism, but if you now
remove the "max_age" field from "struct rev_info" and:

	make -k

You'll see a bunch of callers who check "max_age" outside of revision.c,
since those will accept these revision options are they doing the right
thing now too?

...

> diff --git a/revision.h b/revision.h
> index 5bc59c7bfe..fe37ebd83d 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -263,6 +263,7 @@ struct rev_info {
>  	int skip_count;
>  	int max_count;
>  	timestamp_t max_age;
> +	int as_filter;
>  	timestamp_t min_age;
>  	int min_parents;
>  	int max_parents;
> diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
> new file mode 100755
> index 0000000000..a66830e3d7
> --- /dev/null
> +++ b/t/t4217-log-limit.sh
> @@ -0,0 +1,32 @@
> +#!/bin/sh
> +
> +test_description='git log with filter options limiting the output'
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +GIT_TEST_COMMIT_GRAPH=0
> +GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
> +
> +test_expect_success 'setup test' '
> +	git init &&
> +	echo a > file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m init &&
> +	echo a >> file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2021-01-01 0:00" git commit -m second &&
> +	echo a >> file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
> +'
> +
> +test_expect_success 'git log --since-as-filter' '
> +	git log --since="2022-01-01" --as-filter --pretty="format:%s" > actual &&
> +	test_i18ngrep init actual &&
> +	! test_i18ngrep second actual &&
> +	test_i18ngrep third actual
> +'
> +
> +test_done

In any case we should have tests for those callers, i.e. blame, bundle
etc.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4] git-log: add a --since=... --as-filter option
  2022-04-12  8:47         ` Ævar Arnfjörð Bjarmason
@ 2022-04-15 20:39           ` Miklos Vajna
  2022-04-15 23:13             ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-15 20:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

This is similar to --since, but it will filter out not matching commits,
rather than stopping at the first not matching commit.

This is useful if you e.g. want to list the commits from the last year,
but one odd commit has a bad commit date and that would hide lots of
earlier commits in that range.

The behavior of --since is left unchanged, since it's valid to depend on
its current behavior.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi Ævar,

On Tue, Apr 12, 2022 at 10:47:15AM +0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> > +--as-filter::
> > +	When combined with `--since=<date>`, show all commits more recent than
> > +	a specific date. This visits all commits in the range, rather than stopping at
> > +	the first commit which is older than a specific date.
> 
> I wonder if we should be more future-proof here and say that we'll run
> anything as a filter, and that --since is the one option currently
> affected.
> 
> But maybe there's no reason to do so...

My understanding is that in practice --since is the only option that
terminates the revision walk on the first match, so I would argue there
is no need for this.

> In any case these docs are inaccurate because they cover --since, but if
> you check revision.c we'll set "max_age" on other options too
> (synonyms?).

Good catch, I've added --max-age and --after as well.

> All in all I wonder if this wouldn't be much more understandable if we
> advertised is as another option to do "HISTORY SIMPLIFICATION", which
> looking at e.g. get_commit_action() and "prune" is kind of what we're
> doing with the existing --since behavior.

Makes sense, we kind of simplify history by default here & then this
option could be documented as one that modifies this terminating
behavior.

> I think it's good to do this as a general mechanism, but if you now
> remove the "max_age" field from "struct rev_info" and:
> 
> 	make -k
> 
> You'll see a bunch of callers who check "max_age" outside of revision.c,
> since those will accept these revision options are they doing the right
> thing now too?

I found the following callers:

- some builtins that want to make sure that no history limiting is used,
  an additional --as-filter doesn't change behavior there

- blame: this has its own commit walking loop, so --as-filter doesn't
  change any behavior here unintentionally.

- bundle: --since is not used for revision walking here, just to check
  what tags to include/exclude, so this is already not terminating

> In any case we should have tests for those callers, i.e. blame, bundle
> etc.

t/t5607-clone-bundle.sh already tests bundle --since. I've added a new
t/t4218-blame-limit.sh to test blame --since, it seems there were no
tests for this so far.

Thanks,

Miklos

 Documentation/rev-list-options.txt |  6 +++++
 revision.c                         | 13 +++++++++--
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 36 ++++++++++++++++++++++++++++++
 t/t4218-blame-limit.sh             | 36 ++++++++++++++++++++++++++++++
 5 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100755 t/t4217-log-limit.sh
 create mode 100755 t/t4218-blame-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..354bd29f10 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -350,6 +350,12 @@ The following options select the commits to be shown:
 <paths>::
 	Commits modifying the given <paths> are selected.
 
+--as-filter::
+	When combined with `--max-age=<date>`, `--since=<date>` or
+	`--after=<date>`, show all commits more recent than a specific date. This
+	visits all commits in the range, rather than stopping at the first commit
+	which is older than a specific date.
+
 --simplify-by-decoration::
 	Commits that are referred by some branch or tag are selected.
 
diff --git a/revision.c b/revision.c
index 7d435f8048..ff018c3976 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age != -1 && revs->as_filter && (commit->date < revs->max_age) &&
+		    !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->as_filter = 0;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if (!strcmp(arg, "--as-filter")) {
+		revs->as_filter = 1;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3365,7 +3371,7 @@ static void explore_walk_step(struct rev_info *revs)
 	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
 		record_author_date(&info->author_date, c);
 
-	if (revs->max_age != -1 && (c->date < revs->max_age))
+	if (revs->max_age != -1 && !revs->as_filter && (c->date < revs->max_age))
 		c->object.flags |= UNINTERESTING;
 
 	if (process_parents(revs, c, NULL, NULL) < 0)
@@ -3862,6 +3868,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age != -1 && revs->as_filter &&
+	    comparison_date(revs, commit) < revs->max_age)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
@@ -4019,7 +4028,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		 * that we'd otherwise have done in limit_list().
 		 */
 		if (!revs->limited) {
-			if (revs->max_age != -1 &&
+			if (revs->max_age != -1 && !revs->as_filter &&
 			    comparison_date(revs, commit) < revs->max_age)
 				continue;
 
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..fe37ebd83d 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	int as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..2a3705c714
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a > file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-02-01 0:00" git commit -m init &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m first &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-03-01 0:00" git commit -m second &&
+	echo a >> file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
+'
+
+test_expect_success 'git log --since=... --as-filter' '
+	git log --since="2022-01-01" --as-filter --pretty="format:%s" > actual &&
+	! test_i18ngrep init actual &&
+	test_i18ngrep first actual &&
+	! test_i18ngrep second actual &&
+	test_i18ngrep third actual
+'
+
+test_done
diff --git a/t/t4218-blame-limit.sh b/t/t4218-blame-limit.sh
new file mode 100755
index 0000000000..03f513f331
--- /dev/null
+++ b/t/t4218-blame-limit.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+
+test_description='git blame with filter options limiting the output'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a > file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-01-01 0:00" GIT_COMMITTER_DATE="2020-01-01 0:00" git commit -m init &&
+	echo a >> file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-02-01 0:00" GIT_COMMITTER_DATE="2020-02-01 0:00" git commit -m first &&
+	echo a >> file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-03-01 0:00" GIT_COMMITTER_DATE="2020-03-01 0:00" git commit -m second &&
+	echo a >> file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-04-01 0:00" GIT_COMMITTER_DATE="2020-04-01 0:00" git commit -m third
+'
+
+test_expect_success 'git blame --since=...' '
+	git blame --since="2020-02-15" file > actual &&
+	! test_i18ngrep 2020-01-01 actual &&
+	test_i18ngrep 2020-02-01 actual &&
+	test_i18ngrep 2020-03-01 actual &&
+	test_i18ngrep 2020-04-01 actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v4] git-log: add a --since=... --as-filter option
  2022-04-15 20:39           ` [PATCH v4] " Miklos Vajna
@ 2022-04-15 23:13             ` Junio C Hamano
  2022-04-16 14:23               ` [PATCH v5] log: "--as-filter" option adjusts how "--since" cut-off works Miklos Vajna
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2022-04-15 23:13 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Ævar Arnfjörð Bjarmason, git

Miklos Vajna <vmiklos@vmiklos.hu> writes:

> This is similar to --since, but it will filter out not matching commits,
> rather than stopping at the first not matching commit.
>
> This is useful if you e.g. want to list the commits from the last year,
> but one odd commit has a bad commit date and that would hide lots of
> earlier commits in that range.
>
> The behavior of --since is left unchanged, since it's valid to depend on
> its current behavior.

The above is good if it were a new --since-as-filter option.  Adding
"--as-filter" as a separate option to be used in conjunction with
"--since" would fundamentally mean "The behaviour of --since is left
unchanged" cannot possibly be true.

A suggested rewrite.  I label each section and highlight its point,
but in the end product, you should just separate them with a blank
line, making each of them into its own paragraph.

Title (single line, short and to the point)

    log: "--as-filter" option adjusts how "--since" cut-off works

Intro (observation of the current state)

    The "--since=<time>" option of "git log" limits the commits
    displayed by the command by stopping the traversal once it
    sees a commit whose timestamp is older than the given time and
    not digging further into its parents.

Problem description (pros and cons of the current state)

    This is OK in a history where a commit always has a newer
    timestamp than any of its parents'.  Once you see a commit older
    than the given <time>, all ancestor commits of it are even older
    than the time anyway.  It poses, however, a problem when there
    is a commit with a wrong timestamp that makes it appear older
    than its parents.  Stopping traversal at the "incorrectly old"
    commit will hide its ancestors that are newer than that wrong
    commit and are newer than the cut-off time given with the --since
    option.  --max-age and --after being the synonyms to --since,
    they share the same issue.

Solution (give orders to the codebase to "be like so")

    Add a new "--as-filter" option that modifies how "--since=<time>"
    is used.  Instead of stopping the traversal to hide an old
    enough commit and its all ancestors, exclude commits with an old
    timestamp from the output but still keep digging the history.

Other comments (caveats, etc.)

    Without other traversal stopping options, this will force the
    command in "git log" family to dig down the history to the root.
    It may be an acceptable cost for a small project with short
    history and many commits with screwy timestamps.

> diff --git a/revision.c b/revision.c
> index 7d435f8048..ff018c3976 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
>  		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
>  		    !revs->line_level_traverse)
>  			continue;
> +		if (revs->max_age != -1 && revs->as_filter && (commit->date < revs->max_age) &&
> +		    !revs->line_level_traverse)

That's an overly long line.

> +			continue;

In any case, isn't this too late in limit_list() to adjust --since?
There is this logic earlier in the loop:

	while (original_list) {
		struct commit *commit = pop_commit(&original_list);
		struct object *obj = &commit->object;
		show_early_output_fn_t show;

		if (commit == interesting_cache)
			interesting_cache = NULL;

		if (revs->max_age != -1 && (commit->date < revs->max_age))
			obj->flags |= UNINTERESTING;
		if (process_parents(revs, commit, &original_list, NULL) < 0)
			return -1;

We look at max_age and if it is set and newer than the timestamp of
the commit we are looking at, we immediately mark that commit
uninteresting so that this and all its ancestors are excluded from
the output.  Don't you want to disable that logic so that the
traversal continues?

> @@ -3862,6 +3868,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
>  	if (revs->min_age != -1 &&
>  	    comparison_date(revs, commit) > revs->min_age)
>  			return commit_ignore;
> +	if (revs->max_age != -1 && revs->as_filter &&
> +	    comparison_date(revs, commit) < revs->max_age)
> +			return commit_ignore;
>  	if (revs->min_parents || (revs->max_parents >= 0)) {
>  		int n = commit_list_count(commit->parents);
>  		if ((n < revs->min_parents) ||

I am not sure how --since should affect (or not affect) the history
simplification, but my gut feeling says this may cause unintended
fallout.  I do not have time to dig deeper today, but something to
keep in mind...

> diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
> new file mode 100755
> index 0000000000..2a3705c714
> --- /dev/null
> +++ b/t/t4217-log-limit.sh
> @@ -0,0 +1,36 @@
> +#!/bin/sh
> +
> +test_description='git log with filter options limiting the output'

OK.

> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

Does any of the tests in this file care?  The above is a mechanism
primarily for transitioning scripts that were written long time ago,
and newly written tests that do not have to care about the initial
branch should not need it (they can instead explicitly create a
branch and work on it, if they really need to refer the initial
branch by name, but as far as I can tell, none your test in this
file even needs to refer to any branch with any name).

> +
> +. ./test-lib.sh
> +
> +GIT_TEST_COMMIT_GRAPH=0
> +GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0

Do these matter, and should these matter?  Does anybody export them
to make any difference in this test?

> +test_expect_success 'setup test' '
> +	git init &&
> +	echo a > file &&

Style:

	echo a >file &&

(cf. Documentation/CodingGuidelines)

> +	git add file &&
> +	GIT_COMMITTER_DATE="2021-02-01 0:00" git commit -m init &&

It is somewhat annoying to see 00:00 spelled as 0:00 here (and below).

> +	echo a >> file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2022-02-01 0:00" git commit -m first &&
> +	echo a >> file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2021-03-01 0:00" git commit -m second &&
> +	echo a >> file &&
> +	git add file &&
> +	GIT_COMMITTER_DATE="2022-03-01 0:00" git commit -m third
> +'
> +
> +test_expect_success 'git log --since=... --as-filter' '
> +	git log --since="2022-01-01" --as-filter --pretty="format:%s" > actual &&

I think you meant --format=%s (which is --pretty="tformat:%s"), as
you do not want "actual" to end in an incomplete line.  The same
"no space between the redirection operator and its target" applies
here as everywhere else.

> +	! test_i18ngrep init actual &&
> +	test_i18ngrep first actual &&
> +	! test_i18ngrep second actual &&
> +	test_i18ngrep third actual

test_i18ngrep -> grep

But stepping back a bit, do we or do we not care the order in which
first and third appear in the "actual" file?  It's not like we have
a mergy history and two commits that cannot topologically be
compared.  We have a simple linear single strand of pearls here, and
if you start traversing from the tip, you'll reliably see third
first, then second, then first and then finally init, in that order
and in no other order.  So, wouldn't we be better off writing it
more like ...

	git log --since=2022-01-01 --as-filter --format=%s >actual &&
	cat >expect <<-\EOF &&
	third
	first
	EOF
	test_cmp expect actual

... i.e. explicitly spell out what we expect not to change?

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v5] log: "--as-filter" option adjusts how "--since" cut-off works
  2022-04-15 23:13             ` Junio C Hamano
@ 2022-04-16 14:23               ` Miklos Vajna
  2022-04-22  6:50                 ` Miklos Vajna
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Vajna @ 2022-04-16 14:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

The "--since=<time>" option of "git log" limits the commits displayed by
the command by stopping the traversal once it sees a commit whose
timestamp is older than the given time and not digging further into its
parents.

This is OK in a history where a commit always has a newer timestamp than
any of its parents'.  Once you see a commit older than the given <time>,
all ancestor commits of it are even older than the time anyway.  It
poses, however, a problem when there is a commit with a wrong timestamp
that makes it appear older than its parents.  Stopping traversal at the
"incorrectly old" commit will hide its ancestors that are newer than
that wrong commit and are newer than the cut-off time given with the
--since option.  --max-age and --after being the synonyms to --since,
they share the same issue.

Add a new "--as-filter" option that modifies how "--since=<time>" is
used.  Instead of stopping the traversal to hide an old enough commit
and its all ancestors, exclude commits with an old timestamp from the
output but still keep digging the history.

Without other traversal stopping options, this will force the command in
"git log" family to dig down the history to the root.  It may be an
acceptable cost for a small project with short history and many commits
with screwy timestamps.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi Junio,

On Fri, Apr 15, 2022 at 04:13:02PM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> A suggested rewrite.

Thanks a lot, I've updated the commit message to match this.

> > +		if (revs->max_age != -1 && revs->as_filter && (commit->date < revs->max_age) &&
> > +		    !revs->line_level_traverse)
> 
> That's an overly long line.

Fixed.

> > +			continue;
> 
> In any case, isn't this too late in limit_list() to adjust --since?
> There is this logic earlier in the loop:
> 
> 	while (original_list) {
> 		struct commit *commit = pop_commit(&original_list);
> 		struct object *obj = &commit->object;
> 		show_early_output_fn_t show;
> 
> 		if (commit == interesting_cache)
> 			interesting_cache = NULL;
> 
> 		if (revs->max_age != -1 && (commit->date < revs->max_age))
> 			obj->flags |= UNINTERESTING;
> 		if (process_parents(revs, commit, &original_list, NULL) < 0)
> 			return -1;
> 
> We look at max_age and if it is set and newer than the timestamp of
> the commit we are looking at, we immediately mark that commit
> uninteresting so that this and all its ancestors are excluded from
> the output.  Don't you want to disable that logic so that the
> traversal continues?

Yes, a "and not as-filter" was missing there, I've added it now.

> 
> > @@ -3862,6 +3868,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
> >  	if (revs->min_age != -1 &&
> >  	    comparison_date(revs, commit) > revs->min_age)
> >  			return commit_ignore;
> > +	if (revs->max_age != -1 && revs->as_filter &&
> > +	    comparison_date(revs, commit) < revs->max_age)
> > +			return commit_ignore;
> >  	if (revs->min_parents || (revs->max_parents >= 0)) {
> >  		int n = commit_list_count(commit->parents);
> >  		if ((n < revs->min_parents) ||
> 
> I am not sure how --since should affect (or not affect) the history
> simplification, but my gut feeling says this may cause unintended
> fallout.  I do not have time to dig deeper today, but something to
> keep in mind...

Hm, this works exactly as --after (which is already filtering), this is the
actual place where this patch implements the filtering way of --since.  Based
on this, it feels safe to me. But perhaps I missed something.

> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> 
> Does any of the tests in this file care?  The above is a mechanism
> primarily for transitioning scripts that were written long time ago,
> and newly written tests that do not have to care about the initial
> branch should not need it (they can instead explicitly create a
> branch and work on it, if they really need to refer the initial
> branch by name, but as far as I can tell, none your test in this
> file even needs to refer to any branch with any name).

Indeed not needed, I now removed these.

> > +GIT_TEST_COMMIT_GRAPH=0
> > +GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
> 
> Do these matter, and should these matter?  Does anybody export them
> to make any difference in this test?

Aha, doesn't matter, removed.

> > +test_expect_success 'setup test' '
> > +	git init &&
> > +	echo a > file &&
> 
> Style:
> 
> 	echo a >file &&

Fixed.

> > +	GIT_COMMITTER_DATE="2021-02-01 0:00" git commit -m init &&
> 
> It is somewhat annoying to see 00:00 spelled as 0:00 here (and below).

Fixed.

> > +test_expect_success 'git log --since=... --as-filter' '
> > +	git log --since="2022-01-01" --as-filter --pretty="format:%s" > actual &&
> 
> I think you meant --format=%s (which is --pretty="tformat:%s"), as
> you do not want "actual" to end in an incomplete line.  The same
> "no space between the redirection operator and its target" applies
> here as everywhere else.

Indeed, fixed.

> 
> > +	! test_i18ngrep init actual &&
> > +	test_i18ngrep first actual &&
> > +	! test_i18ngrep second actual &&
> > +	test_i18ngrep third actual
> 
> test_i18ngrep -> grep
> 
> But stepping back a bit, do we or do we not care the order in which
> first and third appear in the "actual" file?  It's not like we have
> a mergy history and two commits that cannot topologically be
> compared.  We have a simple linear single strand of pearls here, and
> if you start traversing from the tip, you'll reliably see third
> first, then second, then first and then finally init, in that order
> and in no other order.  So, wouldn't we be better off writing it
> more like ...
> 
> 	git log --since=2022-01-01 --as-filter --format=%s >actual &&
> 	cat >expect <<-\EOF &&
> 	third
> 	first
> 	EOF
> 	test_cmp expect actual
> 
> ... i.e. explicitly spell out what we expect not to change?

Aha, I reworked the test to avoid the grep here.

Regards,

Miklos

 Documentation/rev-list-options.txt |  6 ++++++
 revision.c                         | 16 +++++++++++---
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++
 t/t4218-blame-limit.sh             | 34 ++++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+), 3 deletions(-)
 create mode 100755 t/t4217-log-limit.sh
 create mode 100755 t/t4218-blame-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..354bd29f10 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -350,6 +350,12 @@ The following options select the commits to be shown:
 <paths>::
 	Commits modifying the given <paths> are selected.
 
+--as-filter::
+	When combined with `--max-age=<date>`, `--since=<date>` or
+	`--after=<date>`, show all commits more recent than a specific date. This
+	visits all commits in the range, rather than stopping at the first commit
+	which is older than a specific date.
+
 --simplify-by-decoration::
 	Commits that are referred by some branch or tag are selected.
 
diff --git a/revision.c b/revision.c
index 7d435f8048..f9970bd8ac 100644
--- a/revision.c
+++ b/revision.c
@@ -1426,7 +1426,8 @@ static int limit_list(struct rev_info *revs)
 		if (commit == interesting_cache)
 			interesting_cache = NULL;
 
-		if (revs->max_age != -1 && (commit->date < revs->max_age))
+		if (revs->max_age != -1 && !revs->as_filter &&
+			(commit->date < revs->max_age))
 			obj->flags |= UNINTERESTING;
 		if (process_parents(revs, commit, &original_list, NULL) < 0)
 			return -1;
@@ -1440,6 +1441,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age != -1 && revs->as_filter &&
+			(commit->date < revs->max_age) && !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1842,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->as_filter = 0;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2223,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if (!strcmp(arg, "--as-filter")) {
+		revs->as_filter = 1;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3365,7 +3372,7 @@ static void explore_walk_step(struct rev_info *revs)
 	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
 		record_author_date(&info->author_date, c);
 
-	if (revs->max_age != -1 && (c->date < revs->max_age))
+	if (revs->max_age != -1 && !revs->as_filter && (c->date < revs->max_age))
 		c->object.flags |= UNINTERESTING;
 
 	if (process_parents(revs, c, NULL, NULL) < 0)
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age != -1 && revs->as_filter &&
+	    comparison_date(revs, commit) < revs->max_age)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
@@ -4019,7 +4029,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		 * that we'd otherwise have done in limit_list().
 		 */
 		if (!revs->limited) {
-			if (revs->max_age != -1 &&
+			if (revs->max_age != -1 && !revs->as_filter &&
 			    comparison_date(revs, commit) < revs->max_age)
 				continue;
 
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..fe37ebd83d 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	int as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..41df7f0f0f
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+
+. ./test-lib.sh
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a >file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-02-01 00:00" git commit -m init &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 00:00" git commit -m first &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-03-01 00:00" git commit -m second &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 00:00" git commit -m third
+'
+
+test_expect_success 'git log --since=... --as-filter' '
+	git log --since="2022-01-01" --as-filter --format=%s >actual &&
+	cat >expect <<-\EOF &&
+	third
+	first
+	EOF
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t4218-blame-limit.sh b/t/t4218-blame-limit.sh
new file mode 100755
index 0000000000..bef7cf7d48
--- /dev/null
+++ b/t/t4218-blame-limit.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='git blame with filter options limiting the output'
+
+. ./test-lib.sh
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a >file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-01-01 00:00" GIT_COMMITTER_DATE="2020-01-01 00:00" git commit -m init &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-02-01 00:00" GIT_COMMITTER_DATE="2020-02-01 00:00" git commit -m first &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-03-01 00:00" GIT_COMMITTER_DATE="2020-03-01 00:00" git commit -m second &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-04-01 00:00" GIT_COMMITTER_DATE="2020-04-01 00:00" git commit -m third
+'
+
+test_expect_success 'git blame --since=...' '
+	git blame --since="2020-02-15" file >actual &&
+	cat >expect <<-\EOF &&
+	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 1) a
+	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 2) a
+	33fc0d13 (A U Thor 2020-03-01 00:00:00 +0000 3) a
+	ec76e003 (A U Thor 2020-04-01 00:00:00 +0000 4) a
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v5] log: "--as-filter" option adjusts how "--since" cut-off works
  2022-04-16 14:23               ` [PATCH v5] log: "--as-filter" option adjusts how "--since" cut-off works Miklos Vajna
@ 2022-04-22  6:50                 ` Miklos Vajna
  0 siblings, 0 replies; 24+ messages in thread
From: Miklos Vajna @ 2022-04-22  6:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

Hi,

On Sat, Apr 16, 2022 at 04:23:14PM +0200, Miklos Vajna <vmiklos@vmiklos.hu> wrote:
> Thanks a lot, I've updated the commit message to match this.

Is there anything else I should tweak in v5 to get this into a topic
branch?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: git log --since to not stop after first old commit?
  2022-04-11 16:58           ` Junio C Hamano
@ 2022-04-22 18:48             ` Junio C Hamano
  2022-04-22 20:01               ` [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant Miklos Vajna
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2022-04-22 18:48 UTC (permalink / raw)
  To: Miklos Vajna, demerphq; +Cc: Git

Junio C Hamano <gitster@pobox.com> writes:

>> When you do have the cycles perhaps it is worth considering whether
>> splitting it up, so that --as-filter is a modifier for traversal stoppers,
>> would avoid the problem of proliferating options.   Eg, instead of saying
>> --since-as-filter you would say --since ... --as-filter. That way the
>> stoppers where "filter like behavior" made sense could just check if the
>> --as-filter flag was set.
>
> Yes, that has exactly the opposite problem I wanted to warn us about
> by sending an extra message (to which you are reponding to).  If we
> have (or can have) very many traversal stopping option, it might
> make sense to have --as-filter as a modifier and avoid doubling the
> number of options, but if we only have very few (and fundamentally
> cannot have more than very few), then giving each of these very few
> --X its own --X-as-filter variant would probably make more sense.
> Because end users would probably not know which ones are inherently
> filters and will not be affected with --as-filter modifier, it would
> help them understand if we give them independent --since-as-filter
> option and document it separately, if there aren't many of them.
>
> Besides, if we had very few but still multiple of them, --X and
> --Y-as-filter can be combined to say "X stops as before, but Y is
> applied as filter", which is strictly more expressive than a
> separate --as-filter modifier.
>
> So that is why I threw out the message for those interested in the
> topic to first think about.  I know we agree that --since may be a
> good candidate to have these two flavours of behaviour.  I do not
> think anybody carefully thought about existing options to see if
> there are many like --since that want two flavours, let alone
> possible options we have said in the past that we may want to have
> but not yet added.

Now I had some time to think about it, I have a feeling that it is
quite unlikely for us to add traversal stopper other than since, so
having a separate "--as-filter" would probably be more confusing
than adding "--since-as-filter", stressing on "only the 'show
commits with timestamp after this one' has two variants".

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant
  2022-04-22 18:48             ` Junio C Hamano
@ 2022-04-22 20:01               ` Miklos Vajna
  2022-04-22 22:11                 ` Junio C Hamano
  2022-04-22 23:43                 ` Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Miklos Vajna @ 2022-04-22 20:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: demerphq, Git

The "--since=<time>" option of "git log" limits the commits displayed by
the command by stopping the traversal once it sees a commit whose
timestamp is older than the given time and not digging further into its
parents.

This is OK in a history where a commit always has a newer timestamp than
any of its parents'.  Once you see a commit older than the given <time>,
all ancestor commits of it are even older than the time anyway.  It
poses, however, a problem when there is a commit with a wrong timestamp
that makes it appear older than its parents.  Stopping traversal at the
"incorrectly old" commit will hide its ancestors that are newer than
that wrong commit and are newer than the cut-off time given with the
--since option.  --max-age and --after being the synonyms to --since,
they share the same issue.

Add a new "--since-as-filter" option that is a variant of
"--since=<time>".  Instead of stopping the traversal to hide an old
enough commit and its all ancestors, exclude commits with an old
timestamp from the output but still keep digging the history.

Without other traversal stopping options, this will force the command in
"git log" family to dig down the history to the root.  It may be an
acceptable cost for a small project with short history and many commits
with screwy timestamps.

It is quite unlikely for us to add traversal stopper other than since,
so have this as a --since-as-filter option, rather than a separate
--as-filter, that would be probably more confusing.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi Junio,

On Fri, Apr 22, 2022 at 11:48:45AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> Now I had some time to think about it, I have a feeling that it is
> quite unlikely for us to add traversal stopper other than since, so
> having a separate "--as-filter" would probably be more confusing
> than adding "--since-as-filter", stressing on "only the 'show
> commits with timestamp after this one' has two variants".

I'm fine with this approach, it goes back to the initial version. I 
rather reworked v5 in practice, to keep the other improvements.

Here is a patch that does this.

Regards,

Miklos

 Documentation/rev-list-options.txt |  5 +++++
 revision.c                         | 10 +++++++++
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 32 ++++++++++++++++++++++++++++
 t/t4218-blame-limit.sh             | 34 ++++++++++++++++++++++++++++++
 5 files changed, 82 insertions(+)
 create mode 100755 t/t4217-log-limit.sh
 create mode 100755 t/t4218-blame-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..195e74eec6 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
 --after=<date>::
 	Show commits more recent than a specific date.
 
+--since-as-filter=<date>::
+	Show all commits more recent than a specific date. This visits
+	all commits in the range, rather than stopping at the first commit which
+	is older than a specific date.
+
 --until=<date>::
 --before=<date>::
 	Show commits older than a specific date.
diff --git a/revision.c b/revision.c
index 7d435f8048..ee933a11c7 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age_as_filter != -1 &&
+			(commit->date < revs->max_age) && !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->max_age_as_filter = -1;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("since-as-filter", argv, &optarg))) {
+		revs->max_age_as_filter = approxidate(optarg);
+		return argcount;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age_as_filter != -1 &&
+	    comparison_date(revs, commit) < revs->max_age_as_filter)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..e80c148b19 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	timestamp_t max_age_as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..587cd0e386
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+
+. ./test-lib.sh
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a >file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-02-01 00:00" git commit -m init &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 00:00" git commit -m first &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-03-01 00:00" git commit -m second &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 00:00" git commit -m third
+'
+
+test_expect_success 'git log --since-as-filter=...' '
+	git log --since-as-filter="2022-01-01" --format=%s >actual &&
+	cat >expect <<-\EOF &&
+	third
+	first
+	EOF
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t4218-blame-limit.sh b/t/t4218-blame-limit.sh
new file mode 100755
index 0000000000..bef7cf7d48
--- /dev/null
+++ b/t/t4218-blame-limit.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='git blame with filter options limiting the output'
+
+. ./test-lib.sh
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a >file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-01-01 00:00" GIT_COMMITTER_DATE="2020-01-01 00:00" git commit -m init &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-02-01 00:00" GIT_COMMITTER_DATE="2020-02-01 00:00" git commit -m first &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-03-01 00:00" GIT_COMMITTER_DATE="2020-03-01 00:00" git commit -m second &&
+	echo a >>file &&
+	git add file &&
+	GIT_AUTHOR_DATE="2020-04-01 00:00" GIT_COMMITTER_DATE="2020-04-01 00:00" git commit -m third
+'
+
+test_expect_success 'git blame --since=...' '
+	git blame --since="2020-02-15" file >actual &&
+	cat >expect <<-\EOF &&
+	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 1) a
+	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 2) a
+	33fc0d13 (A U Thor 2020-03-01 00:00:00 +0000 3) a
+	ec76e003 (A U Thor 2020-04-01 00:00:00 +0000 4) a
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant
  2022-04-22 20:01               ` [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant Miklos Vajna
@ 2022-04-22 22:11                 ` Junio C Hamano
  2022-04-22 23:43                 ` Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2022-04-22 22:11 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: demerphq, Git

Miklos Vajna <vmiklos@vmiklos.hu> writes:

> The "--since=<time>" option of "git log" limits the commits displayed by
> the command by stopping the traversal once it sees a commit whose
> timestamp is older than the given time and not digging further into its
> parents.
>
> This is OK in a history where a commit always has a newer timestamp than
> any of its parents'.  Once you see a commit older than the given <time>,
> all ancestor commits of it are even older than the time anyway.  It
> poses, however, a problem when there is a commit with a wrong timestamp
> that makes it appear older than its parents.  Stopping traversal at the
> "incorrectly old" commit will hide its ancestors that are newer than
> that wrong commit and are newer than the cut-off time given with the
> --since option.  --max-age and --after being the synonyms to --since,
> they share the same issue.
>
> Add a new "--since-as-filter" option that is a variant of
> "--since=<time>".  Instead of stopping the traversal to hide an old
> enough commit and its all ancestors, exclude commits with an old
> timestamp from the output but still keep digging the history.
>
> Without other traversal stopping options, this will force the command in
> "git log" family to dig down the history to the root.  It may be an
> acceptable cost for a small project with short history and many commits
> with screwy timestamps.
>
> It is quite unlikely for us to add traversal stopper other than since,
> so have this as a --since-as-filter option, rather than a separate
> --as-filter, that would be probably more confusing.
>
> Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
> ---
>
> Hi Junio,
>
> On Fri, Apr 22, 2022 at 11:48:45AM -0700, Junio C Hamano <gitster@pobox.com> wrote:
>> Now I had some time to think about it, I have a feeling that it is
>> quite unlikely for us to add traversal stopper other than since, so
>> having a separate "--as-filter" would probably be more confusing
>> than adding "--since-as-filter", stressing on "only the 'show
>> commits with timestamp after this one' has two variants".
>
> I'm fine with this approach, it goes back to the initial version. I 
> rather reworked v5 in practice, to keep the other improvements.
>
> Here is a patch that does this.
>
> Regards,
>
> Miklos

Thanks.  Now 2.36 release is behind us, let's queue this in 'seen'
and after getting reviewed by somebody else---I do not trust
anything looked at by me and nobody else---merge it down, aiming for
graduation during this cycle.

> +--since-as-filter=<date>::
> +	Show all commits more recent than a specific date. This visits
> +	all commits in the range, rather than stopping at the first commit which
> +	is older than a specific date.

> diff --git a/revision.c b/revision.c
> index 7d435f8048..ee933a11c7 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
>  		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
>  		    !revs->line_level_traverse)
>  			continue;

Much ealrier than this part in the same loop there is

		if (revs->max_age != -1 && (commit->date < revs->max_age))
			obj->flags |= UNINTERESTING;
		if (process_parents(revs, commit, &original_list, NULL) < 0)
			return -1;

which taints the commit we are looking at as UNINTERESTING, cutting
the traversal down to its parents, if its timestamp is older than
max_age, and it would need to be taught not to do that, no?

Ah, OK, max_age_as_filter is *NOT* a boolean (false is -1 and true
is something else) but is an independent timestamp and the
expectation is that max_age is left to be -1 when --since is not in
use.  --since-as-filter's timestamp is stored in max_age_as_filter
so the earlier "compare with max_age and use UNINTERESTING bit to
stop traversal" code does not have to be modified.

> +		if (revs->max_age_as_filter != -1 &&
> +			(commit->date < revs->max_age) && !revs->line_level_traverse)
> +			continue;
>  		date = commit->date;
>  		p = &commit_list_insert(commit, p)->next;

And if that is the assumption of this code, shouldn't this part be
using "if max_age is set (i.e. not -1) and the commit is older than
that and we are not doing -La,b traversal"?  "--since-as-filter"
being used does not mean max_age must be set to a value that can be
compared to timestamps in commits in the world order of this version
of the patch, right?

> @@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
>  	revs->dense = 1;
>  	revs->prefix = prefix;
>  	revs->max_age = -1;
> +	revs->max_age_as_filter = -1;
>  	revs->min_age = -1;
>  	revs->skip_count = -1;
>  	revs->max_count = -1;
> @@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
>  		revs->max_age = approxidate(optarg);
>  		return argcount;
> +	} else if ((argcount = parse_long_opt("since-as-filter", argv, &optarg))) {
> +		revs->max_age_as_filter = approxidate(optarg);
> +		return argcount;

OK, so max_age_as_filter is a timestamp to be compared with commits
when --since-as-filter option is given.

> @@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
>  	if (revs->min_age != -1 &&
>  	    comparison_date(revs, commit) > revs->min_age)
>  			return commit_ignore;
> +	if (revs->max_age_as_filter != -1 &&
> +	    comparison_date(revs, commit) < revs->max_age_as_filter)
> +			return commit_ignore;

This one does make sense.

> diff --git a/revision.h b/revision.h
> index 5bc59c7bfe..e80c148b19 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -263,6 +263,7 @@ struct rev_info {
>  	int skip_count;
>  	int max_count;
>  	timestamp_t max_age;
> +	timestamp_t max_age_as_filter;
>  	timestamp_t min_age;

So does this.

Everything except that "why are we checking if --since-as-filter is
set and then comparing with the value came from --since?" part looks
great to me.  If that indeed is a bug (it is very possible that I am
misreading the logic and the comparison with continue is perfectly
correct), and if the tests added by this patch didn't catch it, then
the test script may need a bit more work to catch such a mistake.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant
  2022-04-22 20:01               ` [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant Miklos Vajna
  2022-04-22 22:11                 ` Junio C Hamano
@ 2022-04-22 23:43                 ` Junio C Hamano
  2022-04-23 12:59                   ` [PATCH v7] " Miklos Vajna
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2022-04-22 23:43 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: demerphq, Git

Miklos Vajna <vmiklos@vmiklos.hu> writes:

> +test_expect_success 'git blame --since=...' '
> +	git blame --since="2020-02-15" file >actual &&
> +	cat >expect <<-\EOF &&
> +	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 1) a
> +	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 2) a
> +	33fc0d13 (A U Thor 2020-03-01 00:00:00 +0000 3) a
> +	ec76e003 (A U Thor 2020-04-01 00:00:00 +0000 4) a
> +	EOF
> +	test_cmp expect actual
> +'

Hardcoding the object names like this does not pass our test suite.
These abbreviated object names hardcode the use of SHA-1, but the
code is tested in repositories that use SHA-256 as well.

As you are creating four commits with distinct timestamps, I think
you can simply filter out the object name part for comparison,
perhaps like:

redact_blame_output () {
	sed -e 's/\([^]*\)\([0-9a-f]*\) /\1HASH /'
}

test_expect_success 'git blame --since=...' '
	git blame --since=2020-02-15 file >raw &&
	redact_blame_output <raw >actual &&
	redact_blame_output <<-\EOF &&
	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 1) a
	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 2) a
	33fc0d13 (A U Thor 2020-03-01 00:00:00 +0000 3) a
	ec76e003 (A U Thor 2020-04-01 00:00:00 +0000 4) a
	EOF
	test_cmp expect actual
'

But did you really mean to test how --since works with blame?  Given
that there does not seem to be any clock skew in the history being
tested, I am wondering if this new test file should even be a part
of the topic.

Thanks.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v7] log: "--since-as-filter" option is a non-terminating "--since" variant
  2022-04-22 23:43                 ` Junio C Hamano
@ 2022-04-23 12:59                   ` Miklos Vajna
  0 siblings, 0 replies; 24+ messages in thread
From: Miklos Vajna @ 2022-04-23 12:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: demerphq, Git, Ævar Arnfjörð Bjarmason

The "--since=<time>" option of "git log" limits the commits displayed by
the command by stopping the traversal once it sees a commit whose
timestamp is older than the given time and not digging further into its
parents.

This is OK in a history where a commit always has a newer timestamp than
any of its parents'.  Once you see a commit older than the given <time>,
all ancestor commits of it are even older than the time anyway.  It
poses, however, a problem when there is a commit with a wrong timestamp
that makes it appear older than its parents.  Stopping traversal at the
"incorrectly old" commit will hide its ancestors that are newer than
that wrong commit and are newer than the cut-off time given with the
--since option.  --max-age and --after being the synonyms to --since,
they share the same issue.

Add a new "--since-as-filter" option that is a variant of
"--since=<time>".  Instead of stopping the traversal to hide an old
enough commit and its all ancestors, exclude commits with an old
timestamp from the output but still keep digging the history.

Without other traversal stopping options, this will force the command in
"git log" family to dig down the history to the root.  It may be an
acceptable cost for a small project with short history and many commits
with screwy timestamps.

It is quite unlikely for us to add traversal stopper other than since,
so have this as a --since-as-filter option, rather than a separate
--as-filter, that would be probably more confusing.

Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
---

Hi Junio,

On Fri, Apr 22, 2022 at 03:11:11PM -0700, Junio C Hamano <gitster@pobox.com> wrote:
> Thanks.  Now 2.36 release is behind us, let's queue this in 'seen'
> and after getting reviewed by somebody else---I do not trust
> anything looked at by me and nobody else---merge it down, aiming for
> graduation during this cycle.

Ævar: you had opinion in this thread -- are you interested in reviewing this patch?

> Much ealrier than this part in the same loop there is
> 
> 		if (revs->max_age != -1 && (commit->date < revs->max_age))
> 			obj->flags |= UNINTERESTING;
> 		if (process_parents(revs, commit, &original_list, NULL) < 0)
> 			return -1;
> 
> which taints the commit we are looking at as UNINTERESTING, cutting
> the traversal down to its parents, if its timestamp is older than
> max_age, and it would need to be taught not to do that, no?
> 
> Ah, OK, max_age_as_filter is *NOT* a boolean (false is -1 and true
> is something else) but is an independent timestamp and the
> expectation is that max_age is left to be -1 when --since is not in
> use.  --since-as-filter's timestamp is stored in max_age_as_filter
> so the earlier "compare with max_age and use UNINTERESTING bit to
> stop traversal" code does not have to be modified.

Yes, max_age is the terminating timestamp and max_age_as_filter is the
filtering timestamp.

> > +		if (revs->max_age_as_filter != -1 &&
> > +			(commit->date < revs->max_age) && !revs->line_level_traverse)
> > +			continue;
> >  		date = commit->date;
> >  		p = &commit_list_insert(commit, p)->next;
> 
> And if that is the assumption of this code, shouldn't this part be
> using "if max_age is set (i.e. not -1) and the commit is older than
> that and we are not doing -La,b traversal"?  "--since-as-filter"
> being used does not mean max_age must be set to a value that can be
> compared to timestamps in commits in the world order of this version
> of the patch, right?

Good catch, commit->date < revs->max_age is meant to be commit->date <
revs->max_age_as_filter there, fixed.

> Everything except that "why are we checking if --since-as-filter is
> set and then comparing with the value came from --since?" part looks
> great to me.  If that indeed is a bug (it is very possible that I am
> misreading the logic and the comparison with continue is perfectly
> correct), and if the tests added by this patch didn't catch it, then
> the test script may need a bit more work to catch such a mistake.

Yes, it was a bug. For example git log --children --since-as-filter=...
gave empty output due to this. I now added a test for this.

> Hardcoding the object names like this does not pass our test suite.
> These abbreviated object names hardcode the use of SHA-1, but the
> code is tested in repositories that use SHA-256 as well.
> 
> As you are creating four commits with distinct timestamps, I think
> you can simply filter out the object name part for comparison,
> perhaps like:
> 
> redact_blame_output () {
> 	sed -e 's/\([^]*\)\([0-9a-f]*\) /\1HASH /'
> }
> 
> test_expect_success 'git blame --since=...' '
> 	git blame --since=2020-02-15 file >raw &&
> 	redact_blame_output <raw >actual &&
> 	redact_blame_output <<-\EOF &&
> 	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 1) a
> 	^c7bc5ce (A U Thor 2020-02-01 00:00:00 +0000 2) a
> 	33fc0d13 (A U Thor 2020-03-01 00:00:00 +0000 3) a
> 	ec76e003 (A U Thor 2020-04-01 00:00:00 +0000 4) a
> 	EOF
> 	test_cmp expect actual
> '
> 
> But did you really mean to test how --since works with blame?  Given
> that there does not seem to be any clock skew in the history being
> tested, I am wondering if this new test file should even be a part
> of the topic.

I dropped t/t4218-blame-limit.sh from this topic. The idea was to
increase coverage for git blame --since=..., as it seems to have no
test. But we can get back to that in a separate topic, I agree to keep
this topic scoped.

Thanks,

Miklos

 Documentation/rev-list-options.txt |  5 ++++
 revision.c                         | 10 ++++++++
 revision.h                         |  1 +
 t/t4217-log-limit.sh               | 41 ++++++++++++++++++++++++++++++
 4 files changed, 57 insertions(+)
 create mode 100755 t/t4217-log-limit.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index fd4f4e26c9..195e74eec6 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -25,6 +25,11 @@ ordering and formatting options, such as `--reverse`.
 --after=<date>::
 	Show commits more recent than a specific date.
 
+--since-as-filter=<date>::
+	Show all commits more recent than a specific date. This visits
+	all commits in the range, rather than stopping at the first commit which
+	is older than a specific date.
+
 --until=<date>::
 --before=<date>::
 	Show commits older than a specific date.
diff --git a/revision.c b/revision.c
index 7d435f8048..c367273c00 100644
--- a/revision.c
+++ b/revision.c
@@ -1440,6 +1440,9 @@ static int limit_list(struct rev_info *revs)
 		if (revs->min_age != -1 && (commit->date > revs->min_age) &&
 		    !revs->line_level_traverse)
 			continue;
+		if (revs->max_age_as_filter != -1 &&
+			(commit->date < revs->max_age_as_filter) && !revs->line_level_traverse)
+			continue;
 		date = commit->date;
 		p = &commit_list_insert(commit, p)->next;
 
@@ -1838,6 +1841,7 @@ void repo_init_revisions(struct repository *r,
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
+	revs->max_age_as_filter = -1;
 	revs->min_age = -1;
 	revs->skip_count = -1;
 	revs->max_count = -1;
@@ -2218,6 +2222,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("since", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("since-as-filter", argv, &optarg))) {
+		revs->max_age_as_filter = approxidate(optarg);
+		return argcount;
 	} else if ((argcount = parse_long_opt("after", argv, &optarg))) {
 		revs->max_age = approxidate(optarg);
 		return argcount;
@@ -3862,6 +3869,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 	if (revs->min_age != -1 &&
 	    comparison_date(revs, commit) > revs->min_age)
 			return commit_ignore;
+	if (revs->max_age_as_filter != -1 &&
+	    comparison_date(revs, commit) < revs->max_age_as_filter)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
diff --git a/revision.h b/revision.h
index 5bc59c7bfe..e80c148b19 100644
--- a/revision.h
+++ b/revision.h
@@ -263,6 +263,7 @@ struct rev_info {
 	int skip_count;
 	int max_count;
 	timestamp_t max_age;
+	timestamp_t max_age_as_filter;
 	timestamp_t min_age;
 	int min_parents;
 	int max_parents;
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
new file mode 100755
index 0000000000..6e01e2629c
--- /dev/null
+++ b/t/t4217-log-limit.sh
@@ -0,0 +1,41 @@
+#!/bin/sh
+
+test_description='git log with filter options limiting the output'
+
+. ./test-lib.sh
+
+test_expect_success 'setup test' '
+	git init &&
+	echo a >file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-02-01 00:00" git commit -m init &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-02-01 00:00" git commit -m first &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2021-03-01 00:00" git commit -m second &&
+	echo a >>file &&
+	git add file &&
+	GIT_COMMITTER_DATE="2022-03-01 00:00" git commit -m third
+'
+
+test_expect_success 'git log --since-as-filter=...' '
+	git log --since-as-filter="2022-01-01" --format=%s >actual &&
+	cat >expect <<-\EOF &&
+	third
+	first
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'git log --children --since-as-filter=...' '
+	git log --children --since-as-filter="2022-01-01" --format=%s >actual &&
+	cat >expect <<-\EOF &&
+	third
+	first
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-04-23 13:00 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-01  8:21 git log --since to not stop after first old commit? Miklos Vajna
2022-04-01  9:57 ` Ævar Arnfjörð Bjarmason
2022-04-01 10:14   ` Miklos Vajna
2022-04-01 13:51     ` Ævar Arnfjörð Bjarmason
2022-04-01 17:51 ` Junio C Hamano
2022-04-01 21:36   ` [PATCH] git-log: add a --since-as-filter option Miklos Vajna
2022-04-02 10:09     ` [PATCH v2] " Miklos Vajna
2022-04-07 15:43   ` git log --since to not stop after first old commit? Miklos Vajna
2022-04-08  2:30     ` Junio C Hamano
2022-04-08 18:19       ` Junio C Hamano
     [not found]         ` <CANgJU+Wr+tKNPfeh4dst-E_LSnoYYmN1easqmkFUA9spp-rpKQ@mail.gmail.com>
2022-04-11  6:37           ` Miklos Vajna
2022-04-11  9:18             ` demerphq
2022-04-11 16:58           ` Junio C Hamano
2022-04-22 18:48             ` Junio C Hamano
2022-04-22 20:01               ` [PATCH v6] log: "--since-as-filter" option is a non-terminating "--since" variant Miklos Vajna
2022-04-22 22:11                 ` Junio C Hamano
2022-04-22 23:43                 ` Junio C Hamano
2022-04-23 12:59                   ` [PATCH v7] " Miklos Vajna
2022-04-08 21:01       ` [PATCH v3] git-log: add a --since=... --as-filter option Miklos Vajna
2022-04-12  8:47         ` Ævar Arnfjörð Bjarmason
2022-04-15 20:39           ` [PATCH v4] " Miklos Vajna
2022-04-15 23:13             ` Junio C Hamano
2022-04-16 14:23               ` [PATCH v5] log: "--as-filter" option adjusts how "--since" cut-off works Miklos Vajna
2022-04-22  6:50                 ` Miklos Vajna

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).