git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Mismatched HEAD default behavior from git log
@ 2020-08-25 19:16 Bryan Turner
  2020-08-25 19:40 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Bryan Turner @ 2020-08-25 19:16 UTC (permalink / raw)
  To: Git Users

When git log is run without revs, it defaults to HEAD. That's
well-documented. However, when paired with --ignore-missing, there's a
behavior mismatch if all the provided revs are nonexistent.

If you provide the revs on the command line, like "git log
--ignore-missing nonexistent --" (the trailing -- is just to help git
log know "nonexistent" isn't intended to be a path), you get no output
and no error (exit code is 0).

If you provide the revs via stdin, like "echo 'nonexistent' | git log
--ignore-missing --stdin --", you get every commit reachable from
HEAD.

It appears the way --stdin processes input discards nonexistent
commits before the machinery that decides whether you provided any
revs or not runs, and so if every --stdin rev is discarded then you
get the default HEAD. If you provide them via the command line,
though, then it seems like they're discarded later and you don't get a
default.

I'm not sure whether this is intentional or not (certainly I don't see
it anywhere in the git log documentation for --ignore-missing or
--stdin), but it results in a behavior mismatch that's impossible to
reconcile without requiring extra git processes. I can't always
provide HEAD since, if multiple revs are supplied, if any revs exist
then HEAD would not be included regardless of whether the revs were
supplied via the command line or --stdin.

I've confirmed this behavior as far back as Git 1.8.0 all the way up
to 2.28.0, so it's by no means new. I can't recall seeing it come up
on the list before, but that doesn't mean it hasn't. Is this a bug, or
a feature?

Best regards,
Bryan Turner

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Mismatched HEAD default behavior from git log
  2020-08-25 19:16 Mismatched HEAD default behavior from git log Bryan Turner
@ 2020-08-25 19:40 ` Junio C Hamano
  2020-08-25 19:46   ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-08-25 19:40 UTC (permalink / raw)
  To: Bryan Turner; +Cc: Git Users

Bryan Turner <bturner@atlassian.com> writes:

> It appears the way --stdin processes input discards nonexistent
> commits before the machinery that decides whether you provided any
> revs or not runs, and so if every --stdin rev is discarded then you
> get the default HEAD. If you provide them via the command line,
> though, then it seems like they're discarded later and you don't get a
> default.
>
> I'm not sure whether this is intentional or not (certainly I don't see
> it anywhere in the git log documentation for --ignore-missing or
> --stdin), but it results in a behavior mismatch that's impossible to
> reconcile without requiring extra git processes. I can't always
> provide HEAD since, if multiple revs are supplied, if any revs exist
> then HEAD would not be included regardless of whether the revs were
> supplied via the command line or --stdin.

As the intent for adding the "--stdin" option to any subcommand has
always been "we may need to feed many many things, that may bust the
command line length limit, hence we let you feed these things from
the standard input, but otherwise there should be no change in
behaviour or semantics", when the behaviour of command line and
"--stdin" differ, it is a bug in the latter.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Mismatched HEAD default behavior from git log
  2020-08-25 19:40 ` Junio C Hamano
@ 2020-08-25 19:46   ` Jeff King
  2020-08-25 19:51     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-08-25 19:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bryan Turner, Git Users

On Tue, Aug 25, 2020 at 12:40:42PM -0700, Junio C Hamano wrote:

> Bryan Turner <bturner@atlassian.com> writes:
> 
> > It appears the way --stdin processes input discards nonexistent
> > commits before the machinery that decides whether you provided any
> > revs or not runs, and so if every --stdin rev is discarded then you
> > get the default HEAD. If you provide them via the command line,
> > though, then it seems like they're discarded later and you don't get a
> > default.
> >
> > I'm not sure whether this is intentional or not (certainly I don't see
> > it anywhere in the git log documentation for --ignore-missing or
> > --stdin), but it results in a behavior mismatch that's impossible to
> > reconcile without requiring extra git processes. I can't always
> > provide HEAD since, if multiple revs are supplied, if any revs exist
> > then HEAD would not be included regardless of whether the revs were
> > supplied via the command line or --stdin.
> 
> As the intent for adding the "--stdin" option to any subcommand has
> always been "we may need to feed many many things, that may bust the
> command line length limit, hence we let you feed these things from
> the standard input, but otherwise there should be no change in
> behaviour or semantics", when the behaviour of command line and
> "--stdin" differ, it is a bug in the latter.

Agreed. It also helps in this case that the command-line behavior is
sensible and the --stdin one is not. :)

I think the solution is probably something like:

diff --git a/revision.c b/revision.c
index 96630e3186..f5bbefa091 100644
--- a/revision.c
+++ b/revision.c
@@ -2099,12 +2099,13 @@ static void read_pathspec_from_stdin(struct strbuf *sb,
 		strvec_push(prune, sb->buf);
 }
 
-static void read_revisions_from_stdin(struct rev_info *revs,
-				      struct strvec *prune)
+static int read_revisions_from_stdin(struct rev_info *revs,
+				     struct strvec *prune)
 {
 	struct strbuf sb;
 	int seen_dashdash = 0;
 	int save_warning;
+	int got_rev_arg = 0;
 
 	save_warning = warn_on_object_refname_ambiguity;
 	warn_on_object_refname_ambiguity = 0;
@@ -2124,12 +2125,14 @@ static void read_revisions_from_stdin(struct rev_info *revs,
 		if (handle_revision_arg(sb.buf, revs, 0,
 					REVARG_CANNOT_BE_FILENAME))
 			die("bad revision '%s'", sb.buf);
+		got_rev_arg = 1;
 	}
 	if (seen_dashdash)
 		read_pathspec_from_stdin(&sb, prune);
 
 	strbuf_release(&sb);
 	warn_on_object_refname_ambiguity = save_warning;
+	return got_rev_arg;
 }
 
 static void add_grep(struct rev_info *revs, const char *ptn, enum grep_pat_token what)
@@ -2754,7 +2757,8 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 				}
 				if (revs->read_from_stdin++)
 					die("--stdin given twice?");
-				read_revisions_from_stdin(revs, &prune_data);
+				if (read_revisions_from_stdin(revs, &prune_data))
+					got_rev_arg = 1;
 				continue;
 			}
 

Possibly it would make sense to push that flag into rev_info, though,
and let handle_revision_arg() set it. That would fix this bug and
prevent similar ones in other code paths (though we're not likely to get
revisions from anywhere else, I suppose).

-Peff

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Mismatched HEAD default behavior from git log
  2020-08-25 19:46   ` Jeff King
@ 2020-08-25 19:51     ` Junio C Hamano
  2020-08-25 19:55       ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-08-25 19:51 UTC (permalink / raw)
  To: Jeff King; +Cc: Bryan Turner, Git Users

Jeff King <peff@peff.net> writes:

>> As the intent for adding the "--stdin" option to any subcommand has
>> always been "we may need to feed many many things, that may bust the
>> command line length limit, hence we let you feed these things from
>> the standard input, but otherwise there should be no change in
>> behaviour or semantics", when the behaviour of command line and
>> "--stdin" differ, it is a bug in the latter.
>
> Agreed. It also helps in this case that the command-line behavior is
> sensible and the --stdin one is not. :)
>
> I think the solution is probably something like:

You beat me to it while I was wondering what to do between the local
got_rev_arg variable and the revs->rev_input_given field.


> diff --git a/revision.c b/revision.c
> index 96630e3186..f5bbefa091 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2099,12 +2099,13 @@ static void read_pathspec_from_stdin(struct strbuf *sb,
>  		strvec_push(prune, sb->buf);
>  }
>  
> -static void read_revisions_from_stdin(struct rev_info *revs,
> -				      struct strvec *prune)
> +static int read_revisions_from_stdin(struct rev_info *revs,
> +				     struct strvec *prune)
>  {
>  	struct strbuf sb;
>  	int seen_dashdash = 0;
>  	int save_warning;
> +	int got_rev_arg = 0;
>  
>  	save_warning = warn_on_object_refname_ambiguity;
>  	warn_on_object_refname_ambiguity = 0;
> @@ -2124,12 +2125,14 @@ static void read_revisions_from_stdin(struct rev_info *revs,
>  		if (handle_revision_arg(sb.buf, revs, 0,
>  					REVARG_CANNOT_BE_FILENAME))
>  			die("bad revision '%s'", sb.buf);
> +		got_rev_arg = 1;
>  	}
>  	if (seen_dashdash)
>  		read_pathspec_from_stdin(&sb, prune);
>  
>  	strbuf_release(&sb);
>  	warn_on_object_refname_ambiguity = save_warning;
> +	return got_rev_arg;
>  }
>  
>  static void add_grep(struct rev_info *revs, const char *ptn, enum grep_pat_token what)
> @@ -2754,7 +2757,8 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
>  				}
>  				if (revs->read_from_stdin++)
>  					die("--stdin given twice?");
> -				read_revisions_from_stdin(revs, &prune_data);
> +				if (read_revisions_from_stdin(revs, &prune_data))
> +					got_rev_arg = 1;
>  				continue;
>  			}
>  
>
> Possibly it would make sense to push that flag into rev_info, though,
> and let handle_revision_arg() set it. That would fix this bug and
> prevent similar ones in other code paths (though we're not likely to get
> revisions from anywhere else, I suppose).
>
> -Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Mismatched HEAD default behavior from git log
  2020-08-25 19:51     ` Junio C Hamano
@ 2020-08-25 19:55       ` Jeff King
  2020-08-26 20:13         ` [PATCH] revision: set rev_input_given in handle_revision_arg() Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-08-25 19:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bryan Turner, Git Users

On Tue, Aug 25, 2020 at 12:51:59PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> As the intent for adding the "--stdin" option to any subcommand has
> >> always been "we may need to feed many many things, that may bust the
> >> command line length limit, hence we let you feed these things from
> >> the standard input, but otherwise there should be no change in
> >> behaviour or semantics", when the behaviour of command line and
> >> "--stdin" differ, it is a bug in the latter.
> >
> > Agreed. It also helps in this case that the command-line behavior is
> > sensible and the --stdin one is not. :)
> >
> > I think the solution is probably something like:
> 
> You beat me to it while I was wondering what to do between the local
> got_rev_arg variable and the revs->rev_input_given field.

That makes me wonder why we need got_rev_arg at all if we have
revs->rev_input_given. But I suspect an answer can be found by digging
into git-blame. I probably won't do that immediately, so if you want to,
you can do so without worrying that we're duplicating work. :)

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] revision: set rev_input_given in handle_revision_arg()
  2020-08-25 19:55       ` Jeff King
@ 2020-08-26 20:13         ` Jeff King
  2020-08-26 20:20           ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-08-26 20:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bryan Turner, Git Users

On Tue, Aug 25, 2020 at 03:55:11PM -0400, Jeff King wrote:

> > You beat me to it while I was wondering what to do between the local
> > got_rev_arg variable and the revs->rev_input_given field.
> 
> That makes me wonder why we need got_rev_arg at all if we have
> revs->rev_input_given. But I suspect an answer can be found by digging
> into git-blame. I probably won't do that immediately, so if you want to,
> you can do so without worrying that we're duplicating work. :)

This is all my fault, naturally. :) I added rev_input_given a while ago
but didn't go far enough. Here's what I think we should do:

-- >8 --
Subject: [PATCH] revision: set rev_input_given in handle_revision_arg()

Commit 7ba826290a (revision: add rev_input_given flag, 2017-08-02) added
a flag to rev_info to tell whether we got any revision arguments. As
explained there, this is necessary because some revision arguments may
not produce any pending traversal objects, but should still inhibit
default behaviors (e.g., a glob that matches nothing).

However, it only set the flag in the globbing code, but not for
revisions we get on the command-line or via stdin. This leads to two
problems:

  - the command-line code keeps its own separate got_rev_arg flag; this
    isn't wrong, but it's confusing and an extra maintenance burden

  - even specifically-named rev arguments might end up not adding any
    pending objects: if --ignore-missing is set, then specifying a
    missing object is a noop rather than an error.

And that leads to some user-visible bugs:

  - when deciding whether a default rev like "HEAD" should kick in, we
    check both got_rev_arg and rev_input_given. That means that
    "--ignore-missing $ZERO_OID" works on the command-line (where we set
    got_rev_arg) but not on --stdin (where we don't)

  - when rev-list decides whether it should complain that it wasn't
    given a starting point, it relies on rev_input_given. So it can't
    even get the command-line "--ignore-missing $ZERO_OID" right

Let's consistently set the flag if we got any revision argument. That
lets us clean up the redundant got_rev_arg, and fixes both of those bugs
(but note there are three new tests: we'll confirm the already working
git-log command-line case).

A few implementation notes:

  - conceptually we want to set the flag whenever handle_revision_arg()
    finds an actual revision arg ("handles" it, you might say). But it
    covers a ton of cases with early returns. Rather than annotating
    each one, we just wrap it and use its success exit-code to set the
    flag in one spot.

  - the new rev-list test is in t6018, which is titled to cover globs.
    This isn't exactly a glob, but it made sense to stick it with the
    other tests that handle the "even though we got a rev, we have no
    pending objects" case, which are globs.

  - the tests check for the oid of a missing object, which it's pretty
    clear --ignore-missing should ignore. You can see the same behavior
    with "--ignore-missing a-ref-that-does-not-exist", because
    --ignore-missing treats them both the same. That's perhaps less
    clearly correct, and we may want to change that in the future. But
    the way the code and tests here are written, we'd continue to do the
    right thing even if it does.

Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c               | 16 +++++++++++-----
 t/t4202-log.sh           | 10 ++++++++++
 t/t6018-rev-list-glob.sh |  5 +++++
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/revision.c b/revision.c
index 96630e3186..08c2ad23af 100644
--- a/revision.c
+++ b/revision.c
@@ -2017,7 +2017,7 @@ static int handle_dotdot(const char *arg,
 	return ret;
 }
 
-int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsigned revarg_opt)
+static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int flags, unsigned revarg_opt)
 {
 	struct object_context oc;
 	char *mark;
@@ -2092,6 +2092,14 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 	return 0;
 }
 
+int handle_revision_arg(const char *arg, struct rev_info *revs, int flags, unsigned revarg_opt)
+{
+	int ret = handle_revision_arg_1(arg, revs, flags, revarg_opt);
+	if (!ret)
+		revs->rev_input_given = 1;
+	return ret;
+}
+
 static void read_pathspec_from_stdin(struct strbuf *sb,
 				     struct strvec *prune)
 {
@@ -2703,7 +2711,7 @@ static void NORETURN diagnose_missing_default(const char *def)
  */
 int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *opt)
 {
-	int i, flags, left, seen_dashdash, got_rev_arg = 0, revarg_opt;
+	int i, flags, left, seen_dashdash, revarg_opt;
 	struct strvec prune_data = STRVEC_INIT;
 	const char *submodule = NULL;
 	int seen_end_of_options = 0;
@@ -2792,8 +2800,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 			strvec_pushv(&prune_data, argv + i);
 			break;
 		}
-		else
-			got_rev_arg = 1;
 	}
 
 	if (prune_data.nr) {
@@ -2822,7 +2828,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 		opt->tweak(revs, opt);
 	if (revs->show_merge)
 		prepare_show_merge(revs);
-	if (revs->def && !revs->pending.nr && !revs->rev_input_given && !got_rev_arg) {
+	if (revs->def && !revs->pending.nr && !revs->rev_input_given) {
 		struct object_id oid;
 		struct object *object;
 		struct object_context oc;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index a0930599aa..56d34ed465 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1850,6 +1850,16 @@ test_expect_success 'log does not default to HEAD when rev input is given' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'do not default to HEAD with ignored object on cmdline' '
+	git log --ignore-missing $ZERO_OID >actual &&
+	test_must_be_empty actual
+'
+
+test_expect_success 'do not default to HEAD with ignored object on stdin' '
+	echo $ZERO_OID | git log --ignore-missing --stdin >actual &&
+	test_must_be_empty actual
+'
+
 test_expect_success 'set up --source tests' '
 	git checkout --orphan source-a &&
 	test_commit one &&
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index bb5aeac07f..b31ff7eeec 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -345,6 +345,11 @@ test_expect_success 'rev-list should succeed with empty output with empty glob'
 	test_must_be_empty actual
 '
 
+test_expect_success 'rev-list should succeed with empty output when ignoring missing' '
+	git rev-list --ignore-missing $ZERO_OID >actual &&
+	test_must_be_empty actual
+'
+
 test_expect_success 'shortlog accepts --glob/--tags/--remotes' '
 
 	compare shortlog "subspace/one subspace/two" --branches=subspace &&
-- 
2.28.0.749.gf1242ce4bd


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] revision: set rev_input_given in handle_revision_arg()
  2020-08-26 20:13         ` [PATCH] revision: set rev_input_given in handle_revision_arg() Jeff King
@ 2020-08-26 20:20           ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2020-08-26 20:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Bryan Turner, Git Users

Jeff King <peff@peff.net> writes:

> Subject: [PATCH] revision: set rev_input_given in handle_revision_arg()
>
> Commit 7ba826290a (revision: add rev_input_given flag, 2017-08-02) added
> a flag to rev_info to tell whether we got any revision arguments. As
> explained there, this is necessary because some revision arguments may
> not produce any pending traversal objects, but should still inhibit
> default behaviors (e.g., a glob that matches nothing).

Ah, that explains the symptom under discussion quite well; the topic
that introduced the commit was to fix "git log --tag=no-such-tag"
that used to default to HEAD.  We should be able to reuse the same
mechanism to prevent --stdin codepath from falling back to HEAD, of
course.

> However, it only set the flag in the globbing code, but not for
> revisions we get on the command-line or via stdin. This leads to two
> problems:
>
>   - the command-line code keeps its own separate got_rev_arg flag; this
>     isn't wrong, but it's confusing and an extra maintenance burden
>
>   - even specifically-named rev arguments might end up not adding any
>     pending objects: if --ignore-missing is set, then specifying a
>     missing object is a noop rather than an error.

Yup, nicely described.

> diff --git a/revision.c b/revision.c
> index 96630e3186..08c2ad23af 100644
> --- a/revision.c
> +++ b/revision.c

The patch of course looks straightforward and good.  Thanks.



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-08-26 20:20 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-25 19:16 Mismatched HEAD default behavior from git log Bryan Turner
2020-08-25 19:40 ` Junio C Hamano
2020-08-25 19:46   ` Jeff King
2020-08-25 19:51     ` Junio C Hamano
2020-08-25 19:55       ` Jeff King
2020-08-26 20:13         ` [PATCH] revision: set rev_input_given in handle_revision_arg() Jeff King
2020-08-26 20:20           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).