git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] range-diff: fix a crash in parsing git-log output
@ 2020-04-15 14:28 Vasil Dimov via GitGitGadget
  2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 14:28 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov

 * override a possibly user-customized format.pretty that would render git
   log output unparsable by git range-diff
   
   
 * don't use negative string precision, e.g. "%.*s", -5, "foo"

Vasil Dimov (2):
  range-diff: fix a crash in parsing git-log output
  range-diff: avoid negative string precision

 range-diff.c          | 18 +++++++++++++++++-
 t/t3206-range-diff.sh | 10 ++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)


base-commit: de49261b050d9cd8ec73842356077bc5b606640f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-760%2Fvasild%2Frange-diff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-760/vasild/range-diff-v1
Pull-Request: https://github.com/git/git/pull/760
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 14:28 [PATCH 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
@ 2020-04-15 14:28 ` Vasil Dimov via GitGitGadget
  2020-04-15 15:31   ` Junio C Hamano
  2020-04-15 16:13   ` Taylor Blau
  2020-04-15 14:28 ` [PATCH 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
  2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2 siblings, 2 replies; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 14:28 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov, Vasil Dimov

From: Vasil Dimov <vd@FreeBSD.org>

`git range-diff` calls `git log` internally and tries to parse its
output. But `git log` output can be customized by the user in their
git config and for certain configurations either an error will be
returned by `git range-diff` or it will crash.

To fix this explicitly set the output format of the internally
executed `git log` with `--pretty=medium`. Because that cancels
`--notes`, add explicitly `--notes` at the end.

Also, make sure we never crash in the same way - trying to dereference
`util` which was never created and has remained NULL. It would happen
if the first line of `git log` output does not begin with 'commit '.

Alternative considered but discarded - somehow disable all git configs
and behave as if no config is present in the internally executed
`git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
is the closest to it, but even with that we would still read
`.git/config`.

Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
---
 range-diff.c          | 13 +++++++++++++
 t/t3206-range-diff.sh | 10 ++++++++++
 2 files changed, 23 insertions(+)

diff --git a/range-diff.c b/range-diff.c
index f745567cf67..5cc920be391 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -63,6 +63,8 @@ static int read_patches(const char *range, struct string_list *list,
 			"--output-indicator-old=<",
 			"--output-indicator-context=#",
 			"--no-abbrev-commit",
+			"--pretty=medium",
+			"--notes",
 			NULL);
 	if (other_arg)
 		argv_array_pushv(&cp.args, other_arg->argv);
@@ -106,6 +108,17 @@ static int read_patches(const char *range, struct string_list *list,
 			continue;
 		}
 
+		if (!util) {
+			error(_("could not parse first line of `log` output: "
+				"did not start with 'commit ': '%s'"),
+			      line);
+			string_list_clear(list, 1);
+			strbuf_release(&buf);
+			strbuf_release(&contents);
+			finish_command(&cp);
+			return -1;
+		}
+
 		if (starts_with(line, "diff --git")) {
 			struct patch patch = { 0 };
 			struct strbuf root = STRBUF_INIT;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index bd808f87ed5..e024cff65cb 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -513,6 +513,16 @@ test_expect_success 'range-diff overrides diff.noprefix internally' '
 	git -c diff.noprefix=true range-diff HEAD^...
 '
 
+test_expect_success 'basic with modified format.pretty with suffix' '
+	git -c format.pretty="format:commit %H%d%n" range-diff \
+		master..topic master..unmodified
+'
+
+test_expect_success 'basic with modified format.pretty without "commit "' '
+	git -c format.pretty="format:%H%n" range-diff \
+		master..topic master..unmodified
+'
+
 test_expect_success 'range-diff compares notes by default' '
 	git notes add -m "topic note" topic &&
 	git notes add -m "unmodified note" unmodified &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] range-diff: avoid negative string precision
  2020-04-15 14:28 [PATCH 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
@ 2020-04-15 14:28 ` Vasil Dimov via GitGitGadget
  2020-04-15 16:20   ` Taylor Blau
  2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2 siblings, 1 reply; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 14:28 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov, Vasil Dimov

From: Vasil Dimov <vd@FreeBSD.org>

If the supplied integer for "precisoin" is negative in
`"%.*s", len, line` then it is ignored. So the current code is
equivalent to just `"%s", line` because it is executed only if
`len` is negative.

Fix this by saving the value of `len` before overwriting it with the
return value of `parse_git_diff_header()`.

Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
---
 range-diff.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/range-diff.c b/range-diff.c
index 5cc920be391..40af0862818 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -123,16 +123,19 @@ static int read_patches(const char *range, struct string_list *list,
 			struct patch patch = { 0 };
 			struct strbuf root = STRBUF_INIT;
 			int linenr = 0;
+			int orig_len;
 
 			in_header = 0;
 			strbuf_addch(&buf, '\n');
 			if (!util->diff_offset)
 				util->diff_offset = buf.len;
 			line[len - 1] = '\n';
+			orig_len = len;
 			len = parse_git_diff_header(&root, &linenr, 0, line,
 						    len, size, &patch);
 			if (len < 0)
-				die(_("could not parse git header '%.*s'"), (int)len, line);
+				die(_("could not parse git header '%.*s'"),
+				    orig_len, line);
 			strbuf_addstr(&buf, " ## ");
 			if (patch.is_new > 0)
 				strbuf_addf(&buf, "%s (new)", patch.new_name);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
@ 2020-04-15 15:31   ` Junio C Hamano
  2020-04-15 16:16     ` Vasil Dimov
  2020-04-15 16:23     ` Jeff King
  2020-04-15 16:13   ` Taylor Blau
  1 sibling, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2020-04-15 15:31 UTC (permalink / raw)
  To: Vasil Dimov via GitGitGadget; +Cc: git, Vasil Dimov

"Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Vasil Dimov <vd@FreeBSD.org>
>
> `git range-diff` calls `git log` internally and tries to parse its
> output. But `git log` output can be customized by the user in their
> git config and for certain configurations either an error will be
> returned by `git range-diff` or it will crash.
>
> To fix this explicitly set the output format of the internally
> executed `git log` with `--pretty=medium`. Because that cancels
> `--notes`, add explicitly `--notes` at the end.

Good finding.  

Shouldn't we also disable customizations that come from the
configuration variables like diff.external, diff.<driver>.command?

Thanks.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
  2020-04-15 15:31   ` Junio C Hamano
@ 2020-04-15 16:13   ` Taylor Blau
  1 sibling, 0 replies; 15+ messages in thread
From: Taylor Blau @ 2020-04-15 16:13 UTC (permalink / raw)
  To: Vasil Dimov via GitGitGadget; +Cc: git, Vasil Dimov

Hi Vasil,

Nice find. I have a question below:

On Wed, Apr 15, 2020 at 02:28:40PM +0000, Vasil Dimov via GitGitGadget wrote:
> From: Vasil Dimov <vd@FreeBSD.org>
>
> `git range-diff` calls `git log` internally and tries to parse its
> output. But `git log` output can be customized by the user in their
> git config and for certain configurations either an error will be
> returned by `git range-diff` or it will crash.
>
> To fix this explicitly set the output format of the internally
> executed `git log` with `--pretty=medium`. Because that cancels
> `--notes`, add explicitly `--notes` at the end.
>
> Also, make sure we never crash in the same way - trying to dereference
> `util` which was never created and has remained NULL. It would happen
> if the first line of `git log` output does not begin with 'commit '.
>
> Alternative considered but discarded - somehow disable all git configs
> and behave as if no config is present in the internally executed
> `git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
> is the closest to it, but even with that we would still read
> `.git/config`.

I don't know of a great way to do this off-hand. Perhaps an internal
`--for-range-diff` option that ignores options that are incompatible
with range-diff's own parsing seems like an OK path forward to me.

Here, internal means that it's not part of the manual page. We can pass
'PARSE_OPT_NONEG' to make sure that a caller can't later overwrite it by
passing '--no-for-range-diff'.

> Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
> ---
>  range-diff.c          | 13 +++++++++++++
>  t/t3206-range-diff.sh | 10 ++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/range-diff.c b/range-diff.c
> index f745567cf67..5cc920be391 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -63,6 +63,8 @@ static int read_patches(const char *range, struct string_list *list,
>  			"--output-indicator-old=<",
>  			"--output-indicator-context=#",
>  			"--no-abbrev-commit",
> +			"--pretty=medium",
> +			"--notes",
>  			NULL);
>  	if (other_arg)
>  		argv_array_pushv(&cp.args, other_arg->argv);
> @@ -106,6 +108,17 @@ static int read_patches(const char *range, struct string_list *list,
>  			continue;
>  		}
>
> +		if (!util) {
> +			error(_("could not parse first line of `log` output: "
> +				"did not start with 'commit ': '%s'"),
> +			      line);
> +			string_list_clear(list, 1);
> +			strbuf_release(&buf);
> +			strbuf_release(&contents);
> +			finish_command(&cp);
> +			return -1;
> +		}
> +
>  		if (starts_with(line, "diff --git")) {
>  			struct patch patch = { 0 };
>  			struct strbuf root = STRBUF_INIT;
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index bd808f87ed5..e024cff65cb 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -513,6 +513,16 @@ test_expect_success 'range-diff overrides diff.noprefix internally' '
>  	git -c diff.noprefix=true range-diff HEAD^...
>  '
>
> +test_expect_success 'basic with modified format.pretty with suffix' '
> +	git -c format.pretty="format:commit %H%d%n" range-diff \
> +		master..topic master..unmodified
> +'
> +
> +test_expect_success 'basic with modified format.pretty without "commit "' '
> +	git -c format.pretty="format:%H%n" range-diff \
> +		master..topic master..unmodified
> +'
> +
>  test_expect_success 'range-diff compares notes by default' '
>  	git notes add -m "topic note" topic &&
>  	git notes add -m "unmodified note" unmodified &&
> --
> gitgitgadget

Otherwise what you have here does look good to me, too. I'm just not
sure that there aren't other ways that a caller could circumvent placing
'--notes --pretty=medium' at the end, anyway.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 15:31   ` Junio C Hamano
@ 2020-04-15 16:16     ` Vasil Dimov
  2020-04-15 16:23     ` Jeff King
  1 sibling, 0 replies; 15+ messages in thread
From: Vasil Dimov @ 2020-04-15 16:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Vasil Dimov via GitGitGadget, git

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

On Wed, Apr 15, 2020 at 08:31:39 -0700, Junio C Hamano wrote:
> "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Vasil Dimov <vd@FreeBSD.org>
> >
> > `git range-diff` calls `git log` internally and tries to parse its
> > output. But `git log` output can be customized by the user in their
> > git config and for certain configurations either an error will be
> > returned by `git range-diff` or it will crash.
> >
> > To fix this explicitly set the output format of the internally
> > executed `git log` with `--pretty=medium`. Because that cancels
> > `--notes`, add explicitly `--notes` at the end.
>
> Good finding.
>
> Shouldn't we also disable customizations that come from the
> configuration variables like diff.external, diff.<driver>.command?

Hmm, I am not sure. I just read the doc about diff.<driver>.command. Is
it possible that the user has set up some custom diff tools to compare
e.g. .jpg files by only comparing their EXIF data instead of the entire
(binary) files?

Surely if the customizations wrt diff.external and
diff.<driver>.command produce result that is unparsable by
git range-diff, then they are not good. But maybe the opposite is the
case?

I don't feel confident enough to judge.

-- 
Vasil Dimov
gro.DSBeerF@dv
%
Sometimes I really think people ought to have to pass a proper exam
before they're allowed to be parents. Not just the practical, I mean.
    -- (Terry Pratchett, Thief of Time)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1528 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] range-diff: avoid negative string precision
  2020-04-15 14:28 ` [PATCH 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
@ 2020-04-15 16:20   ` Taylor Blau
  2020-04-15 20:19     ` Vasil Dimov
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2020-04-15 16:20 UTC (permalink / raw)
  To: Vasil Dimov via GitGitGadget; +Cc: git, Vasil Dimov

On Wed, Apr 15, 2020 at 02:28:41PM +0000, Vasil Dimov via GitGitGadget wrote:
> From: Vasil Dimov <vd@FreeBSD.org>
>
> If the supplied integer for "precisoin" is negative in

s/precisoin/precision

> `"%.*s", len, line` then it is ignored. So the current code is
> equivalent to just `"%s", line` because it is executed only if
> `len` is negative.
>
> Fix this by saving the value of `len` before overwriting it with the
> return value of `parse_git_diff_header()`.
>
> Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
> ---
>  range-diff.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/range-diff.c b/range-diff.c
> index 5cc920be391..40af0862818 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -123,16 +123,19 @@ static int read_patches(const char *range, struct string_list *list,
>  			struct patch patch = { 0 };
>  			struct strbuf root = STRBUF_INIT;
>  			int linenr = 0;
> +			int orig_len;

Any reason to not assign this to 'len' up here?
>
>  			in_header = 0;
>  			strbuf_addch(&buf, '\n');
>  			if (!util->diff_offset)
>  				util->diff_offset = buf.len;
>  			line[len - 1] = '\n';
> +			orig_len = len;
>  			len = parse_git_diff_header(&root, &linenr, 0, line,
>  						    len, size, &patch);

OK, so we cut up the line by placing a NL at len, and then feed it to
'parse_git_diff_header' which will tell us the length of the thing that
it parsed, or give a negative value if it couldn't parse...

>  			if (len < 0)
> -				die(_("could not parse git header '%.*s'"), (int)len, line);
> +				die(_("could not parse git header '%.*s'"),
> +				    orig_len, line);

...and then you restore the original length and print it out here. It
seems like this error is now misleading though, because the line is
already modified at the point that the newline was inserted.

>  			strbuf_addstr(&buf, " ## ");
>  			if (patch.is_new > 0)
>  				strbuf_addf(&buf, "%s (new)", patch.new_name);
> --
> gitgitgadget

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 15:31   ` Junio C Hamano
  2020-04-15 16:16     ` Vasil Dimov
@ 2020-04-15 16:23     ` Jeff King
  2020-04-15 22:02       ` Taylor Blau
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff King @ 2020-04-15 16:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Vasil Dimov via GitGitGadget, git, Vasil Dimov

On Wed, Apr 15, 2020 at 08:31:39AM -0700, Junio C Hamano wrote:

> "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Vasil Dimov <vd@FreeBSD.org>
> >
> > `git range-diff` calls `git log` internally and tries to parse its
> > output. But `git log` output can be customized by the user in their
> > git config and for certain configurations either an error will be
> > returned by `git range-diff` or it will crash.
> >
> > To fix this explicitly set the output format of the internally
> > executed `git log` with `--pretty=medium`. Because that cancels
> > `--notes`, add explicitly `--notes` at the end.
> 
> Good finding.  
> 
> Shouldn't we also disable customizations that come from the
> configuration variables like diff.external, diff.<driver>.command?

If range-diff were a script, I would say it should be using the
"rev-list | diff-tree --stdin" plumbing under the hood, rather than
"log".

The read_patches() function does let callers pass options to git-log,
but I don't _think_ this is exposed to the user. We only allow a few
--notes options to be passed, and we should be able to apply those to
diff-tree. So converting it to use plumbing might be an option.

Though I think there is another bug:

  $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
  commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
  git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
  Aborted

Another option would be for range-diff to directly call the
revision-traversal plumbing itself. There may be complications there,
though (or else it would have been done from the outset).

We should fix that assertion regardless, though.

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] range-diff: avoid negative string precision
  2020-04-15 16:20   ` Taylor Blau
@ 2020-04-15 20:19     ` Vasil Dimov
  0 siblings, 0 replies; 15+ messages in thread
From: Vasil Dimov @ 2020-04-15 20:19 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Vasil Dimov via GitGitGadget, git

On Wed, Apr 15, 2020 at 10:20:35 -0600, Taylor Blau wrote:
> On Wed, Apr 15, 2020 at 02:28:41PM +0000, Vasil Dimov via GitGitGadget wrote:
> > From: Vasil Dimov <vd@FreeBSD.org>
> >
> > If the supplied integer for "precisoin" is negative in
> 
> s/precisoin/precision

Fixed in v2.

> > `"%.*s", len, line` then it is ignored. So the current code is
> > equivalent to just `"%s", line` because it is executed only if
> > `len` is negative.
> >
> > Fix this by saving the value of `len` before overwriting it with the
> > return value of `parse_git_diff_header()`.
> >
> > Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
> > ---
> >  range-diff.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/range-diff.c b/range-diff.c
> > index 5cc920be391..40af0862818 100644
> > --- a/range-diff.c
> > +++ b/range-diff.c
> > @@ -123,16 +123,19 @@ static int read_patches(const char *range, struct string_list *list,
> >  			struct patch patch = { 0 };
> >  			struct strbuf root = STRBUF_INIT;
> >  			int linenr = 0;
> > +			int orig_len;
> 
> Any reason to not assign this to 'len' up here?

I believe that assigning it just before len is changed and that grouping
all usage of the new variable close to each other makes the code more
readable. Ideally it would also be defined below.

> >  			in_header = 0;
> >  			strbuf_addch(&buf, '\n');
> >  			if (!util->diff_offset)
> >  				util->diff_offset = buf.len;
> >  			line[len - 1] = '\n';
> > +			orig_len = len;
> >  			len = parse_git_diff_header(&root, &linenr, 0, line,
> >  						    len, size, &patch);
> 
> OK, so we cut up the line by placing a NL at len, and then feed it to
> 'parse_git_diff_header' which will tell us the length of the thing that
> it parsed, or give a negative value if it couldn't parse...
> 
> >  			if (len < 0)
> > -				die(_("could not parse git header '%.*s'"), (int)len, line);
> > +				die(_("could not parse git header '%.*s'"),
> > +				    orig_len, line);
> 
> ...and then you restore the original length and print it out here. It
> seems like this error is now misleading though, because the line is
> already modified at the point that the newline was inserted.
[...]

It was '\0' before we overwrote it with '\n':

 89                 len = find_end_of_line(line, size);
 90                 line[len - 1] = '\0';

`line` points to a buffer of the entire output of `git log`, with many
newlines in it. In the beginning of the loop we overwrite the first new
line char in the buffer with '\0', then we restore it to '\n' and
eventually advance the pointer to the start of the next line.

I think that the intention of this code is to print only one line (the
current one).

-- 
Vasil Dimov
gro.DSBeerF@dv
%
Success consists of going from failure to failure without loss of
enthusiasm.
                -- Winston Churchill

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 14:28 [PATCH 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
  2020-04-15 14:28 ` [PATCH 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
@ 2020-04-15 20:32 ` Vasil Dimov via GitGitGadget
  2020-04-15 20:32   ` [PATCH v2 1/2] " Vasil Dimov via GitGitGadget
                     ` (2 more replies)
  2 siblings, 3 replies; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 20:32 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov

 * override a possibly user-customized format.pretty that would render git
   log output unparsable by git range-diff
   
   
 * don't use negative string precision, e.g. "%.*s", -5, "foo"
   
   

Changes since v1:

 * Fixed a typo in the commit message (found by Taylor Blau)

Vasil Dimov (2):
  range-diff: fix a crash in parsing git-log output
  range-diff: avoid negative string precision

 range-diff.c          | 18 +++++++++++++++++-
 t/t3206-range-diff.sh | 10 ++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)


base-commit: de49261b050d9cd8ec73842356077bc5b606640f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-760%2Fvasild%2Frange-diff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-760/vasild/range-diff-v2
Pull-Request: https://github.com/git/git/pull/760

Range-diff vs v1:

 1:  2375e34100e = 1:  2375e34100e range-diff: fix a crash in parsing git-log output
 2:  b3384880c72 ! 2:  72fddcff554 range-diff: avoid negative string precision
     @@ Metadata
       ## Commit message ##
          range-diff: avoid negative string precision
      
     -    If the supplied integer for "precisoin" is negative in
     +    If the supplied integer for "precision" is negative in
          `"%.*s", len, line` then it is ignored. So the current code is
          equivalent to just `"%s", line` because it is executed only if
          `len` is negative.

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
@ 2020-04-15 20:32   ` Vasil Dimov via GitGitGadget
  2020-04-15 20:32   ` [PATCH v2 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
  2020-04-16  1:07   ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Taylor Blau
  2 siblings, 0 replies; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 20:32 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov, Vasil Dimov

From: Vasil Dimov <vd@FreeBSD.org>

`git range-diff` calls `git log` internally and tries to parse its
output. But `git log` output can be customized by the user in their
git config and for certain configurations either an error will be
returned by `git range-diff` or it will crash.

To fix this explicitly set the output format of the internally
executed `git log` with `--pretty=medium`. Because that cancels
`--notes`, add explicitly `--notes` at the end.

Also, make sure we never crash in the same way - trying to dereference
`util` which was never created and has remained NULL. It would happen
if the first line of `git log` output does not begin with 'commit '.

Alternative considered but discarded - somehow disable all git configs
and behave as if no config is present in the internally executed
`git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
is the closest to it, but even with that we would still read
`.git/config`.

Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
---
 range-diff.c          | 13 +++++++++++++
 t/t3206-range-diff.sh | 10 ++++++++++
 2 files changed, 23 insertions(+)

diff --git a/range-diff.c b/range-diff.c
index f745567cf67..5cc920be391 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -63,6 +63,8 @@ static int read_patches(const char *range, struct string_list *list,
 			"--output-indicator-old=<",
 			"--output-indicator-context=#",
 			"--no-abbrev-commit",
+			"--pretty=medium",
+			"--notes",
 			NULL);
 	if (other_arg)
 		argv_array_pushv(&cp.args, other_arg->argv);
@@ -106,6 +108,17 @@ static int read_patches(const char *range, struct string_list *list,
 			continue;
 		}
 
+		if (!util) {
+			error(_("could not parse first line of `log` output: "
+				"did not start with 'commit ': '%s'"),
+			      line);
+			string_list_clear(list, 1);
+			strbuf_release(&buf);
+			strbuf_release(&contents);
+			finish_command(&cp);
+			return -1;
+		}
+
 		if (starts_with(line, "diff --git")) {
 			struct patch patch = { 0 };
 			struct strbuf root = STRBUF_INIT;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index bd808f87ed5..e024cff65cb 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -513,6 +513,16 @@ test_expect_success 'range-diff overrides diff.noprefix internally' '
 	git -c diff.noprefix=true range-diff HEAD^...
 '
 
+test_expect_success 'basic with modified format.pretty with suffix' '
+	git -c format.pretty="format:commit %H%d%n" range-diff \
+		master..topic master..unmodified
+'
+
+test_expect_success 'basic with modified format.pretty without "commit "' '
+	git -c format.pretty="format:%H%n" range-diff \
+		master..topic master..unmodified
+'
+
 test_expect_success 'range-diff compares notes by default' '
 	git notes add -m "topic note" topic &&
 	git notes add -m "unmodified note" unmodified &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/2] range-diff: avoid negative string precision
  2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2020-04-15 20:32   ` [PATCH v2 1/2] " Vasil Dimov via GitGitGadget
@ 2020-04-15 20:32   ` Vasil Dimov via GitGitGadget
  2020-04-16  1:07   ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Taylor Blau
  2 siblings, 0 replies; 15+ messages in thread
From: Vasil Dimov via GitGitGadget @ 2020-04-15 20:32 UTC (permalink / raw)
  To: git; +Cc: Vasil Dimov, Vasil Dimov

From: Vasil Dimov <vd@FreeBSD.org>

If the supplied integer for "precision" is negative in
`"%.*s", len, line` then it is ignored. So the current code is
equivalent to just `"%s", line` because it is executed only if
`len` is negative.

Fix this by saving the value of `len` before overwriting it with the
return value of `parse_git_diff_header()`.

Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
---
 range-diff.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/range-diff.c b/range-diff.c
index 5cc920be391..40af0862818 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -123,16 +123,19 @@ static int read_patches(const char *range, struct string_list *list,
 			struct patch patch = { 0 };
 			struct strbuf root = STRBUF_INIT;
 			int linenr = 0;
+			int orig_len;
 
 			in_header = 0;
 			strbuf_addch(&buf, '\n');
 			if (!util->diff_offset)
 				util->diff_offset = buf.len;
 			line[len - 1] = '\n';
+			orig_len = len;
 			len = parse_git_diff_header(&root, &linenr, 0, line,
 						    len, size, &patch);
 			if (len < 0)
-				die(_("could not parse git header '%.*s'"), (int)len, line);
+				die(_("could not parse git header '%.*s'"),
+				    orig_len, line);
 			strbuf_addstr(&buf, " ## ");
 			if (patch.is_new > 0)
 				strbuf_addf(&buf, "%s (new)", patch.new_name);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 16:23     ` Jeff King
@ 2020-04-15 22:02       ` Taylor Blau
  2020-04-15 22:29         ` Jeff King
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2020-04-15 22:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Vasil Dimov via GitGitGadget, git, Vasil Dimov

Hi Peff,

On Wed, Apr 15, 2020 at 12:23:26PM -0400, Jeff King wrote:
> On Wed, Apr 15, 2020 at 08:31:39AM -0700, Junio C Hamano wrote:
>
> > "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > From: Vasil Dimov <vd@FreeBSD.org>
> > >
> > > `git range-diff` calls `git log` internally and tries to parse its
> > > output. But `git log` output can be customized by the user in their
> > > git config and for certain configurations either an error will be
> > > returned by `git range-diff` or it will crash.
> > >
> > > To fix this explicitly set the output format of the internally
> > > executed `git log` with `--pretty=medium`. Because that cancels
> > > `--notes`, add explicitly `--notes` at the end.
> >
> > Good finding.
> >
> > Shouldn't we also disable customizations that come from the
> > configuration variables like diff.external, diff.<driver>.command?
>
> If range-diff were a script, I would say it should be using the
> "rev-list | diff-tree --stdin" plumbing under the hood, rather than
> "log".
>
> The read_patches() function does let callers pass options to git-log,
> but I don't _think_ this is exposed to the user. We only allow a few
> --notes options to be passed, and we should be able to apply those to
> diff-tree. So converting it to use plumbing might be an option.
>
> Though I think there is another bug:
>
>   $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
>   commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
>   git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
>   Aborted

Nice find. I think that I have a patch addressing this below. Please let
me know what you think:

--- >8 ---

Subject: [PATCH] diff-tree.c: load notes machinery with '--notes'

Since its introduction in 7249e91 (revision.c: support --notes
command-line option, 2011-03-29), combining '--notes' with '--pretty'
causes 'git diff-tree' to fail a runtime assertion:

  $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
  commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
  git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
  Aborted

This failure is due to diff-tree not calling 'load_display_notes' to
initialize the notes machinery.

Ordinarily, this failure isn't triggered, because it requires passing
both '--notes' and '--pretty'. Specifically, passing '--pretty' sets
'opt->verbose_header', causing 'show_log()' to eventually call
'format_display_notes()', which expects a non-NULL 'display_note_trees'.
Without initializing the notes machinery, 'display_note_trees' remains
NULL, and thus triggers an assertion failure. This doesn't occur without
'--pretty' since we never call 'format_display_notes()' without it.

Fix this by initializing the notes machinery after parsing our options,
and harden this behavior against regression with a test in t4013.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/diff-tree.c     | 2 ++
 t/t4013-diff-various.sh | 1 +
 2 files changed, 3 insertions(+)

diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index cb9ea79367..17c1cc8c3c 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -126,6 +126,8 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)

 	precompose_argv(argc, argv);
 	argc = setup_revisions(argc, argv, opt, &s_r_opt);
+	if (opt->show_notes)
+		load_display_notes(&opt->notes_opt);

 	while (--argc > 0) {
 		const char *arg = *++argv;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index dde3f11fec..6ae8cfb271 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -398,6 +398,7 @@ diff --no-index --raw --no-abbrev dir2 dir

 diff-tree --pretty --root --stat --compact-summary initial
 diff-tree --pretty -R --root --stat --compact-summary initial
+diff-tree --pretty --notes initial
 diff-tree --stat --compact-summary initial mode
 diff-tree -R --stat --compact-summary initial mode
 EOF
--
2.26.0.106.g9fadedd637

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 22:02       ` Taylor Blau
@ 2020-04-15 22:29         ` Jeff King
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff King @ 2020-04-15 22:29 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Junio C Hamano, Vasil Dimov via GitGitGadget, git, Vasil Dimov

On Wed, Apr 15, 2020 at 04:02:42PM -0600, Taylor Blau wrote:

> Subject: [PATCH] diff-tree.c: load notes machinery with '--notes'
> 
> Since its introduction in 7249e91 (revision.c: support --notes
> command-line option, 2011-03-29), combining '--notes' with '--pretty'
> causes 'git diff-tree' to fail a runtime assertion:
> 
>   $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
>   commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
>   git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
>   Aborted
> 
> This failure is due to diff-tree not calling 'load_display_notes' to
> initialize the notes machinery.

Yes, I think that's the problem that I saw. And this definitely fixes
that case, but I think there's another related one.

  $ git notes add -m foo
  $ git log -1 --format='%h %N'
  94316974f7 foo
  $ git rev-list -1 HEAD | git diff-tree --stdin --format='%h %N' -s
  94316974f7e27dccc6b87f3946bce5d2fc252dc2 %N

This is true even with your patch. With your patch I can add --notes to
get the right output:

  $ git rev-list -1 HEAD | git diff-tree --stdin --format='%h %N' -s --notes
  94316974f7e27dccc6b87f3946bce5d2fc252dc2 foo

(It's also slightly curious that %h doesn't abbreviate in diff-tree; I
guess this is a side effect of the plumbing having no default abbrev
setting; it may be simplest to just live with it).

> @@ -126,6 +126,8 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
> 
>  	precompose_argv(argc, argv);
>  	argc = setup_revisions(argc, argv, opt, &s_r_opt);
> +	if (opt->show_notes)
> +		load_display_notes(&opt->notes_opt);

In git-log we have the equivalent of these new lines, but just before it
we check the userformat, too:

          memset(&w, 0, sizeof(w));
          userformat_find_requirements(NULL, &w);
  
          if (!rev->show_notes_given && (!rev->pretty_given || w.notes))
                  rev->show_notes = 1;
          if (rev->show_notes)
                  load_display_notes(&rev->notes_opt);

I think we'd want to do the same here. Even though it's plumbing, I
can't think of any reason why somebody would _not_ want notes to be
auto-enabled when they say "%N".

> Ordinarily, this failure isn't triggered, because it requires passing
> both '--notes' and '--pretty'. Specifically, passing '--pretty' sets
> 'opt->verbose_header', causing 'show_log()' to eventually call
> 'format_display_notes()', which expects a non-NULL 'display_note_trees'.
> Without initializing the notes machinery, 'display_note_trees' remains
> NULL, and thus triggers an assertion failure. This doesn't occur without
> '--pretty' since we never call 'format_display_notes()' without it.

It's not just --pretty, of course, but any option that causes us to
actually try to format notes (--format, --oneline, etc).

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output
  2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
  2020-04-15 20:32   ` [PATCH v2 1/2] " Vasil Dimov via GitGitGadget
  2020-04-15 20:32   ` [PATCH v2 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
@ 2020-04-16  1:07   ` Taylor Blau
  2 siblings, 0 replies; 15+ messages in thread
From: Taylor Blau @ 2020-04-16  1:07 UTC (permalink / raw)
  To: Vasil Dimov via GitGitGadget; +Cc: git, Vasil Dimov

Hi Vasil,

Thanks for sending a v2. This all looks good to me.

On Wed, Apr 15, 2020 at 08:32:23PM +0000, Vasil Dimov via GitGitGadget wrote:
>  * override a possibly user-customized format.pretty that would render git
>    log output unparsable by git range-diff
>
>
>  * don't use negative string precision, e.g. "%.*s", -5, "foo"
>
>
>
> Changes since v1:
>
>  * Fixed a typo in the commit message (found by Taylor Blau)
>
> Vasil Dimov (2):
>   range-diff: fix a crash in parsing git-log output
>   range-diff: avoid negative string precision
>
>  range-diff.c          | 18 +++++++++++++++++-
>  t/t3206-range-diff.sh | 10 ++++++++++
>  2 files changed, 27 insertions(+), 1 deletion(-)
>
> [snip]
> --
> gitgitgadget

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-04-16  1:07 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-15 14:28 [PATCH 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
2020-04-15 14:28 ` [PATCH 1/2] " Vasil Dimov via GitGitGadget
2020-04-15 15:31   ` Junio C Hamano
2020-04-15 16:16     ` Vasil Dimov
2020-04-15 16:23     ` Jeff King
2020-04-15 22:02       ` Taylor Blau
2020-04-15 22:29         ` Jeff King
2020-04-15 16:13   ` Taylor Blau
2020-04-15 14:28 ` [PATCH 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
2020-04-15 16:20   ` Taylor Blau
2020-04-15 20:19     ` Vasil Dimov
2020-04-15 20:32 ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Vasil Dimov via GitGitGadget
2020-04-15 20:32   ` [PATCH v2 1/2] " Vasil Dimov via GitGitGadget
2020-04-15 20:32   ` [PATCH v2 2/2] range-diff: avoid negative string precision Vasil Dimov via GitGitGadget
2020-04-16  1:07   ` [PATCH v2 0/2] range-diff: fix a crash in parsing git-log output Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).