git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] fix "rebase -i --root" corrupting root commit
@ 2018-07-30  9:29 Eric Sunshine
  2018-07-30  9:29 ` [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header Eric Sunshine
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30  9:29 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Phillip Wood, Akinori MUSHA, Junio C Hamano,
	Eric Sunshine

This series fixes bugs causing corruption of the root commit when
"rebase -i --root" is used to swap in a new root commit. In particular,
the "author" header has trailing garbage. Some tools handle the
corruption somewhat gracefully by showing a bogus date, but others barf
on it (gitk, for instance). git-fsck correctly identifies the
corruption. I discovered this after git-rebase corrupted one of my own
projects.

Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
into the v2.18.0 release. It's worrying that a released Git can be
creating corrupt commits, but fortunately "rebase -i --root" is not
likely used often (especially on well-established projects).
Nevertheless, it may be 'maint' worthy and applies cleanly there.

It was only after I diagnosed and fixed these bugs that I thought to
check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
fixing one of the three bugs which this series fixes. Akinori's fix has
the somewhat undesirable property that it adds an extra blank line to
the end of the script, as Phillip correctly pointed out in review[2].
Patch 2/2 of this series has the more "correct" fix, in addition to
fixing another bug.

Moreover, patch 2/2 of this series provides a more thorough fix overall
than Akinori, so it may make sense to replace his patch with this
series, though perhaps keep the test his patch adds to augment the
strict test of the "author" header added by this series.

[1]: https://public-inbox.org/git/86a7qwpt9g.knu@iDaemons.org/
[2]: https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889d06@talktalk.net/

Eric Sunshine (2):
  sequencer: fix "rebase -i --root" corrupting author header
  sequencer: fix "rebase -i --root" corrupting author header timezone

 sequencer.c                   |  9 +++++++--
 t/t3404-rebase-interactive.sh | 10 +++++++++-
 2 files changed, 16 insertions(+), 3 deletions(-)

-- 
2.18.0.597.ga71716f1ad


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header
  2018-07-30  9:29 [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
@ 2018-07-30  9:29 ` Eric Sunshine
  2018-07-30 15:44   ` Johannes Schindelin
  2018-07-30  9:29 ` [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone Eric Sunshine
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30  9:29 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Phillip Wood, Akinori MUSHA, Junio C Hamano,
	Eric Sunshine

When "git rebase -i --root" creates a new root commit (say, by swapping
in a different commit for the root), it corrupts the commit's "author"
header with trailing garbage:

    author A U Thor <author@example.com> @1112912773 -07000or@example.com

This is a result of read_author_ident() neglecting to NUL-terminate the
buffer into which it composes the "author" header.

(Note that the extra "0" in the timezone is a separate bug which will be
fixed subsequently.)

Security considerations: Construction of the "author" header by
read_author_ident() happens in-place and in parallel with parsing the
content of "rebase-merge/author-script" which occupies the same buffer.
This is possible because the constructed "author" header is always
smaller than the content of "rebase-merge/author-script". Despite
neglecting to NUL-terminate the constructed "author" header, memory is
never accessed (either by read_author_ident() or its caller) beyond the
allocated buffer since a NUL-terminator is present at the end of the
loaded "rebase-merge/author-script" content, and additional NUL's are
inserted as part of the parsing process.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 sequencer.c                   |  2 +-
 t/t3404-rebase-interactive.sh | 10 +++++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 16c1411054..78864d9072 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -744,7 +744,7 @@ static const char *read_author_ident(struct strbuf *buf)
 		return NULL;
 	}
 
-	buf->len = out - buf->buf;
+	strbuf_setlen(buf, out - buf->buf);
 	return buf->buf;
 }
 
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 01616901bd..8509c89a26 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1238,7 +1238,7 @@ rebase_setup_and_clean () {
 		test_might_fail git branch -D $1 &&
 		test_might_fail git rebase --abort
 	" &&
-	git checkout -b $1 master
+	git checkout -b $1 ${2:-master}
 }
 
 test_expect_success 'drop' '
@@ -1415,4 +1415,12 @@ test_expect_success 'rebase -i --gpg-sign=<key-id> overrides commit.gpgSign' '
 	test_i18ngrep "$SQ-S\"S I Gner\"$SQ" err
 '
 
+test_expect_success 'valid author header after --root swap' '
+	rebase_setup_and_clean author-header no-conflict-branch &&
+	set_fake_editor &&
+	FAKE_LINES="2 1" git rebase -i --root &&
+	git cat-file commit HEAD^ >out &&
+	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
+'
+
 test_done
-- 
2.18.0.597.ga71716f1ad


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone
  2018-07-30  9:29 [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
  2018-07-30  9:29 ` [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header Eric Sunshine
@ 2018-07-30  9:29 ` Eric Sunshine
  2018-07-30 12:20   ` Phillip Wood
  2018-07-30 10:06 ` [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
  2018-07-30 12:14 ` Phillip Wood
  3 siblings, 1 reply; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30  9:29 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Phillip Wood, Akinori MUSHA, Junio C Hamano,
	Eric Sunshine

When "git rebase -i --root" creates a new root commit, it corrupts the
"author" header's timezone by repeating the last digit:

    author A U Thor <author@example.com> @1112912773 -07000

This is due to two bugs.

First, write_author_script() neglects to add the closing quote to the
value of GIT_AUTHOR_DATE when generating "rebase-merge/author-script".

Second, although sq_dequote() correctly diagnoses the missing closing
quote, read_author_ident() ignores sq_dequote()'s return value and
blindly uses the result of the aborted dequote.

sq_dequote() performs dequoting in-place by removing quoting and
shifting content downward. When it detects misquoting (lack of closing
quote, in this case), it gives up and returns an error without inserting
a NUL-terminator at the end of the shifted content, which explains the
duplicated last digit in the timezone.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 sequencer.c                   | 7 ++++++-
 t/t3404-rebase-interactive.sh | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 78864d9072..1008f6d71a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -654,6 +654,7 @@ static int write_author_script(const char *message)
 			strbuf_addch(&buf, *(message++));
 		else
 			strbuf_addf(&buf, "'\\\\%c'", *(message++));
+	strbuf_addch(&buf, '\'');
 	res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
 	strbuf_release(&buf);
 	return res;
@@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf *buf)
 
 		eol = strchrnul(in, '\n');
 		*eol = '\0';
-		sq_dequote(in);
+		if (!sq_dequote(in)) {
+			warning(_("bad quoting on %s value in '%s'"),
+				keys[i], rebase_path_author_script());
+			return NULL;
+		}
 		len = strlen(in);
 
 		if (i > 0) /* separate values by spaces */
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 8509c89a26..37796bb4c1 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1420,7 +1420,7 @@ test_expect_success 'valid author header after --root swap' '
 	set_fake_editor &&
 	FAKE_LINES="2 1" git rebase -i --root &&
 	git cat-file commit HEAD^ >out &&
-	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
+	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
 '
 
 test_done
-- 
2.18.0.597.ga71716f1ad


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30  9:29 [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
  2018-07-30  9:29 ` [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header Eric Sunshine
  2018-07-30  9:29 ` [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone Eric Sunshine
@ 2018-07-30 10:06 ` Eric Sunshine
  2018-07-30 15:47   ` Johannes Schindelin
  2018-07-30 12:14 ` Phillip Wood
  3 siblings, 1 reply; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30 10:06 UTC (permalink / raw)
  To: Git List; +Cc: Johannes Schindelin, phillip.wood, knu, Junio C Hamano

On Mon, Jul 30, 2018 at 5:30 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> This series fixes bugs causing corruption of the root commit when
> "rebase -i --root" is used to swap in a new root commit. In particular,
> the "author" header has trailing garbage. Some tools handle the
> corruption somewhat gracefully by showing a bogus date, but others barf
> on it (gitk, for instance). git-fsck correctly identifies the
> corruption. I discovered this after git-rebase corrupted one of my own
> projects.

Unfortunately, after sending this series, I see that there is
additional corruption that needs to be addressed. In particular, the
"author" header time is incorrectly prefixed with "@", so this series
is going to need a re-roll to fix that, as well.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30  9:29 [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
                   ` (2 preceding siblings ...)
  2018-07-30 10:06 ` [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
@ 2018-07-30 12:14 ` Phillip Wood
  2018-07-30 15:29   ` Junio C Hamano
                     ` (2 more replies)
  3 siblings, 3 replies; 13+ messages in thread
From: Phillip Wood @ 2018-07-30 12:14 UTC (permalink / raw)
  To: Eric Sunshine, git; +Cc: Johannes Schindelin, Akinori MUSHA, Junio C Hamano

On 30/07/18 10:29, Eric Sunshine wrote:
> This series fixes bugs causing corruption of the root commit when
> "rebase -i --root" is used to swap in a new root commit. In particular,
> the "author" header has trailing garbage. Some tools handle the
> corruption somewhat gracefully by showing a bogus date, but others barf
> on it (gitk, for instance). git-fsck correctly identifies the
> corruption. I discovered this after git-rebase corrupted one of my own
> projects.
> 
> Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
> into the v2.18.0 release. It's worrying that a released Git can be
> creating corrupt commits, but fortunately "rebase -i --root" is not
> likely used often (especially on well-established projects).
> Nevertheless, it may be 'maint' worthy and applies cleanly there.
> 
> It was only after I diagnosed and fixed these bugs that I thought to
> check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> fixing one of the three bugs which this series fixes. Akinori's fix has
> the somewhat undesirable property that it adds an extra blank line to
> the end of the script, as Phillip correctly pointed out in review[2].
> Patch 2/2 of this series has the more "correct" fix, in addition to
> fixing another bug.
> 
> Moreover, patch 2/2 of this series provides a more thorough fix overall
> than Akinori, so it may make sense to replace his patch with this
> series, though perhaps keep the test his patch adds to augment the
> strict test of the "author" header added by this series.

Johannes and I have some fixups for Akinori's patch on the branch
fix-t3403-author-script-test at https://github.com/phillipwood/git

That branch also contains a fix for the bad quoting of names with "'" in
them. I think it would be good to somehow try and combine this series
with those patches.

I'd really like to see a single function to read and another to write
the author script that is shared by 'git am' and 'git rebase -i', rather
than the two writers and three readers we have at the moment. I was
thinking of doing that in the longer term, but given the extra bug
you've found in read_author_script() maybe we should do that sooner
rather than later.

> [1]: https://public-inbox.org/git/86a7qwpt9g.knu@iDaemons.org/
> [2]: https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889d06@talktalk.net/
> 
> Eric Sunshine (2):
>   sequencer: fix "rebase -i --root" corrupting author header
>   sequencer: fix "rebase -i --root" corrupting author header timezone
> 
>  sequencer.c                   |  9 +++++++--
>  t/t3404-rebase-interactive.sh | 10 +++++++++-
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone
  2018-07-30  9:29 ` [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone Eric Sunshine
@ 2018-07-30 12:20   ` Phillip Wood
  2018-07-30 18:45     ` Eric Sunshine
  0 siblings, 1 reply; 13+ messages in thread
From: Phillip Wood @ 2018-07-30 12:20 UTC (permalink / raw)
  To: Eric Sunshine, git; +Cc: Johannes Schindelin, Akinori MUSHA, Junio C Hamano

Hi Eric
On 30/07/18 10:29, Eric Sunshine wrote:
> When "git rebase -i --root" creates a new root commit, it corrupts the
> "author" header's timezone by repeating the last digit:
> 
>     author A U Thor <author@example.com> @1112912773 -07000
> 
> This is due to two bugs.
> 
> First, write_author_script() neglects to add the closing quote to the
> value of GIT_AUTHOR_DATE when generating "rebase-merge/author-script".
> 
> Second, although sq_dequote() correctly diagnoses the missing closing
> quote, read_author_ident() ignores sq_dequote()'s return value and
> blindly uses the result of the aborted dequote.
> 
> sq_dequote() performs dequoting in-place by removing quoting and
> shifting content downward. When it detects misquoting (lack of closing
> quote, in this case), it gives up and returns an error without inserting
> a NUL-terminator at the end of the shifted content, which explains the
> duplicated last digit in the timezone.
> 
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>  sequencer.c                   | 7 ++++++-
>  t/t3404-rebase-interactive.sh | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 78864d9072..1008f6d71a 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -654,6 +654,7 @@ static int write_author_script(const char *message)
>  			strbuf_addch(&buf, *(message++));
>  		else
>  			strbuf_addf(&buf, "'\\\\%c'", *(message++));
> +	strbuf_addch(&buf, '\'');
>  	res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
>  	strbuf_release(&buf);
>  	return res;
> @@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf *buf)
>  
>  		eol = strchrnul(in, '\n');
>  		*eol = '\0';
> -		sq_dequote(in);
> +		if (!sq_dequote(in)) {
> +			warning(_("bad quoting on %s value in '%s'"),
> +				keys[i], rebase_path_author_script());
> +			return NULL;

I think we want to handle the broken author script properly rather than
returning NULL. If we had a single function
int read_author_script(const char **name, const char **author, const
char **date)
to read the author script that tried sq_dequote() and then fell back to
code based on read_env_script() that handled the missing "'" at the end
and also the bad quoting of "'" if sq_dequote() failed it would make it
easier to fix the existing bugs, rather than having to fix
read_author_ident() and read_env_script() separately. What do you think?

Best Wishes

Phillip

> +		}
>  		len = strlen(in);
>  
>  		if (i > 0) /* separate values by spaces */
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 8509c89a26..37796bb4c1 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1420,7 +1420,7 @@ test_expect_success 'valid author header after --root swap' '
>  	set_fake_editor &&
>  	FAKE_LINES="2 1" git rebase -i --root &&
>  	git cat-file commit HEAD^ >out &&
> -	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
> +	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
>  '
>  
>  test_done
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30 12:14 ` Phillip Wood
@ 2018-07-30 15:29   ` Junio C Hamano
  2018-07-30 15:49   ` Johannes Schindelin
  2018-07-30 19:15   ` Eric Sunshine
  2 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-07-30 15:29 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Eric Sunshine, git, Johannes Schindelin, Akinori MUSHA

Phillip Wood <phillip.wood@talktalk.net> writes:

>> Moreover, patch 2/2 of this series provides a more thorough fix overall
>> than Akinori, so it may make sense to replace his patch with this
>> series, though perhaps keep the test his patch adds to augment the
>> strict test of the "author" header added by this series.
>
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git
>
> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

Thanks for working well together.  Always nice to see contributors
thinking beyond immediate band-aid and for longer term ;-)

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header
  2018-07-30  9:29 ` [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header Eric Sunshine
@ 2018-07-30 15:44   ` Johannes Schindelin
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2018-07-30 15:44 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Phillip Wood, Akinori MUSHA, Junio C Hamano

Hi Eric,

On Mon, 30 Jul 2018, Eric Sunshine wrote:

> When "git rebase -i --root" creates a new root commit (say, by swapping
> in a different commit for the root), it corrupts the commit's "author"
> header with trailing garbage:
> 
>     author A U Thor <author@example.com> @1112912773 -07000or@example.com
> 
> This is a result of read_author_ident() neglecting to NUL-terminate the
> buffer into which it composes the "author" header.
> 
> (Note that the extra "0" in the timezone is a separate bug which will be
> fixed subsequently.)
> 
> Security considerations: Construction of the "author" header by
> read_author_ident() happens in-place and in parallel with parsing the
> content of "rebase-merge/author-script" which occupies the same buffer.
> This is possible because the constructed "author" header is always
> smaller than the content of "rebase-merge/author-script". Despite
> neglecting to NUL-terminate the constructed "author" header, memory is
> never accessed (either by read_author_ident() or its caller) beyond the
> allocated buffer since a NUL-terminator is present at the end of the
> loaded "rebase-merge/author-script" content, and additional NUL's are
> inserted as part of the parsing process.
> 
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>

ACK!

Thanks,
Dscho

> ---
>  sequencer.c                   |  2 +-
>  t/t3404-rebase-interactive.sh | 10 +++++++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 16c1411054..78864d9072 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -744,7 +744,7 @@ static const char *read_author_ident(struct strbuf *buf)
>  		return NULL;
>  	}
>  
> -	buf->len = out - buf->buf;
> +	strbuf_setlen(buf, out - buf->buf);
>  	return buf->buf;
>  }
>  
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 01616901bd..8509c89a26 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1238,7 +1238,7 @@ rebase_setup_and_clean () {
>  		test_might_fail git branch -D $1 &&
>  		test_might_fail git rebase --abort
>  	" &&
> -	git checkout -b $1 master
> +	git checkout -b $1 ${2:-master}
>  }
>  
>  test_expect_success 'drop' '
> @@ -1415,4 +1415,12 @@ test_expect_success 'rebase -i --gpg-sign=<key-id> overrides commit.gpgSign' '
>  	test_i18ngrep "$SQ-S\"S I Gner\"$SQ" err
>  '
>  
> +test_expect_success 'valid author header after --root swap' '
> +	rebase_setup_and_clean author-header no-conflict-branch &&
> +	set_fake_editor &&
> +	FAKE_LINES="2 1" git rebase -i --root &&
> +	git cat-file commit HEAD^ >out &&
> +	grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
> +'
> +
>  test_done
> -- 
> 2.18.0.597.ga71716f1ad
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30 10:06 ` [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
@ 2018-07-30 15:47   ` Johannes Schindelin
  2018-07-30 19:19     ` Eric Sunshine
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2018-07-30 15:47 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, phillip.wood, knu, Junio C Hamano

Hi Eric,

On Mon, 30 Jul 2018, Eric Sunshine wrote:

> On Mon, Jul 30, 2018 at 5:30 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > This series fixes bugs causing corruption of the root commit when
> > "rebase -i --root" is used to swap in a new root commit. In particular,
> > the "author" header has trailing garbage. Some tools handle the
> > corruption somewhat gracefully by showing a bogus date, but others barf
> > on it (gitk, for instance). git-fsck correctly identifies the
> > corruption. I discovered this after git-rebase corrupted one of my own
> > projects.
> 
> Unfortunately, after sending this series, I see that there is
> additional corruption that needs to be addressed. In particular, the
> "author" header time is incorrectly prefixed with "@", so this series
> is going to need a re-roll to fix that, as well.

AFAIR the `@` was not an oversight, but required so that we could pass in
the Unix epoch.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30 12:14 ` Phillip Wood
  2018-07-30 15:29   ` Junio C Hamano
@ 2018-07-30 15:49   ` Johannes Schindelin
  2018-07-30 19:15   ` Eric Sunshine
  2 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2018-07-30 15:49 UTC (permalink / raw)
  To: phillip.wood; +Cc: Eric Sunshine, git, Akinori MUSHA, Junio C Hamano

Hi Phillip,

On Mon, 30 Jul 2018, Phillip Wood wrote:

> On 30/07/18 10:29, Eric Sunshine wrote:
> > This series fixes bugs causing corruption of the root commit when
> > "rebase -i --root" is used to swap in a new root commit. In particular,
> > the "author" header has trailing garbage. Some tools handle the
> > corruption somewhat gracefully by showing a bogus date, but others barf
> > on it (gitk, for instance). git-fsck correctly identifies the
> > corruption. I discovered this after git-rebase corrupted one of my own
> > projects.
> > 
> > Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
> > into the v2.18.0 release. It's worrying that a released Git can be
> > creating corrupt commits, but fortunately "rebase -i --root" is not
> > likely used often (especially on well-established projects).
> > Nevertheless, it may be 'maint' worthy and applies cleanly there.
> > 
> > It was only after I diagnosed and fixed these bugs that I thought to
> > check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> > fixing one of the three bugs which this series fixes. Akinori's fix has
> > the somewhat undesirable property that it adds an extra blank line to
> > the end of the script, as Phillip correctly pointed out in review[2].
> > Patch 2/2 of this series has the more "correct" fix, in addition to
> > fixing another bug.
> > 
> > Moreover, patch 2/2 of this series provides a more thorough fix overall
> > than Akinori, so it may make sense to replace his patch with this
> > series, though perhaps keep the test his patch adds to augment the
> > strict test of the "author" header added by this series.
> 
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git
> 
> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

I would like that, too.

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.

You are at least the second person (after Junio) with that wish.

Please note, however, that the purpose of author-script reading/writing is
very different in git-am vs rebase -i, or at least it used to be:
read_env_script() in sequencer.c very specifically wants to construct an
env parameter for use in run_command().

Don't let me stop you from trying to consolidate the two different code
paths, though.

Ciao,
Dscho

> 
> > [1]: https://public-inbox.org/git/86a7qwpt9g.knu@iDaemons.org/
> > [2]: https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889d06@talktalk.net/
> > 
> > Eric Sunshine (2):
> >   sequencer: fix "rebase -i --root" corrupting author header
> >   sequencer: fix "rebase -i --root" corrupting author header timezone
> > 
> >  sequencer.c                   |  9 +++++++--
> >  t/t3404-rebase-interactive.sh | 10 +++++++++-
> >  2 files changed, 16 insertions(+), 3 deletions(-)
> > 
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone
  2018-07-30 12:20   ` Phillip Wood
@ 2018-07-30 18:45     ` Eric Sunshine
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30 18:45 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Git List, Johannes Schindelin, Akinori MUSHA, Junio C Hamano

On Mon, Jul 30, 2018 at 8:20 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
> On 30/07/18 10:29, Eric Sunshine wrote:
> > When "git rebase -i --root" creates a new root commit, it corrupts the
> > "author" header's timezone by repeating the last digit:
> > [...]
> > Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> > ---
> > diff --git a/sequencer.c b/sequencer.c
> > @@ -654,6 +654,7 @@ static int write_author_script(const char *message)
> > +     strbuf_addch(&buf, '\'');
> > @@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf *buf)
> > -             sq_dequote(in);
> > +             if (!sq_dequote(in)) {
> > +                     warning(_("bad quoting on %s value in '%s'"),
> > +                             keys[i], rebase_path_author_script());
> > +                     return NULL;
>
> I think we want to handle the broken author script properly rather than
> returning NULL. If we had a single function
> int read_author_script(const char **name, const char **author, const
> char **date)
> to read the author script that tried sq_dequote() and then fell back to
> code based on read_env_script() that handled the missing "'" at the end
> and also the bad quoting of "'" if sq_dequote() failed it would make it
> easier to fix the existing bugs, rather than having to fix
> read_author_ident() and read_env_script() separately. What do you think?

That makes sense as a long-term plan, however, I'm concerned with the
immediate problem that a released version of Git can (and did, in my
case) corrupt commit objects. So, in the short term, I think it makes
sense to get this minimal fix landed, and build the more "correctly
engineered" solution on top of it, without the pressure of worrying
about corruption spreading.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30 12:14 ` Phillip Wood
  2018-07-30 15:29   ` Junio C Hamano
  2018-07-30 15:49   ` Johannes Schindelin
@ 2018-07-30 19:15   ` Eric Sunshine
  2 siblings, 0 replies; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30 19:15 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Git List, Johannes Schindelin, Akinori MUSHA, Junio C Hamano

On Mon, Jul 30, 2018 at 8:14 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
> On 30/07/18 10:29, Eric Sunshine wrote:
> > It was only after I diagnosed and fixed these bugs that I thought to
> > check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> > fixing one of the three bugs which this series fixes. Akinori's fix has
> > the somewhat undesirable property that it adds an extra blank line to
> > the end of the script, as Phillip correctly pointed out in review[2].
> > Patch 2/2 of this series has the more "correct" fix, in addition to
> > fixing another bug.
> >
> > Moreover, patch 2/2 of this series provides a more thorough fix overall
> > than Akinori, so it may make sense to replace his patch with this
> > series, though perhaps keep the test his patch adds to augment the
> > strict test of the "author" header added by this series.
>
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git

I don't see a branch with that name there. There are a couple "wip"
branches, however, named wip/fix-t3403-author-script-test and
wip/fix-t3404-author-script-test. I'm guessing you wanted me to look
at the former.

> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

It appears that your patches are fixing issues and a test outside the
issues fixed by my series (aside from the one line inserting the
missing closing quote). As such, I think your patches can be built
atop this series without worrying about conflicts. That would allow
this commit-corruption-bug-fixing series to land without being tied to
those "wip" patches which address lower-priority problems.

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.

Agreed. That seems a reasonable long-term goal but needn't hold up
this series which addresses very real bugs leading to object
corruption.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit
  2018-07-30 15:47   ` Johannes Schindelin
@ 2018-07-30 19:19     ` Eric Sunshine
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Sunshine @ 2018-07-30 19:19 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git List, Phillip Wood, Akinori MUSHA, Junio C Hamano

On Mon, Jul 30, 2018 at 11:47 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Mon, 30 Jul 2018, Eric Sunshine wrote:
> > Unfortunately, after sending this series, I see that there is
> > additional corruption that needs to be addressed. In particular, the
> > "author" header time is incorrectly prefixed with "@", so this series
> > is going to need a re-roll to fix that, as well.
>
> AFAIR the `@` was not an oversight, but required so that we could pass in
> the Unix epoch.

I don't think it was an oversight either, but it is nevertheless
incorrect to use the "@" in the commit object's "author" header.

As I understand it, you do "need" the "@" to distinguish a Unix epoch
value assigned to GIT_AUTHOR_DATE, but the commit object format is
very exacting in the datestamp format it accepts, and it does not
allow "@". So, the date from GIT_AUTHOR_DATE needs to be converted to
a format acceptable to the commit object, otherwise the commit is
considered corrupt.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-07-30 19:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-30  9:29 [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
2018-07-30  9:29 ` [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header Eric Sunshine
2018-07-30 15:44   ` Johannes Schindelin
2018-07-30  9:29 ` [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone Eric Sunshine
2018-07-30 12:20   ` Phillip Wood
2018-07-30 18:45     ` Eric Sunshine
2018-07-30 10:06 ` [PATCH 0/2] fix "rebase -i --root" corrupting root commit Eric Sunshine
2018-07-30 15:47   ` Johannes Schindelin
2018-07-30 19:19     ` Eric Sunshine
2018-07-30 12:14 ` Phillip Wood
2018-07-30 15:29   ` Junio C Hamano
2018-07-30 15:49   ` Johannes Schindelin
2018-07-30 19:15   ` Eric Sunshine

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).