git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions
@ 2020-01-16 21:18 Johannes Schindelin via GitGitGadget
  2020-01-16 21:18 ` [PATCH 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
                   ` (4 more replies)
  0 siblings, 5 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-16 21:18 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin

Triggered by one of brian's SHA-256 patch series, I looked at the reason why
the SHA-1 collision test case passed when it shouldn't. Turns out that the
regression test was not quite thorough enough, and the interactive rebase 
did regress recently.

While in the area, I realized that the same bug exists in the code backing
the rebase.missingCommitsCheck feature: the backed-up todo list uses
shortened commit IDs that may very well become ambiguous during the rebase.
For good measure, this patch series fixes that, too.

Finally, I saw that git rebase --edit-todo reported the line in an awkward,
maybe even incorrect, way when there was an ambiguous commit ID, and I also
fixed that.

To make sure that the code can be easily adapted to SHA-256 after these
patches, I actually already made those adjustments on top and offered them
up at https://github.com/bk2204/git/pull/1.

Johannes Schindelin (3):
  parse_insn_line(): improve error message when parsing failed
  rebase -i: re-fix short SHA-1 collision
  rebase -i: also avoid SHA-1 collisions with missingCommitsCheck

 rebase-interactive.c          |  8 +++++---
 sequencer.c                   | 21 +++++++++++++++++----
 t/t3404-rebase-interactive.sh | 17 +++++++++++++++--
 3 files changed, 37 insertions(+), 9 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-529%2Fdscho%2Fre-fix-rebase-i-with-sha-collisions-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-529/dscho/re-fix-rebase-i-with-sha-collisions-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/529
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
@ 2020-01-16 21:18 ` Johannes Schindelin via GitGitGadget
  2020-01-17 18:00   ` Junio C Hamano
  2020-01-16 21:18 ` [PATCH 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-16 21:18 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the case that a `get_oid()` call failed, we showed some rather bogus
part of the line instead of the precise string we sent to said function.
That makes it rather hard for users to understand what is going wrong,
so let's fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index b9dbf1adb0..7c30dad59c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2118,6 +2118,8 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	saved = *end_of_object_name;
 	*end_of_object_name = '\0';
 	status = get_oid(bol, &commit_oid);
+	if (status < 0)
+		error(_("could not parse '%s'"), bol); /* return later */
 	*end_of_object_name = saved;
 
 	bol = end_of_object_name + strspn(end_of_object_name, " \t");
@@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	item->arg_len = (int)(eol - bol);
 
 	if (status < 0)
-		return error(_("could not parse '%.*s'"),
-			     (int)(end_of_object_name - bol), bol);
+		return status;
 
 	item->commit = lookup_commit_reference(r, &commit_oid);
-	return !item->commit;
+	return item->commit ? 0 : -1;
 }
 
 int sequencer_get_last_command(struct repository *r, enum replay_action *action)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  2020-01-16 21:18 ` [PATCH 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
@ 2020-01-16 21:18 ` Johannes Schindelin via GitGitGadget
  2020-01-17 18:30   ` Junio C Hamano
  2020-01-16 21:18 ` [PATCH 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-16 21:18 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 66ae9a57b88 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23), we added a test case that demonstrated how it is possible
that a previously unambiguous short commit ID could become ambiguous
*during* a rebase.

In 75c69766554 (rebase -i: fix short SHA-1 collision, 2013-08-23), we
fixed that problem simply by writing out the todo list with expanded
commit IDs (except *right* before letting the user edit the todo list,
in which case we shorten them, but we expand them right after the file
was edited).

However, the bug resurfaced as a side effect of 393adf7a6f6 (sequencer:
directly call pick_commits() from complete_action(), 2019-11-24): as of
this commit, the sequencer no longer re-reads the todo list after
writing it out with expanded commit IDs.

The only redeeming factor is that the todo list is already parsed at
that stage, including all the commits corresponding to the commands,
therefore the sequencer can continue even if the internal todo list has
short commit IDs.

That does not prevent problems, though: the sequencer writes out the
`done` and `git-rebase-todo` files incrementally (i.e. overwriting the
todo list with a version that has _short_ commit IDs), and if a merge
conflict happens, or if an `edit` or a `break` command is encountered, a
subsequent `git rebase --continue` _will_ re-read the todo list, opening
an opportunity for the "short SHA-1 collision" bug again.

To avoid that, let's make sure that we do expand the commit IDs in the
todo list as soon as we have parsed it after letting the user edit it.

Additionally, we improve the 'short SHA-1 collide' test case in t3404 to
test specifically for the case where the rebase is resumed. We also
hard-code the expected colliding short SHA-1s, to document the
expectation (and to make it easier on future readers).

Note that we specifically test that the short commit ID is used in the
`git-rebase-todo.tmp` file: this file is created by the fake editor in
the test script and reflects the state that would have been presented to
the user to edit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c                   | 14 +++++++++++++-
 t/t3404-rebase-interactive.sh | 15 +++++++++++++--
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 7c30dad59c..c2945c699d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5076,7 +5076,7 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 {
 	const char *shortonto, *todo_file = rebase_path_todo();
 	struct todo_list new_todo = TODO_LIST_INIT;
-	struct strbuf *buf = &todo_list->buf;
+	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
 	struct object_id oid = onto->object.oid;
 	int res;
 
@@ -5128,6 +5128,18 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return -1;
 	}
 
+	/* Expand the commit IDs */
+	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
+	strbuf_swap(&new_todo.buf, &buf2);
+	strbuf_release(&buf2);
+	new_todo.total_nr -= new_todo.nr;
+	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0) {
+		fprintf(stderr, _(edit_todo_list_advice));
+		checkout_onto(r, opts, onto_name, &onto->object.oid, orig_head);
+		todo_list_release(&new_todo);
+		return error(_("invalid todo list after expanding IDs"));
+	}
+
 	if (opts->allow_ff && skip_unnecessary_picks(r, &new_todo, &oid)) {
 		todo_list_release(&new_todo);
 		return error(_("could not skip unnecessary pick commands"));
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index ae6e55ce79..1cc9f36bc7 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
 test_expect_success SHA1 'short SHA-1 collide' '
 	test_when_finished "reset_rebase && git checkout master" &&
 	git checkout collide &&
+	colliding_sha1=6bcda37 &&
+	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
 	(
 		unset test_tick &&
 		test_tick &&
 		set_fake_editor &&
 		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
-		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
-	)
+		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
+		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
+		grep "^pick $colliding_sha1 " \
+			.git/rebase-merge/git-rebase-todo.tmp &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo &&
+		git rebase --continue
+	) &&
+	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
+	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
+	test "$collide2" = "$collide3"
 '
 
 test_expect_success 'respect core.abbrev' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
  2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  2020-01-16 21:18 ` [PATCH 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
  2020-01-16 21:18 ` [PATCH 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
@ 2020-01-16 21:18 ` Johannes Schindelin via GitGitGadget
  2020-01-16 23:54 ` [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions brian m. carlson
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  4 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-16 21:18 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `rebase.missingCommitsCheck` is in effect, we use the backup of the
todo list that was copied just before the user was allowed to edit it.

That backup is, of course, just as susceptible to the hash collision as
the todo list itself: a reworded commit could make a previously
unambiguous short commit ID ambiguous all of a sudden.

So let's not just copy the todo list, but let's instead write out the
backup with expanded commit IDs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 rebase-interactive.c          | 8 +++++---
 t/t3404-rebase-interactive.sh | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/rebase-interactive.c b/rebase-interactive.c
index aa18ae82b7..1259adc8ea 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -104,9 +104,11 @@ int edit_todo_list(struct repository *r, struct todo_list *todo_list,
 				    -1, flags | TODO_LIST_SHORTEN_IDS | TODO_LIST_APPEND_TODO_HELP))
 		return error_errno(_("could not write '%s'"), todo_file);
 
-	if (initial && copy_file(rebase_path_todo_backup(), todo_file, 0666))
-		return error(_("could not copy '%s' to '%s'."), todo_file,
-			     rebase_path_todo_backup());
+	if (initial &&
+	    todo_list_write_to_file(r, todo_list, rebase_path_todo_backup(),
+				    shortrevisions, shortonto, -1,
+				    (flags | TODO_LIST_APPEND_TODO_HELP) & ~TODO_LIST_SHORTEN_IDS) < 0)
+		return error(_("could not write '%s'."), rebase_path_todo_backup());
 
 	if (launch_sequence_editor(todo_file, &new_todo->buf, NULL))
 		return -2;
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 1cc9f36bc7..b90ea0fe44 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1277,6 +1277,8 @@ test_expect_success SHA1 'short SHA-1 collide' '
 			.git/rebase-merge/git-rebase-todo.tmp &&
 		grep "^pick [0-9a-f]\{40\}" \
 			.git/rebase-merge/git-rebase-todo &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo.backup &&
 		git rebase --continue
 	) &&
 	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-01-16 21:18 ` [PATCH 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
@ 2020-01-16 23:54 ` brian m. carlson
  2020-01-17  9:51   ` Johannes Schindelin
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  4 siblings, 1 reply; 28+ messages in thread
From: brian m. carlson @ 2020-01-16 23:54 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin

[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]

On 2020-01-16 at 21:18:23, Johannes Schindelin via GitGitGadget wrote:
> Triggered by one of brian's SHA-256 patch series, I looked at the reason why
> the SHA-1 collision test case passed when it shouldn't. Turns out that the
> regression test was not quite thorough enough, and the interactive rebase 
> did regress recently.
> 
> While in the area, I realized that the same bug exists in the code backing
> the rebase.missingCommitsCheck feature: the backed-up todo list uses
> shortened commit IDs that may very well become ambiguous during the rebase.
> For good measure, this patch series fixes that, too.
> 
> Finally, I saw that git rebase --edit-todo reported the line in an awkward,
> maybe even incorrect, way when there was an ambiguous commit ID, and I also
> fixed that.
> 
> To make sure that the code can be easily adapted to SHA-256 after these
> patches, I actually already made those adjustments on top and offered them
> up at https://github.com/bk2204/git/pull/1.

This series looks great to me, and thanks for fixing this.

As mentioned in the PR, I'm happy for you to drop the SHA-256 patch into
this series if you like, or I can carry it in a future series.  Either
way is fine with me.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-16 23:54 ` [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions brian m. carlson
@ 2020-01-17  9:51   ` Johannes Schindelin
  2020-01-17 22:37     ` brian m. carlson
  0 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-17  9:51 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Johannes Schindelin via GitGitGadget, git, Alban Gruin,
	Eric Sunshine, Junio C Hamano

Hi brian,

On Thu, 16 Jan 2020, brian m. carlson wrote:

> On 2020-01-16 at 21:18:23, Johannes Schindelin via GitGitGadget wrote:
> > Triggered by one of brian's SHA-256 patch series, I looked at the reason why
> > the SHA-1 collision test case passed when it shouldn't. Turns out that the
> > regression test was not quite thorough enough, and the interactive rebase
> > did regress recently.
> >
> > While in the area, I realized that the same bug exists in the code backing
> > the rebase.missingCommitsCheck feature: the backed-up todo list uses
> > shortened commit IDs that may very well become ambiguous during the rebase.
> > For good measure, this patch series fixes that, too.
> >
> > Finally, I saw that git rebase --edit-todo reported the line in an awkward,
> > maybe even incorrect, way when there was an ambiguous commit ID, and I also
> > fixed that.
> >
> > To make sure that the code can be easily adapted to SHA-256 after these
> > patches, I actually already made those adjustments on top and offered them
> > up at https://github.com/bk2204/git/pull/1.
>
> This series looks great to me, and thanks for fixing this.
>
> As mentioned in the PR, I'm happy for you to drop the SHA-256 patch into
> this series if you like, or I can carry it in a future series.  Either
> way is fine with me.

Excellent. Given that the re-fix to avoid short commit ID collisions has
little to do with supporting SHA-256, I would like to keep the patch
series separate, then.

The question whether to move the SHA-256 support patch into your series is
more a question to Junio, i.e. which patch series will be merged down
faster.

Junio, what is your preference here?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-16 21:18 ` [PATCH 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
@ 2020-01-17 18:00   ` Junio C Hamano
  0 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2020-01-17 18:00 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, brian m. carlson, Alban Gruin, Eric Sunshine,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In the case that a `get_oid()` call failed, we showed some rather bogus
> part of the line instead of the precise string we sent to said function.
> That makes it rather hard for users to understand what is going wrong,
> so let's fix that.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  sequencer.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index b9dbf1adb0..7c30dad59c 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2118,6 +2118,8 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
>  	saved = *end_of_object_name;
>  	*end_of_object_name = '\0';
>  	status = get_oid(bol, &commit_oid);
> +	if (status < 0)
> +		error(_("could not parse '%s'"), bol); /* return later */
>  	*end_of_object_name = saved;

OK, so after making sure the line begins a command that takes an
object name, we first NUL terminated the token after the command
but it turns out the string is not a valid oid.  We show the part
we thought was the object name before reverting the NUL termination.

Makes sense.  And then later...

>  	bol = end_of_object_name + strspn(end_of_object_name, " \t");
> @@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
>  	item->arg_len = (int)(eol - bol);
>  
>  	if (status < 0)
> -		return error(_("could not parse '%.*s'"),
> -			     (int)(end_of_object_name - bol), bol);
> +		return status;

...this is the *only* code that uses the status that was assigned to
the return value of get_oid().  

Because the implementation of this function (mis)uses "bol", which
stands for "beginning of line", as a moving pointer of "beginning of
the token we are looking at", the value of "bol" at this point of
the control in the original code is not where the "bol" pointer used
to be when end-of-object-name was computed (if it were, the '%.*s'
argument would have been correct) and shows a token after where the
object name would have been, which may help the user identify the
line but does not directly show what token we had trouble with
parsing.

And the updated code will avoid the issue by using the "bol" pointer
when it is still pointing at the right place.


>  	item->commit = lookup_commit_reference(r, &commit_oid);
> -	return !item->commit;
> +	return item->commit ? 0 : -1;

This changes the polarity of the error exit from positive 1 to
negative 1.  The only caller of this function takes anything
non-zero as a failure so this would not cause behaviour change, but
returning negative is more in line with the practice so it is an
improvement (it is unrelated to the theme of this patch and is not
explained, though).

OK.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-16 21:18 ` [PATCH 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
@ 2020-01-17 18:30   ` Junio C Hamano
  2020-01-17 21:31     ` Johannes Schindelin
  0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2020-01-17 18:30 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, brian m. carlson, Alban Gruin, Eric Sunshine,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> However, the bug resurfaced as a side effect of 393adf7a6f6 (sequencer:
> directly call pick_commits() from complete_action(), 2019-11-24): as of
> this commit, the sequencer no longer re-reads the todo list after
> writing it out with expanded commit IDs.

Ouch.  The analysis above is quite understandable.

> @@ -5128,6 +5128,18 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>  		return -1;
>  	}
>  
> +	/* Expand the commit IDs */
> +	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
> +	strbuf_swap(&new_todo.buf, &buf2);
> +	strbuf_release(&buf2);
> +	new_todo.total_nr -= new_todo.nr;
> +	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0) {
> +		fprintf(stderr, _(edit_todo_list_advice));
> +		checkout_onto(r, opts, onto_name, &onto->object.oid, orig_head);
> +		todo_list_release(&new_todo);
> +		return error(_("invalid todo list after expanding IDs"));
> +	}

The above happens after edit_todo_list() returns and then the
resulting data (i.e. new_todo) that came from the on-disk file has
been parsed with an existing call to todo_lsit_parse_insn_buffer()?

I am wondering when this if() statement would trigger, iow, under
what condition the result of "Expand the commit IDs" operation fails
to be parsed correctly, and what the user can do to remedy it.
Especially given that incoming new_todo has passed the existing
parse and check without hitting "return -1" we see above in the
context, I am not sure if there is anything, other than any
potential glitch in the added code above, that can cause this second
parse_insn_buffer() to fail.  Shouldn't the body of if() be simply a
BUG()?

Or am I somehow missing a hunk that removes an existing call to
parse&check?

> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index ae6e55ce79..1cc9f36bc7 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
>  test_expect_success SHA1 'short SHA-1 collide' '
>  	test_when_finished "reset_rebase && git checkout master" &&
>  	git checkout collide &&
> +	colliding_sha1=6bcda37 &&
> +	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
>  	(
>  		unset test_tick &&
>  		test_tick &&
>  		set_fake_editor &&
>  		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
> -		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
> -	)
> +		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
> +		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> +		grep "^pick $colliding_sha1 " \
> +			.git/rebase-merge/git-rebase-todo.tmp &&
> +		grep "^pick [0-9a-f]\{40\}" \
> +			.git/rebase-merge/git-rebase-todo &&
> +		git rebase --continue
> +	) &&
> +	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
> +	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
> +	test "$collide2" = "$collide3"
>  '
>  
>  test_expect_success 'respect core.abbrev' '

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-17 18:30   ` Junio C Hamano
@ 2020-01-17 21:31     ` Johannes Schindelin
  0 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-17 21:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, brian m. carlson,
	Alban Gruin, Eric Sunshine

Hi Junio,

On Fri, 17 Jan 2020, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +	/* Expand the commit IDs */
> > +	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
> > +	strbuf_swap(&new_todo.buf, &buf2);
> > +	strbuf_release(&buf2);
> > +	new_todo.total_nr -= new_todo.nr;
> > +	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0) {
> > +		fprintf(stderr, _(edit_todo_list_advice));
> > +		checkout_onto(r, opts, onto_name, &onto->object.oid, orig_head);
> > +		todo_list_release(&new_todo);
> > +		return error(_("invalid todo list after expanding IDs"));
> > +	}
>
> The above happens after edit_todo_list() returns and then the
> resulting data (i.e. new_todo) that came from the on-disk file has
> been parsed with an existing call to todo_lsit_parse_insn_buffer()?
>
> I am wondering when this if() statement would trigger, iow, under
> what condition the result of "Expand the commit IDs" operation fails
> to be parsed correctly, and what the user can do to remedy it.

You're right, this is defensive code to guard against bugs, but the error
message suggests that it might be a user error: it isn't.

> Especially given that incoming new_todo has passed the existing
> parse and check without hitting "return -1" we see above in the
> context, I am not sure if there is anything, other than any
> potential glitch in the added code above, that can cause this second
> parse_insn_buffer() to fail.  Shouldn't the body of if() be simply a
> BUG()?

It should.

Will fix,
Dscho

> Or am I somehow missing a hunk that removes an existing call to
> parse&check?
>
> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > index ae6e55ce79..1cc9f36bc7 100755
> > --- a/t/t3404-rebase-interactive.sh
> > +++ b/t/t3404-rebase-interactive.sh
> > @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
> >  test_expect_success SHA1 'short SHA-1 collide' '
> >  	test_when_finished "reset_rebase && git checkout master" &&
> >  	git checkout collide &&
> > +	colliding_sha1=6bcda37 &&
> > +	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> >  	(
> >  		unset test_tick &&
> >  		test_tick &&
> >  		set_fake_editor &&
> >  		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
> > -		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
> > -	)
> > +		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
> > +		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> > +		grep "^pick $colliding_sha1 " \
> > +			.git/rebase-merge/git-rebase-todo.tmp &&
> > +		grep "^pick [0-9a-f]\{40\}" \
> > +			.git/rebase-merge/git-rebase-todo &&
> > +		git rebase --continue
> > +	) &&
> > +	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
> > +	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
> > +	test "$collide2" = "$collide3"
> >  '
> >
> >  test_expect_success 'respect core.abbrev' '
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-17  9:51   ` Johannes Schindelin
@ 2020-01-17 22:37     ` brian m. carlson
  2020-01-21 18:00       ` Junio C Hamano
  0 siblings, 1 reply; 28+ messages in thread
From: brian m. carlson @ 2020-01-17 22:37 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Alban Gruin,
	Eric Sunshine, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 952 bytes --]

On 2020-01-17 at 09:51:42, Johannes Schindelin wrote:
> > This series looks great to me, and thanks for fixing this.
> >
> > As mentioned in the PR, I'm happy for you to drop the SHA-256 patch into
> > this series if you like, or I can carry it in a future series.  Either
> > way is fine with me.
> 
> Excellent. Given that the re-fix to avoid short commit ID collisions has
> little to do with supporting SHA-256, I would like to keep the patch
> series separate, then.
> 
> The question whether to move the SHA-256 support patch into your series is
> more a question to Junio, i.e. which patch series will be merged down
> faster.

I need to do a reroll of part 8, so I'll pick it into a future series
(part 9) of test fixes and drop my patch.  That way it won't interfere
with either series making progress, but it will still be included at the
end.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-01-16 23:54 ` [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions brian m. carlson
@ 2020-01-17 23:38 ` Johannes Schindelin via GitGitGadget
  2020-01-17 23:38   ` [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
                     ` (3 more replies)
  4 siblings, 4 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-17 23:38 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin

Triggered by one of brian's SHA-256 patch series, I looked at the reason why
the SHA-1 collision test case passed when it shouldn't. Turns out that the
regression test was not quite thorough enough, and the interactive rebase 
did regress recently.

While in the area, I realized that the same bug exists in the code backing
the rebase.missingCommitsCheck feature: the backed-up todo list uses
shortened commit IDs that may very well become ambiguous during the rebase.
For good measure, this patch series fixes that, too.

Finally, I saw that git rebase --edit-todo reported the line in an awkward,
maybe even incorrect, way when there was an ambiguous commit ID, and I also
fixed that.

To make sure that the code can be easily adapted to SHA-256 after these
patches, I actually already made those adjustments on top and offered them
up at https://github.com/bk2204/git/pull/1.

Changes since v1:

 * Turned the error condition when parsing the todo list with just-expanded
   commit IDs failed into a BUG().

Johannes Schindelin (3):
  parse_insn_line(): improve error message when parsing failed
  rebase -i: re-fix short SHA-1 collision
  rebase -i: also avoid SHA-1 collisions with missingCommitsCheck

 rebase-interactive.c          |  8 +++++---
 sequencer.c                   | 18 ++++++++++++++----
 t/t3404-rebase-interactive.sh | 17 +++++++++++++++--
 3 files changed, 34 insertions(+), 9 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-529%2Fdscho%2Fre-fix-rebase-i-with-sha-collisions-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-529/dscho/re-fix-rebase-i-with-sha-collisions-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/529

Range-diff vs v1:

 1:  2ae2e435b0 = 1:  2ae2e435b0 parse_insn_line(): improve error message when parsing failed
 2:  ad50cd1b92 ! 2:  102fa568dc rebase -i: re-fix short SHA-1 collision
     @@ -66,12 +66,9 @@
      +	strbuf_swap(&new_todo.buf, &buf2);
      +	strbuf_release(&buf2);
      +	new_todo.total_nr -= new_todo.nr;
     -+	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0) {
     -+		fprintf(stderr, _(edit_todo_list_advice));
     -+		checkout_onto(r, opts, onto_name, &onto->object.oid, orig_head);
     -+		todo_list_release(&new_todo);
     -+		return error(_("invalid todo list after expanding IDs"));
     -+	}
     ++	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0)
     ++		BUG("invalid todo list after expanding IDs:\n%s",
     ++		    new_todo.buf.buf);
      +
       	if (opts->allow_ff && skip_unnecessary_picks(r, &new_todo, &oid)) {
       		todo_list_release(&new_todo);
 3:  e7d9ea8992 = 3:  486e9413a6 rebase -i: also avoid SHA-1 collisions with missingCommitsCheck

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
@ 2020-01-17 23:38   ` Johannes Schindelin via GitGitGadget
  2020-01-21 22:04     ` Junio C Hamano
  2020-01-17 23:38   ` [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-17 23:38 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the case that a `get_oid()` call failed, we showed some rather bogus
part of the line instead of the precise string we sent to said function.
That makes it rather hard for users to understand what is going wrong,
so let's fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index b9dbf1adb0..7c30dad59c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2118,6 +2118,8 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	saved = *end_of_object_name;
 	*end_of_object_name = '\0';
 	status = get_oid(bol, &commit_oid);
+	if (status < 0)
+		error(_("could not parse '%s'"), bol); /* return later */
 	*end_of_object_name = saved;
 
 	bol = end_of_object_name + strspn(end_of_object_name, " \t");
@@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	item->arg_len = (int)(eol - bol);
 
 	if (status < 0)
-		return error(_("could not parse '%.*s'"),
-			     (int)(end_of_object_name - bol), bol);
+		return status;
 
 	item->commit = lookup_commit_reference(r, &commit_oid);
-	return !item->commit;
+	return item->commit ? 0 : -1;
 }
 
 int sequencer_get_last_command(struct repository *r, enum replay_action *action)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2020-01-17 23:38   ` [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
@ 2020-01-17 23:38   ` Johannes Schindelin via GitGitGadget
  2020-01-20 16:49     ` Eric Sunshine
  2020-01-17 23:38   ` [PATCH v2 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
  2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  3 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-17 23:38 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 66ae9a57b88 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23), we added a test case that demonstrated how it is possible
that a previously unambiguous short commit ID could become ambiguous
*during* a rebase.

In 75c69766554 (rebase -i: fix short SHA-1 collision, 2013-08-23), we
fixed that problem simply by writing out the todo list with expanded
commit IDs (except *right* before letting the user edit the todo list,
in which case we shorten them, but we expand them right after the file
was edited).

However, the bug resurfaced as a side effect of 393adf7a6f6 (sequencer:
directly call pick_commits() from complete_action(), 2019-11-24): as of
this commit, the sequencer no longer re-reads the todo list after
writing it out with expanded commit IDs.

The only redeeming factor is that the todo list is already parsed at
that stage, including all the commits corresponding to the commands,
therefore the sequencer can continue even if the internal todo list has
short commit IDs.

That does not prevent problems, though: the sequencer writes out the
`done` and `git-rebase-todo` files incrementally (i.e. overwriting the
todo list with a version that has _short_ commit IDs), and if a merge
conflict happens, or if an `edit` or a `break` command is encountered, a
subsequent `git rebase --continue` _will_ re-read the todo list, opening
an opportunity for the "short SHA-1 collision" bug again.

To avoid that, let's make sure that we do expand the commit IDs in the
todo list as soon as we have parsed it after letting the user edit it.

Additionally, we improve the 'short SHA-1 collide' test case in t3404 to
test specifically for the case where the rebase is resumed. We also
hard-code the expected colliding short SHA-1s, to document the
expectation (and to make it easier on future readers).

Note that we specifically test that the short commit ID is used in the
`git-rebase-todo.tmp` file: this file is created by the fake editor in
the test script and reflects the state that would have been presented to
the user to edit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c                   | 11 ++++++++++-
 t/t3404-rebase-interactive.sh | 15 +++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 7c30dad59c..5f69b47e32 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5076,7 +5076,7 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 {
 	const char *shortonto, *todo_file = rebase_path_todo();
 	struct todo_list new_todo = TODO_LIST_INIT;
-	struct strbuf *buf = &todo_list->buf;
+	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
 	struct object_id oid = onto->object.oid;
 	int res;
 
@@ -5128,6 +5128,15 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return -1;
 	}
 
+	/* Expand the commit IDs */
+	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
+	strbuf_swap(&new_todo.buf, &buf2);
+	strbuf_release(&buf2);
+	new_todo.total_nr -= new_todo.nr;
+	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0)
+		BUG("invalid todo list after expanding IDs:\n%s",
+		    new_todo.buf.buf);
+
 	if (opts->allow_ff && skip_unnecessary_picks(r, &new_todo, &oid)) {
 		todo_list_release(&new_todo);
 		return error(_("could not skip unnecessary pick commands"));
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index ae6e55ce79..1cc9f36bc7 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
 test_expect_success SHA1 'short SHA-1 collide' '
 	test_when_finished "reset_rebase && git checkout master" &&
 	git checkout collide &&
+	colliding_sha1=6bcda37 &&
+	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
 	(
 		unset test_tick &&
 		test_tick &&
 		set_fake_editor &&
 		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
-		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
-	)
+		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
+		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
+		grep "^pick $colliding_sha1 " \
+			.git/rebase-merge/git-rebase-todo.tmp &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo &&
+		git rebase --continue
+	) &&
+	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
+	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
+	test "$collide2" = "$collide3"
 '
 
 test_expect_success 'respect core.abbrev' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2020-01-17 23:38   ` [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
  2020-01-17 23:38   ` [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
@ 2020-01-17 23:38   ` Johannes Schindelin via GitGitGadget
  2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  3 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-17 23:38 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `rebase.missingCommitsCheck` is in effect, we use the backup of the
todo list that was copied just before the user was allowed to edit it.

That backup is, of course, just as susceptible to the hash collision as
the todo list itself: a reworded commit could make a previously
unambiguous short commit ID ambiguous all of a sudden.

So let's not just copy the todo list, but let's instead write out the
backup with expanded commit IDs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 rebase-interactive.c          | 8 +++++---
 t/t3404-rebase-interactive.sh | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/rebase-interactive.c b/rebase-interactive.c
index aa18ae82b7..1259adc8ea 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -104,9 +104,11 @@ int edit_todo_list(struct repository *r, struct todo_list *todo_list,
 				    -1, flags | TODO_LIST_SHORTEN_IDS | TODO_LIST_APPEND_TODO_HELP))
 		return error_errno(_("could not write '%s'"), todo_file);
 
-	if (initial && copy_file(rebase_path_todo_backup(), todo_file, 0666))
-		return error(_("could not copy '%s' to '%s'."), todo_file,
-			     rebase_path_todo_backup());
+	if (initial &&
+	    todo_list_write_to_file(r, todo_list, rebase_path_todo_backup(),
+				    shortrevisions, shortonto, -1,
+				    (flags | TODO_LIST_APPEND_TODO_HELP) & ~TODO_LIST_SHORTEN_IDS) < 0)
+		return error(_("could not write '%s'."), rebase_path_todo_backup());
 
 	if (launch_sequence_editor(todo_file, &new_todo->buf, NULL))
 		return -2;
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 1cc9f36bc7..b90ea0fe44 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1277,6 +1277,8 @@ test_expect_success SHA1 'short SHA-1 collide' '
 			.git/rebase-merge/git-rebase-todo.tmp &&
 		grep "^pick [0-9a-f]\{40\}" \
 			.git/rebase-merge/git-rebase-todo &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo.backup &&
 		git rebase --continue
 	) &&
 	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-17 23:38   ` [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
@ 2020-01-20 16:49     ` Eric Sunshine
  2020-01-20 20:04       ` Johannes Schindelin
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Sunshine @ 2020-01-20 16:49 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git List, brian m. carlson, Alban Gruin, Junio C Hamano,
	Johannes Schindelin

On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
>  test_expect_success SHA1 'short SHA-1 collide' '
>         test_when_finished "reset_rebase && git checkout master" &&
>         git checkout collide &&
> +       colliding_sha1=6bcda37 &&
> +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&

How much do we care that this is introducing new code with git
upstream of a pipe (considering recent efforts to eradicate such
usage)? Same question regarding several other new instances introduce
by this patch.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-20 16:49     ` Eric Sunshine
@ 2020-01-20 20:04       ` Johannes Schindelin
  2020-01-20 20:08         ` Eric Sunshine
  0 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-20 20:04 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Johannes Schindelin via GitGitGadget, Git List, brian m. carlson,
	Alban Gruin, Junio C Hamano

Hi Eric,

On Mon, 20 Jan 2020, Eric Sunshine wrote:

> On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
> >  test_expect_success SHA1 'short SHA-1 collide' '
> >         test_when_finished "reset_rebase && git checkout master" &&
> >         git checkout collide &&
> > +       colliding_sha1=6bcda37 &&
> > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
>
> How much do we care that this is introducing new code with git
> upstream of a pipe (considering recent efforts to eradicate such
> usage)? Same question regarding several other new instances introduce
> by this patch.

I would argue that the test case will fail if the `git` call fails. So I
am not overly concerned if that `git` call is upstream of a pipe.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-20 20:04       ` Johannes Schindelin
@ 2020-01-20 20:08         ` Eric Sunshine
  2020-01-21 11:49           ` Johannes Schindelin
  2020-01-21 22:02           ` Junio C Hamano
  0 siblings, 2 replies; 28+ messages in thread
From: Eric Sunshine @ 2020-01-20 20:08 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, Git List, brian m. carlson,
	Alban Gruin, Junio C Hamano

On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Mon, 20 Jan 2020, Eric Sunshine wrote:
> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> >
> > How much do we care that this is introducing new code with git
> > upstream of a pipe (considering recent efforts to eradicate such
> > usage)? Same question regarding several other new instances introduce
> > by this patch.
>
> I would argue that the test case will fail if the `git` call fails. So I
> am not overly concerned if that `git` call is upstream of a pipe.

Unless the git command crashes _after_ it produces the correct output...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-20 20:08         ` Eric Sunshine
@ 2020-01-21 11:49           ` Johannes Schindelin
  2020-01-21 22:02           ` Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-21 11:49 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Johannes Schindelin via GitGitGadget, Git List, brian m. carlson,
	Alban Gruin, Junio C Hamano

Hi Eric,

On Mon, 20 Jan 2020, Eric Sunshine wrote:

> On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > On Mon, 20 Jan 2020, Eric Sunshine wrote:
> > > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> > > <gitgitgadget@gmail.com> wrote:
> > > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> > >
> > > How much do we care that this is introducing new code with git
> > > upstream of a pipe (considering recent efforts to eradicate such
> > > usage)? Same question regarding several other new instances introduce
> > > by this patch.
> >
> > I would argue that the test case will fail if the `git` call fails. So I
> > am not overly concerned if that `git` call is upstream of a pipe.
>
> Unless the git command crashes _after_ it produces the correct output...

Yes, that is true. In a very hypothetical way, of course.

:-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-17 22:37     ` brian m. carlson
@ 2020-01-21 18:00       ` Junio C Hamano
  0 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2020-01-21 18:00 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Alban Gruin, Eric Sunshine

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> The question whether to move the SHA-256 support patch into your series is
>> more a question to Junio, i.e. which patch series will be merged down
>> faster.
>
> I need to do a reroll of part 8, so I'll pick it into a future series
> (part 9) of test fixes and drop my patch.  That way it won't interfere
> with either series making progress, but it will still be included at the
> end.

Alright.  Thanks for coordinating between you two.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-20 20:08         ` Eric Sunshine
  2020-01-21 11:49           ` Johannes Schindelin
@ 2020-01-21 22:02           ` Junio C Hamano
  2020-01-22 14:10             ` Johannes Schindelin
  1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2020-01-21 22:02 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget,
	Git List, brian m. carlson, Alban Gruin

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> On Mon, 20 Jan 2020, Eric Sunshine wrote:
>> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
>> > <gitgitgadget@gmail.com> wrote:
>> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
>> >
>> > How much do we care that this is introducing new code with git
>> > upstream of a pipe (considering recent efforts to eradicate such
>> > usage)? Same question regarding several other new instances introduce
>> > by this patch.
>>
>> I would argue that the test case will fail if the `git` call fails. So I
>> am not overly concerned if that `git` call is upstream of a pipe.
>
> Unless the git command crashes _after_ it produces the correct output...

True.  Doesn't rev-parse have an appropriate option for this kind of
thing that gets rid of the need for "cut" in the first place?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-17 23:38   ` [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
@ 2020-01-21 22:04     ` Junio C Hamano
  2020-01-23 12:26       ` Johannes Schindelin
  0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2020-01-21 22:04 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, brian m. carlson, Alban Gruin, Eric Sunshine,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In the case that a `get_oid()` call failed, we showed some rather bogus
> part of the line instead of the precise string we sent to said function.
> That makes it rather hard for users to understand what is going wrong,
> so let's fix that.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ...
> @@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
>  	item->arg_len = (int)(eol - bol);
>  
>  	if (status < 0)
> -		return error(_("could not parse '%.*s'"),
> -			     (int)(end_of_object_name - bol), bol);
> +		return status;
>  
>  	item->commit = lookup_commit_reference(r, &commit_oid);
> -	return !item->commit;
> +	return item->commit ? 0 : -1;

This changes the polarity of the error exit from positive 1 to
negative 1.  The only caller of this function takes anything
non-zero as a failure so this would not cause behaviour change, but
returning negative is more in line with the practice so it is an
improvement.

It is unrelated to the theme of this patch, and the proposed log
message does not even mention it, though.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-21 22:02           ` Junio C Hamano
@ 2020-01-22 14:10             ` Johannes Schindelin
  2020-01-22 18:15               ` Junio C Hamano
  0 siblings, 1 reply; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-22 14:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Sunshine, Johannes Schindelin via GitGitGadget, Git List,
	brian m. carlson, Alban Gruin

Hi Junio,

On Tue, 21 Jan 2020, Junio C Hamano wrote:

> Eric Sunshine <sunshine@sunshineco.com> writes:
>
> > On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> >> On Mon, 20 Jan 2020, Eric Sunshine wrote:
> >> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> >> > <gitgitgadget@gmail.com> wrote:
> >> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> >> >
> >> > How much do we care that this is introducing new code with git
> >> > upstream of a pipe (considering recent efforts to eradicate such
> >> > usage)? Same question regarding several other new instances introduce
> >> > by this patch.
> >>
> >> I would argue that the test case will fail if the `git` call fails. So I
> >> am not overly concerned if that `git` call is upstream of a pipe.
> >
> > Unless the git command crashes _after_ it produces the correct output...
>
> True.  Doesn't rev-parse have an appropriate option for this kind of
> thing that gets rid of the need for "cut" in the first place?

You mean `git rev-parse --short=4`? That does something _sligthly_
different: it tries to shorten the OID to 4 characters _unless that would
be ambiguous_. In our case, it _will_ be ambiguous. That's why I use
`cut`.

As to the crash in `rev-parse` _after_ printing out the OID: yes, there is
a possibility for that. But that regression test is not about `rev-parse`,
so it is actually a good thing that it would not trigger on such a bug ;-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-22 14:10             ` Johannes Schindelin
@ 2020-01-22 18:15               ` Junio C Hamano
  0 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2020-01-22 18:15 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Eric Sunshine, Johannes Schindelin via GitGitGadget, Git List,
	brian m. carlson, Alban Gruin

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> True.  Doesn't rev-parse have an appropriate option for this kind of
>> thing that gets rid of the need for "cut" in the first place?
>
> You mean `git rev-parse --short=4`? That does something _sligthly_
> different: it tries to shorten the OID to 4 characters _unless that would
> be ambiguous_. In our case, it _will_ be ambiguous. That's why I use
> `cut`.

Ah, yes of course; we want ambiguous prefix.  I think a more
thorough test would be to see that the output with --short=$n (where
n is the length of the abbreviated object name in $colliding_sha1)
is longer than $colliding_sha1 and the output prefix-matches
$colliding_sha1 iow, something like

	abbreviated=$(git rev-parse --short=7 HEAD) &&
	case "$abbreviated" in
	"$colliding_sha1"?*) : happy ;;
	*) false ;;
	esac &&
	...

which would make sure that we are testing colliding case.


> As to the crash in `rev-parse` _after_ printing out the OID: yes, there is
> a possibility for that. But that regression test is not about `rev-parse`,
> so it is actually a good thing that it would not trigger on such a bug ;-)

No, I do not think this test should be about rev-parse working
correctly---just that if it is easy enough to make the test robust
enough against such a breakage, it would be nice to do so, that's
all.

I'm not Eric but I suspect his primary point was not about worrying
about rev-parse crashing but more about avoiding to add a pattern
less experienced developers can copy&paste without thinking enough
to realize why it would be OK here and not OK in the context of the
tests they are adding.  That would be what I would worry about more
than rev-parse crashing in the part of the test under discussion.

Thanks.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-21 22:04     ` Junio C Hamano
@ 2020-01-23 12:26       ` Johannes Schindelin
  0 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin @ 2020-01-23 12:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, brian m. carlson,
	Alban Gruin, Eric Sunshine

Hi Junio,

On Tue, 21 Jan 2020, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > In the case that a `get_oid()` call failed, we showed some rather bogus
> > part of the line instead of the precise string we sent to said function.
> > That makes it rather hard for users to understand what is going wrong,
> > so let's fix that.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ...
> > @@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
> >  	item->arg_len = (int)(eol - bol);
> >
> >  	if (status < 0)
> > -		return error(_("could not parse '%.*s'"),
> > -			     (int)(end_of_object_name - bol), bol);
> > +		return status;
> >
> >  	item->commit = lookup_commit_reference(r, &commit_oid);
> > -	return !item->commit;
> > +	return item->commit ? 0 : -1;
>
> This changes the polarity of the error exit from positive 1 to
> negative 1.  The only caller of this function takes anything
> non-zero as a failure so this would not cause behaviour change, but
> returning negative is more in line with the practice so it is an
> improvement.
>
> It is unrelated to the theme of this patch, and the proposed log
> message does not even mention it, though.

You're right. I amended the commit message. Will send out a v3 soon.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions
  2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-01-17 23:38   ` [PATCH v2 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
@ 2020-01-23 12:28   ` Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
                       ` (2 more replies)
  3 siblings, 3 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-23 12:28 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin

Triggered by one of brian's SHA-256 patch series, I looked at the reason why
the SHA-1 collision test case passed when it shouldn't. Turns out that the
regression test was not quite thorough enough, and the interactive rebase 
did regress recently.

While in the area, I realized that the same bug exists in the code backing
the rebase.missingCommitsCheck feature: the backed-up todo list uses
shortened commit IDs that may very well become ambiguous during the rebase.
For good measure, this patch series fixes that, too.

Finally, I saw that git rebase --edit-todo reported the line in an awkward,
maybe even incorrect, way when there was an ambiguous commit ID, and I also
fixed that.

To make sure that the code can be easily adapted to SHA-256 after these
patches, I actually already made those adjustments on top and offered them
up at https://github.com/bk2204/git/pull/1.

Changes since v2:

 * Clarified in the first commit message that the change from a positive
   return value to a negative return value (to indicate failure, in both
   cases) both was intentional, and does not require any change in the
   corresponding function's only caller.

Changes since v1:

 * Turned the error condition when parsing the todo list with just-expanded
   commit IDs failed into a BUG().

Johannes Schindelin (3):
  parse_insn_line(): improve error message when parsing failed
  rebase -i: re-fix short SHA-1 collision
  rebase -i: also avoid SHA-1 collisions with missingCommitsCheck

 rebase-interactive.c          |  8 +++++---
 sequencer.c                   | 18 ++++++++++++++----
 t/t3404-rebase-interactive.sh | 17 +++++++++++++++--
 3 files changed, 34 insertions(+), 9 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-529%2Fdscho%2Fre-fix-rebase-i-with-sha-collisions-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-529/dscho/re-fix-rebase-i-with-sha-collisions-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/529

Range-diff vs v2:

 1:  2ae2e435b0 ! 1:  6f5c7b0325 parse_insn_line(): improve error message when parsing failed
     @@ -7,6 +7,11 @@
          That makes it rather hard for users to understand what is going wrong,
          so let's fix that.
      
     +    While at it, return a negative value from `parse_insn_line()` in case of
     +    an error, as per our convention. This function's only caller,
     +    `todo_list_parse_insn_buffer()`, cares only whether that return value is
     +    non-zero or not, i.e. does not need to be changed.
     +
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       diff --git a/sequencer.c b/sequencer.c
 2:  102fa568dc = 2:  595ba81e09 rebase -i: re-fix short SHA-1 collision
 3:  486e9413a6 = 3:  b7b063408f rebase -i: also avoid SHA-1 collisions with missingCommitsCheck

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 1/3] parse_insn_line(): improve error message when parsing failed
  2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
@ 2020-01-23 12:28     ` Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-23 12:28 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the case that a `get_oid()` call failed, we showed some rather bogus
part of the line instead of the precise string we sent to said function.
That makes it rather hard for users to understand what is going wrong,
so let's fix that.

While at it, return a negative value from `parse_insn_line()` in case of
an error, as per our convention. This function's only caller,
`todo_list_parse_insn_buffer()`, cares only whether that return value is
non-zero or not, i.e. does not need to be changed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index b9dbf1adb0..7c30dad59c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2118,6 +2118,8 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	saved = *end_of_object_name;
 	*end_of_object_name = '\0';
 	status = get_oid(bol, &commit_oid);
+	if (status < 0)
+		error(_("could not parse '%s'"), bol); /* return later */
 	*end_of_object_name = saved;
 
 	bol = end_of_object_name + strspn(end_of_object_name, " \t");
@@ -2125,11 +2127,10 @@ static int parse_insn_line(struct repository *r, struct todo_item *item,
 	item->arg_len = (int)(eol - bol);
 
 	if (status < 0)
-		return error(_("could not parse '%.*s'"),
-			     (int)(end_of_object_name - bol), bol);
+		return status;
 
 	item->commit = lookup_commit_reference(r, &commit_oid);
-	return !item->commit;
+	return item->commit ? 0 : -1;
 }
 
 int sequencer_get_last_command(struct repository *r, enum replay_action *action)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 2/3] rebase -i: re-fix short SHA-1 collision
  2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
@ 2020-01-23 12:28     ` Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-23 12:28 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 66ae9a57b88 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23), we added a test case that demonstrated how it is possible
that a previously unambiguous short commit ID could become ambiguous
*during* a rebase.

In 75c69766554 (rebase -i: fix short SHA-1 collision, 2013-08-23), we
fixed that problem simply by writing out the todo list with expanded
commit IDs (except *right* before letting the user edit the todo list,
in which case we shorten them, but we expand them right after the file
was edited).

However, the bug resurfaced as a side effect of 393adf7a6f6 (sequencer:
directly call pick_commits() from complete_action(), 2019-11-24): as of
this commit, the sequencer no longer re-reads the todo list after
writing it out with expanded commit IDs.

The only redeeming factor is that the todo list is already parsed at
that stage, including all the commits corresponding to the commands,
therefore the sequencer can continue even if the internal todo list has
short commit IDs.

That does not prevent problems, though: the sequencer writes out the
`done` and `git-rebase-todo` files incrementally (i.e. overwriting the
todo list with a version that has _short_ commit IDs), and if a merge
conflict happens, or if an `edit` or a `break` command is encountered, a
subsequent `git rebase --continue` _will_ re-read the todo list, opening
an opportunity for the "short SHA-1 collision" bug again.

To avoid that, let's make sure that we do expand the commit IDs in the
todo list as soon as we have parsed it after letting the user edit it.

Additionally, we improve the 'short SHA-1 collide' test case in t3404 to
test specifically for the case where the rebase is resumed. We also
hard-code the expected colliding short SHA-1s, to document the
expectation (and to make it easier on future readers).

Note that we specifically test that the short commit ID is used in the
`git-rebase-todo.tmp` file: this file is created by the fake editor in
the test script and reflects the state that would have been presented to
the user to edit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c                   | 11 ++++++++++-
 t/t3404-rebase-interactive.sh | 15 +++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 7c30dad59c..5f69b47e32 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5076,7 +5076,7 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 {
 	const char *shortonto, *todo_file = rebase_path_todo();
 	struct todo_list new_todo = TODO_LIST_INIT;
-	struct strbuf *buf = &todo_list->buf;
+	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
 	struct object_id oid = onto->object.oid;
 	int res;
 
@@ -5128,6 +5128,15 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return -1;
 	}
 
+	/* Expand the commit IDs */
+	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
+	strbuf_swap(&new_todo.buf, &buf2);
+	strbuf_release(&buf2);
+	new_todo.total_nr -= new_todo.nr;
+	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0)
+		BUG("invalid todo list after expanding IDs:\n%s",
+		    new_todo.buf.buf);
+
 	if (opts->allow_ff && skip_unnecessary_picks(r, &new_todo, &oid)) {
 		todo_list_release(&new_todo);
 		return error(_("could not skip unnecessary pick commands"));
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index ae6e55ce79..1cc9f36bc7 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
 test_expect_success SHA1 'short SHA-1 collide' '
 	test_when_finished "reset_rebase && git checkout master" &&
 	git checkout collide &&
+	colliding_sha1=6bcda37 &&
+	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
 	(
 		unset test_tick &&
 		test_tick &&
 		set_fake_editor &&
 		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
-		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
-	)
+		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
+		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
+		grep "^pick $colliding_sha1 " \
+			.git/rebase-merge/git-rebase-todo.tmp &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo &&
+		git rebase --continue
+	) &&
+	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
+	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
+	test "$collide2" = "$collide3"
 '
 
 test_expect_success 'respect core.abbrev' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
  2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
  2020-01-23 12:28     ` [PATCH v3 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
@ 2020-01-23 12:28     ` Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 28+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-01-23 12:28 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, Alban Gruin, Eric Sunshine, Junio C Hamano,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `rebase.missingCommitsCheck` is in effect, we use the backup of the
todo list that was copied just before the user was allowed to edit it.

That backup is, of course, just as susceptible to the hash collision as
the todo list itself: a reworded commit could make a previously
unambiguous short commit ID ambiguous all of a sudden.

So let's not just copy the todo list, but let's instead write out the
backup with expanded commit IDs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 rebase-interactive.c          | 8 +++++---
 t/t3404-rebase-interactive.sh | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/rebase-interactive.c b/rebase-interactive.c
index aa18ae82b7..1259adc8ea 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -104,9 +104,11 @@ int edit_todo_list(struct repository *r, struct todo_list *todo_list,
 				    -1, flags | TODO_LIST_SHORTEN_IDS | TODO_LIST_APPEND_TODO_HELP))
 		return error_errno(_("could not write '%s'"), todo_file);
 
-	if (initial && copy_file(rebase_path_todo_backup(), todo_file, 0666))
-		return error(_("could not copy '%s' to '%s'."), todo_file,
-			     rebase_path_todo_backup());
+	if (initial &&
+	    todo_list_write_to_file(r, todo_list, rebase_path_todo_backup(),
+				    shortrevisions, shortonto, -1,
+				    (flags | TODO_LIST_APPEND_TODO_HELP) & ~TODO_LIST_SHORTEN_IDS) < 0)
+		return error(_("could not write '%s'."), rebase_path_todo_backup());
 
 	if (launch_sequence_editor(todo_file, &new_todo->buf, NULL))
 		return -2;
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 1cc9f36bc7..b90ea0fe44 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1277,6 +1277,8 @@ test_expect_success SHA1 'short SHA-1 collide' '
 			.git/rebase-merge/git-rebase-todo.tmp &&
 		grep "^pick [0-9a-f]\{40\}" \
 			.git/rebase-merge/git-rebase-todo &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo.backup &&
 		git rebase --continue
 	) &&
 	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-01-23 12:28 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-16 21:18 [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
2020-01-16 21:18 ` [PATCH 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
2020-01-17 18:00   ` Junio C Hamano
2020-01-16 21:18 ` [PATCH 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
2020-01-17 18:30   ` Junio C Hamano
2020-01-17 21:31     ` Johannes Schindelin
2020-01-16 21:18 ` [PATCH 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
2020-01-16 23:54 ` [PATCH 0/3] Re-fix rebase -i with SHA-1 collisions brian m. carlson
2020-01-17  9:51   ` Johannes Schindelin
2020-01-17 22:37     ` brian m. carlson
2020-01-21 18:00       ` Junio C Hamano
2020-01-17 23:38 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
2020-01-17 23:38   ` [PATCH v2 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
2020-01-21 22:04     ` Junio C Hamano
2020-01-23 12:26       ` Johannes Schindelin
2020-01-17 23:38   ` [PATCH v2 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
2020-01-20 16:49     ` Eric Sunshine
2020-01-20 20:04       ` Johannes Schindelin
2020-01-20 20:08         ` Eric Sunshine
2020-01-21 11:49           ` Johannes Schindelin
2020-01-21 22:02           ` Junio C Hamano
2020-01-22 14:10             ` Johannes Schindelin
2020-01-22 18:15               ` Junio C Hamano
2020-01-17 23:38   ` [PATCH v2 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget
2020-01-23 12:28   ` [PATCH v3 0/3] Re-fix rebase -i with SHA-1 collisions Johannes Schindelin via GitGitGadget
2020-01-23 12:28     ` [PATCH v3 1/3] parse_insn_line(): improve error message when parsing failed Johannes Schindelin via GitGitGadget
2020-01-23 12:28     ` [PATCH v3 2/3] rebase -i: re-fix short SHA-1 collision Johannes Schindelin via GitGitGadget
2020-01-23 12:28     ` [PATCH v3 3/3] rebase -i: also avoid SHA-1 collisions with missingCommitsCheck Johannes Schindelin via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).