git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/9] The final building block for a faster rebase -i
@ 2016-09-02 16:22 Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
                   ` (9 more replies)
  0 siblings, 10 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:22 UTC
  To: git; +Cc: Junio C Hamano

This patch series reimplements the expensive pre- and post-processing of
the todo script in C.

And it concludes the work I did to accelerate rebase -i.


Johannes Schindelin (9):
  rebase -i: generate the script via rebase--helper
  rebase -i: remove useless indentation
  rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  t3404: relax rebase.missingCommitsCheck tests
  rebase -i: check for missing commits in the rebase--helper
  rebase -i: skip unnecessary picks using the rebase--helper
  t3415: test fixup with wrapped oneline
  rebase -i: rearrange fixup/squash lines using the rebase--helper

 builtin/rebase--helper.c      |  29 ++-
 git-rebase--interactive.sh    | 362 ++++-------------------------
 sequencer.c                   | 514 ++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                   |   7 +
 t/t3404-rebase-interactive.sh |  22 +-
 t/t3415-rebase-autosquash.sh  |  16 +-
 6 files changed, 614 insertions(+), 336 deletions(-)

Based-On: rebase--helper at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v1
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v1

-- 
2.9.3.windows.3

base-commit: 4c39918f42eb8228ea4241073f86f2ac851f4636


* [PATCH 1/9] rebase -i: generate the script via rebase--helper
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 2/9] rebase -i: remove useless indentation Johannes Schindelin
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending on whether to keep "empty" commits (i.e. commits
that do not change any files).
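
As a rough illustration of that conditional editing (a sketch only, not
the code in git-rebase--interactive.sh): the helper name todo_line and its
flag arguments are made up here so that the example runs without a
repository. Each commit becomes a "pick" line, and an empty commit is
commented out unless empty commits are to be kept.

```shell
# Hypothetical helper, for illustration only; not part of Git.
# Usage: todo_line <keep_empty: t or ""> <is_empty: t or ""> <sha1> <oneline>
comment_char='#'

todo_line () {
	keep_empty=$1 is_empty=$2 sha1=$3 oneline=$4
	comment_out=
	# Empty commits are commented out unless the user asked to keep them.
	if test -z "$keep_empty" && test -n "$is_empty"
	then
		comment_out="$comment_char "
	fi
	printf '%s\n' "${comment_out}pick $sha1 $oneline"
}

todo_line "" "" 1234abc "Fix typo"      # pick 1234abc Fix typo
todo_line "" t 5678def "Empty commit"   # # pick 5678def Empty commit
todo_line t t 5678def "Empty commit"    # pick 5678def Empty commit
```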

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster: on Windows, the speedup over the shell
script version of the interactive rebase improves from 2x to 3x.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  8 +++++++-
 git-rebase--interactive.sh | 44 +++++++++++++++++++++++---------------------
 sequencer.c                | 44 ++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index ca1ebb2..821058d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
 	struct replay_opts opts = REPLAY_OPTS_INIT;
+	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 				CONTINUE),
 		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
 				ABORT),
+		OPT_CMDMODE(0, "make-script", &command,
+			N_("make rebase script"), MAKE_SCRIPT),
 		OPT_END()
 	};
 
@@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_continue(&opts);
 	if (command == ABORT && argc == 1)
 		return !!sequencer_remove_state(&opts);
+	if (command == MAKE_SCRIPT && argc > 1)
+		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 022766b..01c9fec 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -775,6 +775,7 @@ collapse_todo_ids() {
 # each log message will be re-retrieved in order to normalize the
 # autosquash arrangement
 rearrange_squash () {
+	format=$(git config --get rebase.instructionFormat)
 	# extract fixup!/squash! lines and resolve any referenced sha1's
 	while read -r pick sha1 message
 	do
@@ -1203,26 +1204,27 @@ else
 	revisions=$onto...$orig_head
 	shortrevisions=$shorthead
 fi
-format=$(git config --get rebase.instructionFormat)
-# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
-git rev-list $merges_option --format="%m%H ${format:-%s}" \
-	--reverse --left-right --topo-order \
-	$revisions ${restrict_revision+^$restrict_revision} | \
-	sed -n "s/^>//p" |
-while read -r sha1 rest
-do
-
-	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
-	then
-		comment_out="$comment_char "
-	else
-		comment_out=
-	fi
+if test t != "$preserve_merges"
+then
+	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
+else
+	format=$(git config --get rebase.instructionFormat)
+	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
+	git rev-list $merges_option --format="%m%H ${format:-%s}" \
+		--reverse --left-right --topo-order \
+		$revisions ${restrict_revision+^$restrict_revision} | \
+		sed -n "s/^>//p" |
+	while read -r sha1 rest
+	do
+
+		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
+		then
+			comment_out="$comment_char "
+		else
+			comment_out=
+		fi
 
-	if test t != "$preserve_merges"
-	then
-		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
-	else
 		if test -z "$rebase_root"
 		then
 			preserve=t
@@ -1241,8 +1243,8 @@ do
 			touch "$rewritten"/$sha1
 			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
 		fi
-	fi
-done
+	done
+fi
 
 # Watch for commits that been dropped by --cherry-pick
 if test t = "$preserve_merges"
diff --git a/sequencer.c b/sequencer.c
index c0c6661..43e078a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2347,3 +2347,47 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
 
 	strbuf_release(&sob);
 }
+
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv)
+{
+	char *format = "%s";
+	struct pretty_print_context pp = {0};
+	struct strbuf buf = STRBUF_INIT;
+	struct rev_info revs;
+	struct commit *commit;
+
+	init_revisions(&revs, NULL);
+	revs.verbose_header = 1;
+	revs.max_parents = 1;
+	revs.cherry_pick = 1;
+	revs.limited = 1;
+	revs.reverse = 1;
+	revs.right_only = 1;
+	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
+	revs.topo_order = 1;
+
+	revs.pretty_given = 1;
+	git_config_get_string("rebase.instructionFormat", &format);
+	get_commit_format(format, &revs);
+	pp.fmt = revs.commit_format;
+	pp.output_encoding = get_log_output_encoding();
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		return error("make_script: unhandled options");
+
+	if (prepare_revision_walk(&revs) < 0)
+		return error("make_script: error preparing revisions");
+
+	while ((commit = get_revision(&revs))) {
+		strbuf_reset(&buf);
+		if (!keep_empty && is_original_commit_empty(commit))
+			strbuf_addf(&buf, "%c ", comment_line_char);
+		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
+		pretty_print_commit(&pp, commit, &buf);
+		strbuf_addch(&buf, '\n');
+		fputs(buf.buf, out);
+	}
+	strbuf_release(&buf);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index fd2a719..bc524be 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -58,6 +58,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 int sequencer_commit(const char *defmsg, struct replay_opts *opts,
 			  int allow_empty, int edit, int amend,
 			  int cleanup_commit_message);
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv);
 
 extern const char sign_off_header[];
 
-- 
2.9.3.windows.3




* [PATCH 2/9] rebase -i: remove useless indentation
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: once we use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 01c9fec..5df5850 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -146,13 +146,13 @@ reschedule_last_action () {
 append_todo_help () {
 	gettext "
 Commands:
- p, pick = use commit
- r, reword = use commit, but edit the commit message
- e, edit = use commit, but stop for amending
- s, squash = use commit, but meld into previous commit
- f, fixup = like \"squash\", but discard this commit's log message
- x, exec = run command (the rest of the line) using shell
- d, drop = remove commit
+p, pick = use commit
+r, reword = use commit, but edit the commit message
+e, edit = use commit, but stop for amending
+s, squash = use commit, but meld into previous commit
+f, fixup = like \"squash\", but discard this commit's log message
+x, exec = run command (the rest of the line) using shell
+d, drop = remove commit
 
 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
-- 
2.9.3.windows.3




* [PATCH 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 2/9] rebase -i: remove useless indentation Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is entirely possible to generate a todo script by means other than
rebase -i and then to let rebase -i run with it; in this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".
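
The effect can be reproduced in isolation. The sketch below is a
simplification under stated assumptions: fake_rev_parse is a made-up
stand-in for `git rev-parse --verify --quiet` so that no repository is
needed, and the patterns match a space only, whereas the real code also
matches a tab. When $rest contains no whitespace, ${rest#* } leaves it
unchanged, so the old code appends the short SHA-1 as if it were a oneline.

```shell
# Made-up stand-in for `git rev-parse --verify --quiet`: "expands" a short
# name by appending a fixed suffix.
fake_rev_parse () {
	printf '%s\n' "${1}ffffffff"
}

# Old behaviour: always append whatever follows the first space.
expand_old () {
	rest=$1
	sha1=$(fake_rev_parse "${rest%% *}") &&
	rest="$sha1 ${rest#* }"
	printf '%s\n' "$rest"
}

# Fixed behaviour: only append a oneline when there is one.
expand_new () {
	rest=$1
	sha1=$(fake_rev_parse "${rest%% *}") &&
	if test "a$rest" = "a${rest#* }"
	then
		rest=$sha1
	else
		rest="$sha1 ${rest#* }"
	fi
	printf '%s\n' "$rest"
}

expand_old "abc1234"            # abc1234ffffffff abc1234
expand_new "abc1234"            # abc1234ffffffff
expand_new "abc1234 Fix typo"   # abc1234ffffffff Fix typo
```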

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 5df5850..0eb5583 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -750,7 +750,12 @@ transform_todo_ids () {
 			;;
 		*)
 			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			rest="$sha1 ${rest#*[	 ]}"
+			if test "a$rest" = "a${rest#*[	 ]}"
+			then
+				rest=$sha1
+			else
+				rest="$sha1 ${rest#*[	 ]}"
+			fi
 			;;
 		esac
 		printf '%s\n' "$command${rest:+ }$rest"
-- 
2.9.3.windows.3




* [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (2 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 20:56   ` Dennis Kaarsemaker
  2016-09-02 16:23 ` [PATCH 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).
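
To get a feel for that cost, the sketch below counts how many SHA-1
lookups a todo list needs, i.e. roughly how many rev-parse processes a
per-line shell loop would spawn. The helper name count_sha_lookups is
made up, and treating comment lines, blank lines and exec lines as
needing no lookup is a simplifying assumption.

```shell
# Hypothetical helper: read a todo list on stdin and print the number of
# lines whose first argument would have to be resolved to a commit.
count_sha_lookups () {
	n=0
	while read -r command rest
	do
		case "$command" in
		'#'*|''|x|exec)
			# No SHA-1 at the beginning of $rest.
			;;
		*)
			n=$((n + 1))
			;;
		esac
	done
	printf '%s\n' "$n"
}

count_sha_lookups <<'EOF'
pick 1234abc Fix typo
# a comment
exec make test
fixup 5678def fixup! Fix typo
EOF
```

For the four-line list above this prints 2; a real todo list with
hundreds of picks needs a lookup for each one.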

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   | 10 +++++++-
 git-rebase--interactive.sh |  4 ++--
 sequencer.c                | 59 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 821058d..9444c8d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 				ABORT),
 		OPT_CMDMODE(0, "make-script", &command,
 			N_("make rebase script"), MAKE_SCRIPT),
+		OPT_CMDMODE(0, "shorten-sha1s", &command,
+			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+		OPT_CMDMODE(0, "expand-sha1s", &command,
+			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_END()
 	};
 
@@ -42,5 +46,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_remove_state(&opts);
 	if (command == MAKE_SCRIPT && argc > 1)
 		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
+	if (command == SHORTEN_SHA1S && argc == 1)
+		return !!transform_todo_ids(1);
+	if (command == EXPAND_SHA1S && argc == 1)
+		return !!transform_todo_ids(0);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 0eb5583..f642ec2 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -764,11 +764,11 @@ transform_todo_ids () {
 }
 
 expand_todo_ids() {
-	transform_todo_ids
+	git rebase--helper --expand-sha1s
 }
 
 collapse_todo_ids() {
-	transform_todo_ids --short
+	git rebase--helper --shorten-sha1s
 }
 
 # Rearrange the todo list that has both "pick sha1 msg" and
diff --git a/sequencer.c b/sequencer.c
index 43e078a..ee4fdb0 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2391,3 +2391,62 @@ int sequencer_make_script(int keep_empty, FILE *out,
 	strbuf_release(&buf);
 	return 0;
 }
+
+
+int transform_todo_ids(int shorten_sha1s)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	int fd, res, i;
+	FILE *out;
+
+	strbuf_reset(&todo_list.buf);
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("Could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("Could not read %s."), todo_file);
+	}
+	close(fd);
+
+	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
+	if (res) {
+		todo_list_release(&todo_list);
+		return error(_("Unusable instruction sheet: %s"), todo_file);
+	}
+
+	out = fopen(todo_file, "w");
+	if (!out) {
+		todo_list_release(&todo_list);
+		return error(_("Unable to open '%s' for writing"), todo_file);
+	}
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+		int bol = item->offset_in_buf;
+		const char *p = todo_list.buf.buf + bol;
+		int eol = i + 1 < todo_list.nr ?
+			todo_list.items[i + 1].offset_in_buf :
+			todo_list.buf.len;
+
+		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
+			fwrite(p, eol - bol, 1, out);
+		else {
+			int eoc = strcspn(p, " \t");
+			const char *sha1 = shorten_sha1s ?
+				short_commit_name(item->commit) :
+				oid_to_hex(&item->commit->object.oid);
+
+			if (!eoc) {
+				p += strspn(p, " \t");
+				eoc = strcspn(p, " \t");
+			}
+
+			fprintf(out, "%.*s %s %.*s\n",
+				eoc, p, sha1, item->arg_len, item->arg);
+		}
+	}
+	fclose(out);
+	todo_list_release(&todo_list);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index bc524be..5feb525 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -61,6 +61,8 @@ int sequencer_commit(const char *defmsg, struct replay_opts *opts,
 int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
+int transform_todo_ids(int shorten_sha1s);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.9.3.windows.3




* [PATCH 5/9] t3404: relax rebase.missingCommitsCheck tests
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (3 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.
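
The difference between the two styles of check can be sketched outside
the test suite. In the example below, actual_message and mentions are
made-up helpers, and the message text is a sample modelled on the old
expected output with a placeholder commit name. The "." wildcards stand
in for the single quotes, which are awkward to write inside a
single-quoted test body.

```shell
# Sample message, standing in for what git rebase prints on stderr.
actual_message () {
	cat <<'EOF'
Warning: the command isn't recognized in the following line:
 - badcmd 1234abc Some subject

You can fix this with 'git rebase --edit-todo'.
Or you can abort the rebase with 'git rebase --abort'.
EOF
}

# Succeed if some line of the message matches the given pattern.
mentions () {
	actual_message | grep -q -e "$1"
}

mentions "badcmd 1234abc" && echo "names the offending line"
mentions "You can fix this with .git rebase --edit-todo.." && echo "offers a fix"
mentions "Warning: unknown command" || echo "exact wording is not checked"
```

Only the first two patterns need to keep matching when the surrounding
wording changes.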

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3404-rebase-interactive.sh | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 597e94e..a18759e 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1215,20 +1215,13 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' '
 	test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the command isn't recognized in the following line:
- - badcmd $(git rev-list --oneline -1 master~1)
-
-You can fix this with 'git rebase --edit-todo'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad command' '
 	rebase_setup_and_clean bad-cmd &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
@@ -1250,20 +1243,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - edit XXXXXXX False commit
-
-You can fix this with 'git rebase --edit-todo'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad SHA-1' '
 	rebase_setup_and_clean bad-sha &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "edit XXXXXXX False commit" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
-- 
2.9.3.windows.3




* [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (4 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 20:59   ` Dennis Kaarsemaker
  2016-09-02 16:23 ` [PATCH 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC
  To: git; +Cc: Junio C Hamano

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
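
At its core, the check being moved is a set difference between the
commits listed before and after editing. A minimal sketch of that part
alone, assuming the SHA-1s have already been extracted into two files
with one name per line (the helper name missing_commits is made up, and
mktemp is assumed to be available):

```shell
# Print the commits that appear in the old list ($1) but not in the new
# list ($2). Hypothetical helper, for illustration only.
missing_commits () {
	old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
	# comm needs sorted input; -u also drops duplicates.
	sort -u "$1" >"$old_sorted"
	sort -u "$2" >"$new_sorted"
	# Suppress lines unique to the new list and lines common to both.
	comm -2 -3 "$old_sorted" "$new_sorted"
	rm -f "$old_sorted" "$new_sorted"
}

tmp=$(mktemp -d)
printf '%s\n' aaa111 bbb222 ccc333 >"$tmp/before"
printf '%s\n' ccc333 aaa111 >"$tmp/after"
missing_commits "$tmp/before" "$tmp/after"   # prints bbb222
rm -rf "$tmp"
```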

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   7 +-
 git-rebase--interactive.sh | 164 ++-------------------------------------------
 sequencer.c                | 124 ++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 136 insertions(+), 160 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 9444c8d..e706eac 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+		CHECK_TODO_LIST
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -28,6 +29,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
 		OPT_CMDMODE(0, "expand-sha1s", &command,
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+		OPT_CMDMODE(0, "check-todo-list", &command,
+			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_END()
 	};
 
@@ -50,5 +53,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(1);
 	if (command == EXPAND_SHA1S && argc == 1)
 		return !!transform_todo_ids(0);
+	if (command == CHECK_TODO_LIST && argc == 1)
+		return !!check_todo_list();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index f642ec2..02a7698 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -880,96 +880,6 @@ add_exec_commands () {
 	mv "$1.new" "$1"
 }
 
-# Check if the SHA-1 passed as an argument is a
-# correct one, if not then print $2 in "$todo".badsha
-# $1: the SHA-1 to test
-# $2: the line number of the input
-# $3: the input filename
-check_commit_sha () {
-	badsha=0
-	if test -z "$1"
-	then
-		badsha=1
-	else
-		sha1_verif="$(git rev-parse --verify --quiet $1^{commit})"
-		if test -z "$sha1_verif"
-		then
-			badsha=1
-		fi
-	fi
-
-	if test $badsha -ne 0
-	then
-		line="$(sed -n -e "${2}p" "$3")"
-		warn "$(eval_gettext "\
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - \$line")"
-		warn
-	fi
-
-	return $badsha
-}
-
-# prints the bad commits and bad commands
-# from the todolist in stdin
-check_bad_cmd_and_sha () {
-	retval=0
-	lineno=0
-	while read -r command rest
-	do
-		lineno=$(( $lineno + 1 ))
-		case $command in
-		"$comment_char"*|''|noop|x|exec)
-			# Doesn't expect a SHA-1
-			;;
-		"$cr")
-			# Work around CR left by "read" (e.g. with Git for
-			# Windows' Bash).
-			;;
-		pick|p|drop|d|reword|r|edit|e|squash|s|fixup|f)
-			if ! check_commit_sha "${rest%%[ 	]*}" "$lineno" "$1"
-			then
-				retval=1
-			fi
-			;;
-		*)
-			line="$(sed -n -e "${lineno}p" "$1")"
-			warn "$(eval_gettext "\
-Warning: the command isn't recognized in the following line:
- - \$line")"
-			warn
-			retval=1
-			;;
-		esac
-	done <"$1"
-	return $retval
-}
-
-# Print the list of the SHA-1 of the commits
-# from stdin to stdout
-todo_list_to_sha_list () {
-	git stripspace --strip-comments |
-	while read -r command sha1 rest
-	do
-		case $command in
-		"$comment_char"*|''|noop|x|"exec")
-			;;
-		*)
-			long_sha=$(git rev-list --no-walk "$sha1" 2>/dev/null)
-			printf "%s\n" "$long_sha"
-			;;
-		esac
-	done
-}
-
-# Use warn for each line in stdin
-warn_lines () {
-	while read -r line
-	do
-		warn " - $line"
-	done
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
 	GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -984,74 +894,6 @@ get_missing_commit_check_level () {
 	printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# Check if the user dropped some commits by mistake
-# Behaviour determined by rebase.missingCommitsCheck.
-# Check if there is an unrecognized command or a
-# bad SHA-1 in a command.
-check_todo_list () {
-	raise_error=f
-
-	check_level=$(get_missing_commit_check_level)
-
-	case "$check_level" in
-	warn|error)
-		# Get the SHA-1 of the commits
-		todo_list_to_sha_list <"$todo".backup >"$todo".oldsha1
-		todo_list_to_sha_list <"$todo" >"$todo".newsha1
-
-		# Sort the SHA-1 and compare them
-		sort -u "$todo".oldsha1 >"$todo".oldsha1+
-		mv "$todo".oldsha1+ "$todo".oldsha1
-		sort -u "$todo".newsha1 >"$todo".newsha1+
-		mv "$todo".newsha1+ "$todo".newsha1
-		comm -2 -3 "$todo".oldsha1 "$todo".newsha1 >"$todo".miss
-
-		# Warn about missing commits
-		if test -s "$todo".miss
-		then
-			test "$check_level" = error && raise_error=t
-
-			warn "$(gettext "\
-Warning: some commits may have been dropped accidentally.
-Dropped commits (newer to older):")"
-
-			# Make the list user-friendly and display
-			opt="--no-walk=sorted --format=oneline --abbrev-commit --stdin"
-			git rev-list $opt <"$todo".miss | warn_lines
-
-			warn "$(gettext "\
-To avoid this message, use \"drop\" to explicitly remove a commit.
-
-Use 'git config rebase.missingCommitsCheck' to change the level of warnings.
-The possible behaviours are: ignore, warn, error.")"
-			warn
-		fi
-		;;
-	ignore)
-		;;
-	*)
-		warn "$(eval_gettext "Unrecognized setting \$check_level for option rebase.missingCommitsCheck. Ignoring.")"
-		;;
-	esac
-
-	if ! check_bad_cmd_and_sha "$todo"
-	then
-		raise_error=t
-	fi
-
-	if test $raise_error = t
-	then
-		# Checkout before the first commit of the
-		# rebase: this way git rebase --continue
-		# will work correctly as it expects HEAD to be
-		# placed before the commit of the next action
-		checkout_onto
-
-		warn "$(gettext "You can fix this with 'git rebase --edit-todo'.")"
-		die "$(gettext "Or you can abort the rebase with 'git rebase --abort'.")"
-	fi
-}
-
 # The whole contents of this file is run by dot-sourcing it from
 # inside a shell function.  It used to be that "return"s we see
 # below were not inside any function, and expected to return
@@ -1315,7 +1157,11 @@ git_sequence_editor "$todo" ||
 has_action "$todo" ||
 	return 2
 
-check_todo_list
+git rebase--helper --check-todo-list || {
+	ret=$?
+	checkout_onto
+	exit $ret
+}
 
 expand_todo_ids
 
diff --git a/sequencer.c b/sequencer.c
index ee4fdb0..0c82925 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2450,3 +2450,127 @@ int transform_todo_ids(int shorten_sha1s)
 	todo_list_release(&todo_list);
 	return 0;
 }
+
+enum check_level {
+	CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
+};
+
+static enum check_level get_missing_commit_check_level(void)
+{
+	const char *value;
+
+	if (git_config_get_value("rebase.missingcommitscheck", &value) ||
+			!strcasecmp("ignore", value))
+		return CHECK_IGNORE;
+	if (!strcasecmp("warn", value))
+		return CHECK_WARN;
+	if (!strcasecmp("error", value))
+		return CHECK_ERROR;
+	warning(_("Unrecognized setting %s for option "
+			"rebase.missingCommitsCheck. Ignoring."), value);
+	return CHECK_IGNORE;
+}
+
+/*
+ * Check if the user dropped some commits by mistake
+ * Behaviour determined by rebase.missingCommitsCheck.
+ * Check if there is an unrecognized command or a
+ * bad SHA-1 in a command.
+ */
+int check_todo_list(void)
+{
+	enum check_level check_level = get_missing_commit_check_level();
+	struct strbuf todo_file = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct commit_list *missing = NULL;
+	int raise_error = 0, res = 0, fd, i;
+
+	strbuf_addstr(&todo_file, rebase_path_todo());
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("Could not open %s"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("Could not read %s."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	raise_error = res =
+		parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	if (check_level == CHECK_IGNORE)
+		goto leave_check;
+
+	/* Get the SHA-1 of the commits */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit)
+			commit->util = todo_list.items + i;
+	}
+
+	todo_list_release(&todo_list);
+	strbuf_addstr(&todo_file, ".backup");
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("Could not open %s"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("Could not read %s."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	strbuf_release(&todo_file);
+	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	/* Find commits that are missing after editing */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit && !commit->util) {
+			commit_list_insert(commit, &missing);
+			commit->util = todo_list.items + i;
+		}
+	}
+
+	/* Warn about missing commits */
+	if (!missing)
+		goto leave_check;
+
+	if (check_level == CHECK_ERROR)
+		raise_error = res = 1;
+
+	fprintf(stderr,
+		_("Warning: some commits may have been dropped accidentally.\n"
+		"Dropped commits (newer to older):\n"));
+
+	/* Make the list user-friendly and display */
+	while (missing) {
+		struct commit *commit = pop_commit(&missing);
+		struct todo_item *item = commit->util;
+
+		fprintf(stderr, " - %s %.*s\n", short_commit_name(commit),
+			item->arg_len, item->arg);
+	}
+	free_commit_list(missing);
+
+	fprintf(stderr, _("To avoid this message, use \"drop\" to "
+		"explicitly remove a commit.\n\n"
+		"Use 'git config rebase.missingCommitsCheck' to change "
+		"the level of warnings.\n"
+		"The possible behaviours are: ignore, warn, error.\n\n"));
+
+leave_check:
+	strbuf_release(&todo_file);
+	todo_list_release(&todo_list);
+
+	if (raise_error)
+		fprintf(stderr,
+			_("You can fix this with 'git rebase --edit-todo'.\n"
+			"Or you can abort the rebase with 'git rebase"
+			" --abort'.\n"));
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 5feb525..8e3daf9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -62,6 +62,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
 int transform_todo_ids(int shorten_sha1s);
+int check_todo_list(void);
 
 extern const char sign_off_header[];
 
-- 
2.9.3.windows.3




* [PATCH 7/9] rebase -i: skip unnecessary picks using the rebase--helper
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (5 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punted instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.
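
To illustrate the skipping rule, here is a minimal standalone sketch. It is
not the sequencer code: `struct pick`, `count_skippable` and the one-letter
commit ids are hypothetical names made up for this example. A leading pick
can be skipped as long as it has exactly one parent and that parent is the
commit we would currently be sitting on; root and merge commits stop the
scan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified todo entry: a commit id and its first parent. */
struct pick {
	const char *id;
	const char *parent;	/* first parent, NULL for a root commit */
	int nr_parents;
};

/*
 * Return how many leading picks can be skipped when starting from `onto`.
 * On return, *tip is the commit the rebase can start from.
 */
static size_t count_skippable(const struct pick *picks, size_t nr,
			      const char *onto, const char **tip)
{
	size_t i;

	*tip = onto;
	for (i = 0; i < nr; i++) {
		if (picks[i].nr_parents != 1)
			break;	/* root or merge commit: punt */
		if (strcmp(picks[i].parent, *tip))
			break;	/* parent differs: this pick is needed */
		*tip = picks[i].id;
	}
	return i;
}
```

With picks B (parent A), C (parent B) and E (parent D) on top of A, the
first two picks are skipped and the rebase would start from C.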

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  6 +++-
 git-rebase--interactive.sh | 41 ++-------------------
 sequencer.c                | 90 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  1 +
 4 files changed, 99 insertions(+), 39 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index e706eac..de3ccd9 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -31,6 +31,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_CMDMODE(0, "check-todo-list", &command,
 			N_("check the todo list"), CHECK_TODO_LIST),
+		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
+			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
 		OPT_END()
 	};
 
@@ -55,5 +57,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(0);
 	if (command == CHECK_TODO_LIST && argc == 1)
 		return !!check_todo_list();
+	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
+		return !!skip_unnecessary_picks();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 02a7698..a34ebdc 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -703,43 +703,6 @@ do_rest () {
 	done
 }
 
-# skip picking commits whose parents are unchanged
-skip_unnecessary_picks () {
-	fd=3
-	while read -r command rest
-	do
-		# fd=3 means we skip the command
-		case "$fd,$command" in
-		3,pick|3,p)
-			# pick a commit whose parent is current $onto -> skip
-			sha1=${rest%% *}
-			case "$(git rev-parse --verify --quiet "$sha1"^)" in
-			"$onto"*)
-				onto=$sha1
-				;;
-			*)
-				fd=1
-				;;
-			esac
-			;;
-		3,"$comment_char"*|3,)
-			# copy comments
-			;;
-		*)
-			fd=1
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest" >&$fd
-	done <"$todo" >"$todo.new" 3>>"$done" &&
-	mv -f "$todo".new "$todo" &&
-	case "$(peek_next_command)" in
-	squash|s|fixup|f)
-		record_in_rewritten "$onto"
-		;;
-	esac ||
-		die "$(gettext "Could not skip unnecessary pick commands")"
-}
-
 transform_todo_ids () {
 	while read -r command rest
 	do
@@ -1165,7 +1128,9 @@ git rebase--helper --check-todo-list || {
 
 expand_todo_ids
 
-test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
+test -d "$rewritten" || test -n "$force_rebase" ||
+onto="$(git rebase--helper --skip-unnecessary-picks)" ||
+die "Could not skip unnecessary pick commands"
 
 checkout_onto
 if test -z "$rebase_root" && test ! -d "$rewritten"
diff --git a/sequencer.c b/sequencer.c
index 0c82925..6cc94c9 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2574,3 +2574,93 @@ int check_todo_list(void)
 
 	return res;
 }
+
+/* skip picking commits whose parents are unchanged */
+int skip_unnecessary_picks(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct strbuf buf = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct object_id onto_oid, *oid = &onto_oid, *parent_oid;
+	int fd, i;
+
+	if (!read_oneliner(&buf, rebase_path_onto(), 0))
+		return error("Could not read 'onto'");
+	if (get_sha1(buf.buf, onto_oid.hash)) {
+		strbuf_release(&buf);
+		return error("Need a HEAD to fixup");
+	}
+	strbuf_release(&buf);
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0) {
+		return error_errno(_("Could not open '%s'"), todo_file);
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("Could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+
+		if (item->command >= TODO_NOOP)
+			continue;
+		if (item->command != TODO_PICK)
+			break;
+		if (parse_commit(item->commit)) {
+			todo_list_release(&todo_list);
+			return error(_("Could not parse commit '%s'"),
+				oid_to_hex(&item->commit->object.oid));
+		}
+		if (!item->commit->parents)
+			break; /* root commit */
+		if (item->commit->parents->next)
+			break; /* merge commit */
+		parent_oid = &item->commit->parents->item->object.oid;
+		if (hashcmp(parent_oid->hash, oid->hash))
+			break;
+		oid = &item->commit->object.oid;
+	}
+	if (i > 0) {
+		int offset = i < todo_list.nr ?
+			todo_list.items[i].offset_in_buf : todo_list.buf.len;
+		const char *done_path = rebase_path_done();
+
+		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
+		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("Could not write to '%s'"),
+				done_path);
+		}
+		close(fd);
+
+		fd = open(rebase_path_todo(), O_WRONLY, 0666);
+		if (write_in_full(fd, todo_list.buf.buf + offset,
+				todo_list.buf.len - offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("Could not write to '%s'"),
+				rebase_path_todo());
+		}
+		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("Could not truncate '%s'"),
+				rebase_path_todo());
+		}
+		close(fd);
+
+		todo_list.current = i;
+		if (is_fixup(peek_command(&todo_list, 0)))
+			record_in_rewritten(oid, peek_command(&todo_list, 0));
+	}
+
+	todo_list_release(&todo_list);
+	printf("%s\n", oid_to_hex(oid));
+
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 8e3daf9..4da215c 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -63,6 +63,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
+int skip_unnecessary_picks(void);
 
 extern const char sign_off_header[];
 
-- 
2.9.3.windows.3




* [PATCH 8/9] t3415: test fixup with wrapped oneline
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (6 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-02 16:23 ` [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.
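
For readers unfamiliar with the unwrapping, a rough sketch of the idea
follows. It is an assumption-laden illustration, not Git's own
implementation: `unwrap_subject` is a hypothetical helper that joins the
lines of the first paragraph of a commit message with single spaces, so
that a subject wrapped as "To", newline, "fixup" becomes "To fixup".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the first paragraph of `msg` into `out`, joining its lines with a
 * single space. Stops at the first empty line or at the end of the message.
 * Returns the number of bytes written (excluding the NUL); the result is
 * truncated if `out` is too small.
 */
static size_t unwrap_subject(const char *msg, char *out, size_t outlen)
{
	size_t n = 0;

	if (!outlen)
		return 0;
	while (*msg && n + 1 < outlen) {
		if (*msg == '\n') {
			if (msg[1] == '\n' || !msg[1])
				break;	/* end of the first paragraph */
			out[n++] = ' ';
		} else {
			out[n++] = *msg;
		}
		msg++;
	}
	out[n] = '\0';
	return n;
}
```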

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 48346f1..9fd629a 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -304,4 +304,18 @@ test_expect_success 'extra spaces after fixup!' '
 	test $base = $parent
 '
 
+test_expect_success 'wrapped original subject' '
+	if test -d .git/rebase-merge; then git rebase --abort; fi &&
+	base=$(git rev-parse HEAD) &&
+	echo "wrapped subject" >wrapped &&
+	git add wrapped &&
+	test_tick &&
+	git commit --allow-empty -m "$(printf "To\nfixup")" &&
+	test_tick &&
+	git commit --allow-empty -m "fixup! To fixup" &&
+	git rebase -i --autosquash --keep-empty HEAD~2 &&
+	parent=$(git rev-parse HEAD^) &&
+	test $base = $parent
+'
+
 test_done
-- 
2.9.3.windows.3




* [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (7 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
@ 2016-09-02 16:23 ` Johannes Schindelin
  2016-09-03 18:03   ` Josh Triplett
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-02 16:23 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case. Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c     |   6 +-
 git-rebase--interactive.sh   |  90 +-------------------
 sequencer.c                  | 197 +++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 5 files changed, 205 insertions(+), 91 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index de3ccd9..e6591f0 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
 			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+		OPT_CMDMODE(0, "rearrange-squash", &command,
+			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
 		OPT_END()
 	};
 
@@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!check_todo_list();
 	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
 		return !!skip_unnecessary_picks();
+	if (command == REARRANGE_SQUASH && argc == 1)
+		return !!rearrange_squash();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index a34ebdc..68a6e6a 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -734,94 +734,6 @@ collapse_todo_ids() {
 	git rebase--helper --shorten-sha1s
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-	format=$(git config --get rebase.instructionFormat)
-	# extract fixup!/squash! lines and resolve any referenced sha1's
-	while read -r pick sha1 message
-	do
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		case "$message" in
-		"squash! "*|"fixup! "*)
-			action="${message%%!*}"
-			rest=$message
-			prefix=
-			# skip all squash! or fixup! (but save for later)
-			while :
-			do
-				case "$rest" in
-				"squash! "*|"fixup! "*)
-					prefix="$prefix${rest%%!*},"
-					rest="${rest#*! }"
-					;;
-				*)
-					break
-					;;
-				esac
-			done
-			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
-			# if it's a single word, try to resolve to a full sha1 and
-			# emit a second copy. This allows us to match on both message
-			# and on sha1 prefix
-			if test "${rest#* }" = "$rest"; then
-				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
-				if test -n "$fullsha"; then
-					# prefix the action to uniquely identify this line as
-					# intended for full sha1 match
-					echo "$sha1 +$action $prefix $fullsha"
-				fi
-			fi
-		esac
-	done >"$1.sq" <"$1"
-	test -s "$1.sq" || return
-
-	used=
-	while read -r pick sha1 message
-	do
-		case " $used" in
-		*" $sha1 "*) continue ;;
-		esac
-		printf '%s\n' "$pick $sha1 $message"
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		used="$used$sha1 "
-		while read -r squash action msg_prefix msg_content
-		do
-			case " $used" in
-			*" $squash "*) continue ;;
-			esac
-			emit=0
-			case "$action" in
-			+*)
-				action="${action#+}"
-				# full sha1 prefix test
-				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
-			*)
-				# message prefix test
-				case "$message" in "$msg_content"*) emit=1;; esac ;;
-			esac
-			if test $emit = 1; then
-				if test -n "${format}"
-				then
-					msg_content=$(git log -n 1 --format="${format}" ${squash})
-				else
-					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
-				fi
-				printf '%s\n' "$action $squash $msg_content"
-				used="$used$squash "
-			fi
-		done <"$1.sq"
-	done >"$1.rearranged" <"$1"
-	cat "$1.rearranged" >"$1"
-	rm -f "$1.sq" "$1.rearranged"
-}
-
 # Add commands after a pick or after a squash/fixup serie
 # in the todo list.
 add_exec_commands () {
@@ -1084,7 +996,7 @@ then
 fi
 
 test -s "$todo" || echo noop >> "$todo"
-test -n "$autosquash" && rearrange_squash "$todo"
+test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
 test -n "$cmd" && add_exec_commands "$todo"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
diff --git a/sequencer.c b/sequencer.c
index 6cc94c9..4b50a4c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -18,6 +18,7 @@
 #include "quote.h"
 #include "log-tree.h"
 #include "wt-status.h"
+#include "hashmap.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2664,3 +2665,199 @@ int skip_unnecessary_picks(void)
 
 	return 0;
 }
+
+struct subject2item_entry {
+	struct hashmap_entry entry;
+	int i;
+	char subject[FLEX_ARRAY];
+};
+
+static int subject2item_cmp(const struct subject2item_entry *a,
+	const struct subject2item_entry *b, const void *key)
+{
+	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
+}
+
+/*
+ * Rearrange the todo list that has both "pick sha1 msg" and "pick sha1
+ * fixup!/squash! msg" in it so that the latter is put immediately after the
+ * former, and change "pick" to "fixup"/"squash".
+ *
+ * Note that if the config has specified a custom instruction format, each log
+ * message will have to be retrieved from the commit (as the oneline in the
+ * script cannot be trusted) in order to normalize the autosquash arrangement.
+ */
+int rearrange_squash(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct hashmap subject2item;
+	int res = 0, rearranged = 0, *next, *tail, fd, i;
+	char **subjects;
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("Could not open %s"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("Could not read %s."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	/*
+	 * The hashmap maps onelines to the respective todo list index.
+	 *
+	 * If any items need to be rearranged, the next[i] value will indicate
+	 * which item was moved directly after the i'th.
+	 *
+	 * In that case, tail[i] will indicate the index of the latest item to
+	 * be moved to appear after the i'th.
+	 */
+	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
+		     todo_list.nr);
+	ALLOC_ARRAY(next, todo_list.nr);
+	ALLOC_ARRAY(tail, todo_list.nr);
+	ALLOC_ARRAY(subjects, todo_list.nr);
+	for (i = 0; i < todo_list.nr; i++) {
+		struct strbuf buf = STRBUF_INIT;
+		struct todo_item *item = todo_list.items + i;
+		const char *commit_buffer, *subject, *p;
+		int i2 = -1;
+		struct subject2item_entry *entry;
+
+		next[i] = tail[i] = -1;
+		if (item->command >= TODO_EXEC) {
+			subjects[i] = NULL;
+			continue;
+		}
+
+		if (is_fixup(item->command)) {
+			todo_list_release(&todo_list);
+			return error(_("The script was already rearranged."));
+		}
+
+		item->commit->util = item;
+
+		parse_commit(item->commit);
+		commit_buffer = get_commit_buffer(item->commit, NULL);
+		find_commit_subject(commit_buffer, &subject);
+		format_subject(&buf, subject, " ");
+		subject = subjects[i] = buf.buf;
+		unuse_commit_buffer(item->commit, commit_buffer);
+		if ((skip_prefix(subject, "fixup! ", &p) ||
+		     skip_prefix(subject, "squash! ", &p))) {
+			struct commit *commit2;
+
+			for (;;) {
+				while (isspace(*p))
+					p++;
+				if (!skip_prefix(p, "fixup! ", &p) &&
+				    !skip_prefix(p, "squash! ", &p))
+					break;
+			}
+
+			if ((entry = hashmap_get_from_hash(&subject2item,
+							   strhash(p), p)))
+				/* found by title */
+				i2 = entry->i;
+			else if (!strchr(p, ' ') &&
+				 (commit2 =
+				  lookup_commit_reference_by_name(p)) &&
+				 commit2->util)
+				/* found by commit name */
+				i2 = (struct todo_item *)commit2->util
+					- todo_list.items;
+			else {
+				/* copy can be a prefix of the commit subject */
+				for (i2 = 0; i2 < i; i2++)
+					if (subjects[i2] &&
+					    starts_with(subjects[i2], p))
+						break;
+				if (i2 == i)
+					i2 = -1;
+			}
+		}
+		if (i2 >= 0) {
+			rearranged = 1;
+			todo_list.items[i].command =
+				starts_with(subject, "fixup!") ?
+				TODO_FIXUP : TODO_SQUASH;
+			if (next[i2] < 0)
+				next[i2] = i;
+			else
+				next[tail[i2]] = i;
+			tail[i2] = i;
+		}
+		else if (!hashmap_get_from_hash(&subject2item,
+						strhash(subject), subject)) {
+			FLEX_ALLOC_MEM(entry, subject, buf.buf, buf.len);
+			entry->i = i;
+			hashmap_entry_init(entry, strhash(entry->subject));
+			hashmap_put(&subject2item, entry);
+		}
+		strbuf_detach(&buf, NULL);
+	}
+
+	if (rearranged) {
+		struct strbuf buf = STRBUF_INIT;
+		char *format = NULL;
+
+		git_config_get_string("rebase.instructionFormat", &format);
+		for (i = 0; i < todo_list.nr; i++) {
+			enum todo_command command = todo_list.items[i].command;
+			int cur = i;
+
+			/*
+			 * Initially, all commands are 'pick's. If it is a
+			 * fixup or a squash now, we have rearranged it.
+			 */
+			if (is_fixup(command))
+				continue;
+
+			while (cur >= 0) {
+				int offset = todo_list.items[cur].offset_in_buf;
+				int end_offset = cur + 1 < todo_list.nr ?
+					todo_list.items[cur + 1].offset_in_buf :
+					todo_list.buf.len;
+				char *bol = todo_list.buf.buf + offset;
+				char *eol = todo_list.buf.buf + end_offset;
+
+				/* replace 'pick' by 'fixup' or 'squash' */
+				command = todo_list.items[cur].command;
+				if (is_fixup(command)) {
+					strbuf_addstr(&buf,
+						todo_command_info[command].str);
+					bol += strcspn(bol, " \t");
+				}
+
+				strbuf_add(&buf, bol, eol - bol);
+
+				cur = next[cur];
+			}
+		}
+
+		fd = open(todo_file, O_WRONLY);
+		if (fd < 0)
+			res |= error_errno(_("Could not open %s"), todo_file);
+		else if (write(fd, buf.buf, buf.len) < 0)
+			res |= error_errno(_("Could not write %s."), todo_file);
+		else if (ftruncate(fd, buf.len) < 0)
+			res |= error_errno(_("Could not finish %s"), todo_file);
+		close(fd);
+		strbuf_release(&buf);
+	}
+
+	free(next);
+	free(tail);
+	for (i = 0; i < todo_list.nr; i++)
+		free(subjects[i]);
+	free(subjects);
+	hashmap_free(&subject2item, 1);
+	todo_list_release(&todo_list);
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 4da215c..a3aa3cb 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -64,6 +64,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
+int rearrange_squash(void);
 
 extern const char sign_off_header[];
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 9fd629a..b9e2600 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -278,7 +278,7 @@ set_backup_editor () {
 	test_set_editor "$PWD/backup-editor.sh"
 }
 
-test_expect_failure 'autosquash with multiple empty patches' '
+test_expect_success 'autosquash with multiple empty patches' '
 	test_tick &&
 	git commit --allow-empty -m "empty" &&
 	test_tick &&
-- 
2.9.3.windows.3


* Re: [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2016-09-02 16:23 ` [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2016-09-02 20:56   ` Dennis Kaarsemaker
  2016-09-03  7:01     ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Dennis Kaarsemaker @ 2016-09-02 20:56 UTC (permalink / raw)
  To: Johannes Schindelin, git; +Cc: Junio C Hamano

On vr, 2016-09-02 at 18:23 +0200, Johannes Schindelin wrote:
> This is crucial to improve performance on Windows, as the speed is now
> mostly dominated by the SHA-1 transformation (because it spawns a new
> rev-parse process for *every* line, and spawning processes is pretty
> slow from Git for Windows' MSYS2 Bash).

I see these functions only used as part of a shorten-edit-expand
sequence. Why not do a git rebase--helper --edit-todo instead? Saves
another few process spawnings.

Something for yet another later followup patch?

D.


* Re: [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper
  2016-09-02 16:23 ` [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2016-09-02 20:59   ` Dennis Kaarsemaker
  0 siblings, 0 replies; 100+ messages in thread
From: Dennis Kaarsemaker @ 2016-09-02 20:59 UTC (permalink / raw)
  To: Johannes Schindelin, git; +Cc: Junio C Hamano

On vr, 2016-09-02 at 18:23 +0200, Johannes Schindelin wrote:
> In particular on Windows, where shell scripts are even more expensive
> than on MacOSX or Linux, it makes sense to move a loop that forks
> Git at least once for every line in the todo list into a builtin.

Heh, this was the one thing that made me hesitate sending the
suggestion about rebase--helper --edit-todo, but with this bit already
moved, I think rebase--helper --edit-todo makes even more sense to do.

D.


* Re: [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2016-09-02 20:56   ` Dennis Kaarsemaker
@ 2016-09-03  7:01     ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-03  7:01 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git, Junio C Hamano

Hi Dennis,

On Fri, 2 Sep 2016, Dennis Kaarsemaker wrote:

> On vr, 2016-09-02 at 18:23 +0200, Johannes Schindelin wrote:
> > This is crucial to improve performance on Windows, as the speed is now
> > mostly dominated by the SHA-1 transformation (because it spawns a new
> > rev-parse process for *every* line, and spawning processes is pretty
> > slow from Git for Windows' MSYS2 Bash).
> 
> I see these functions only used as part of a shorten-edit-expand
> sequence. Why not do a git rebase--helper --edit-todo instead? Saves
> another few process spawnings.

It would make sense to consolidate the steps, yes. I just tried to be
careful in my incremental approach and am fairly certain about the current
revision faithfully replicating the previous behavior.

> Something for yet another later followup patch?

Sure. Probably more than one patch, though: I could imagine that a minor
refactoring would allow us to read in the todo script once, then apply the
individual processing steps in-memory, and then write out the new todo
script.

And then we can implement an --edit-todo with an optional --initial flag
that triggers the check for validity and the rearranging of the
fixup/squash commands (when the user calls `git rebase --edit-todo`,
neither of those steps should be run).

Maybe you will want to have a look into that while I am mostly offline?

Ciao,
Dscho


* Re: [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2016-09-02 16:23 ` [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2016-09-03 18:03   ` Josh Triplett
  2016-09-04  6:47     ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Josh Triplett @ 2016-09-03 18:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano

On Fri, Sep 02, 2016 at 06:23:42PM +0200, Johannes Schindelin wrote:
> Let's reimplement this with linear complexity (using a hash map to
> match the commits' subject lines) for the common case. Sadly, the
> fixup/squash feature's design neglected performance considerations,
> allowing arbitrary prefixes (read: `fixup! hell` will match the
> commit subject `hello world`), which means that we are stuck with
> quadratic performance in the worst case.

If the performance of that case matters enough, we can do better than
quadratic complexity: maintain a trie of the subjects, allowing prefix
lookups.  (Or hash all the prefixes, which you can do in linear time on
a string: hash next char, save hash, repeat.)  However, that would
pessimize the normal case of either a complete subject or a sha1, due to
the extra time taken constructing the data structure.  Probably not
worth it, if you assume that most "fixup!" subjects come from `git
commit --fixup` or similar automated means.
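
The "hash next char, save hash, repeat" idea can be sketched as follows.
This is only an illustration under assumptions of my own: it uses FNV-1a
as the incremental hash, and `hash_string` and `hash_prefixes` are
hypothetical helpers, not anything in Git. Because the hash is updated one
byte at a time, every prefix hash of a subject falls out of a single pass
over the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNV_OFFSET 2166136261u
#define FNV_PRIME 16777619u

/* FNV-1a hash of a whole NUL-terminated string. */
static uint32_t hash_string(const char *s)
{
	uint32_t h = FNV_OFFSET;

	for (; *s; s++)
		h = (h ^ (unsigned char)*s) * FNV_PRIME;
	return h;
}

/*
 * Store the hash of every non-empty prefix of `s` in `out`: out[k] is the
 * hash of the first k + 1 bytes. Writes at most `max` hashes and returns
 * how many were written.
 */
static size_t hash_prefixes(const char *s, uint32_t *out, size_t max)
{
	uint32_t h = FNV_OFFSET;
	size_t k;

	for (k = 0; s[k] && k < max; k++) {
		h = (h ^ (unsigned char)s[k]) * FNV_PRIME;
		out[k] = h;
	}
	return k;
}
```

Each saved hash would then be inserted into the lookup table together with
the todo index, so that `fixup! hell` can find `hello world` by hashing
`hell` once. The cost is one table entry per subject byte, which is the
pessimization of the normal case mentioned above.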


* Re: [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2016-09-03 18:03   ` Josh Triplett
@ 2016-09-04  6:47     ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2016-09-04  6:47 UTC (permalink / raw)
  To: Josh Triplett; +Cc: git, Junio C Hamano

Hi Josh,

On Sat, 3 Sep 2016, Josh Triplett wrote:

> On Fri, Sep 02, 2016 at 06:23:42PM +0200, Johannes Schindelin wrote:
> > Let's reimplement this with linear complexity (using a hash map to
> > match the commits' subject lines) for the common case. Sadly, the
> > fixup/squash feature's design neglected performance considerations,
> > allowing arbitrary prefixes (read: `fixup! hell` will match the
> > commit subject `hello world`), which means that we are stuck with
> > quadratic performance in the worst case.
> 
> If the performance of that case matters enough, we can do better than
> quadratic complexity: maintain a trie of the subjects, allowing prefix
> lookups.  (Or hash all the prefixes, which you can do in linear time on
> a string: hash next char, save hash, repeat.)  However, that would
> pessimize the normal case of either a complete subject or a sha1, due to
> the extra time taken constructing the data structure.  Probably not
> worth it, if you assume that most "fixup!" subjects come from `git
> commit --fixup` or similar automated means.

Right. My reaction to finding out that subject prefixes were allowed, too,
was "WTF?". And then: who uses that? And then: that's gonna hurt
performance! And then: but I can optimize for the common case!

The point is: only when people specify a strict prefix will the
performance be hurt. Meaning that the performance is linear in the most
common cases.
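
A simplified sketch of that two-tier lookup, for illustration only: the
tiny open-addressing table below stands in for Git's hashmap, and
`subject_table`, `table_add`, `table_find` and `find_target` are
hypothetical names that do not appear in the patch. An exact subject match
is answered by the table; only a needle that is a strict prefix falls
through to the linear scan over the subjects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 64	/* assumed capacity; must exceed the number of subjects */

struct subject_table {
	const char *subjects[TABLE_SIZE];	/* NULL marks an empty slot */
	int index[TABLE_SIZE];
};

static unsigned hash_str(const char *s)
{
	unsigned h = 5381;

	for (; *s; s++)
		h = h * 33 + (unsigned char)*s;
	return h;
}

/* Remember the first todo index seen for a subject. */
static void table_add(struct subject_table *t, const char *subject, int i)
{
	unsigned slot = hash_str(subject) % TABLE_SIZE;

	while (t->subjects[slot]) {
		if (!strcmp(t->subjects[slot], subject))
			return;	/* keep the first occurrence */
		slot = (slot + 1) % TABLE_SIZE;
	}
	t->subjects[slot] = subject;
	t->index[slot] = i;
}

static int table_find(const struct subject_table *t, const char *subject)
{
	unsigned slot = hash_str(subject) % TABLE_SIZE;

	while (t->subjects[slot]) {
		if (!strcmp(t->subjects[slot], subject))
			return t->index[slot];
		slot = (slot + 1) % TABLE_SIZE;
	}
	return -1;
}

/*
 * Find the target of "fixup! <needle>" among subjects[0..nr-1]: exact match
 * through the table first (fast path), prefix scan as fallback (slow path).
 */
static int find_target(const struct subject_table *t,
		       const char **subjects, int nr, const char *needle)
{
	int i = table_find(t, needle);

	if (i >= 0)
		return i;
	for (i = 0; i < nr; i++)
		if (!strncmp(subjects[i], needle, strlen(needle)))
			return i;
	return -1;
}
```

With the subjects "hello world", "hello" and "add tests", the needle
"hello" is resolved by the table, while "hell" only matches "hello world"
through the fallback scan.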

That is good enough for me, and probably good enough for the vast majority
of the users. If it ain't broke, don't fix it.

In the case that somebody needs strict prefixes to be handled more
efficiently, which I do not expect, the "hash all prefixes" approach may
work well, but it would slow down the common case, so I'd suggest doing
that only as a fallback (i.e. if a fixup! could not be matched up, fall
back to hashing the prefixes, re-hashing the commit subjects that were
already seen so far). If this needs to be implemented at all, I would also
suggest that the person in need of that improvement also needs to take
charge of this: I will not spend more time thinking about this.

Ciao,
Johannes
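
For readers following along, here is a minimal sketch of the "exact match
first, prefix scan only as a fallback" idea discussed above. It is not the
code from the patch: the table layout, the hash function and all names
(subject_table, table_add, find_fixup_target) are hypothetical and chosen
only for illustration. The sketch assumes the table is zero-initialized and
that fewer than TABLE_SIZE subjects are stored.

```c
#include <string.h>

#define TABLE_SIZE 64	/* power of two; must exceed the number of subjects */

/* Hypothetical open-addressing table mapping a subject to a todo index */
struct subject_table {
	const char *subject[TABLE_SIZE];
	int index[TABLE_SIZE];
};

/* FNV-1a string hash; any reasonable string hash would do here */
static unsigned int hash_str(const char *s)
{
	unsigned int h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Remember the first commit seen with a given subject */
static void table_add(struct subject_table *t, const char *subject, int i)
{
	unsigned int slot = hash_str(subject) & (TABLE_SIZE - 1);

	while (t->subject[slot]) {
		if (!strcmp(t->subject[slot], subject))
			return;	/* keep the earliest commit */
		slot = (slot + 1) & (TABLE_SIZE - 1);
	}
	t->subject[slot] = subject;
	t->index[slot] = i;
}

/* Exact lookup in expected constant time; -1 if the subject is unknown */
static int table_find(const struct subject_table *t, const char *subject)
{
	unsigned int slot = hash_str(subject) & (TABLE_SIZE - 1);

	while (t->subject[slot]) {
		if (!strcmp(t->subject[slot], subject))
			return t->index[slot];
		slot = (slot + 1) & (TABLE_SIZE - 1);
	}
	return -1;
}

/*
 * Find the commit that "fixup! <needle>" refers to among the first nr
 * subjects. The common case (complete subject) is one hash lookup; only
 * a strict prefix pays for the linear scan.
 */
static int find_fixup_target(const struct subject_table *t,
			     const char **subjects, int nr,
			     const char *needle)
{
	int i = table_find(t, needle);
	size_t len = strlen(needle);

	if (i >= 0)
		return i;
	for (i = 0; i < nr; i++)
		if (!strncmp(subjects[i], needle, len))
			return i;
	return -1;
}
```

With the subjects "hello world", "add tests" and "hello" added in that
order, looking up "hello" hits the exact entry, while "hell" falls through
to the scan and resolves to "hello world". The scan is what makes the worst
case quadratic when every fixup! line uses a strict prefix.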


* [PATCH v2 0/9] The final building block for a faster rebase -i
  2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
                   ` (8 preceding siblings ...)
  2016-09-02 16:23 ` [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-04-25 13:51 ` Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
                     ` (10 more replies)
  9 siblings, 11 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:51 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

This patch series reimplements the expensive pre- and post-processing of
the todo script in C.

And it concludes the work I did to accelerate rebase -i.

Changes since v1:

- rebased to the current `master` (which advanced quite a bit since v1
  was sent to the list on September 2nd, 2016)

- downcased error messages

- marked error messages for translation

- moved an "else" onto the same line as the preceding closing brace

- described the priorities of `fixup! <something>` in the man page, as
  suggested by Philip Oakley


Johannes Schindelin (9):
  rebase -i: generate the script via rebase--helper
  rebase -i: remove useless indentation
  rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  t3404: relax rebase.missingCommitsCheck tests
  rebase -i: check for missing commits in the rebase--helper
  rebase -i: skip unnecessary picks using the rebase--helper
  t3415: test fixup with wrapped oneline
  rebase -i: rearrange fixup/squash lines using the rebase--helper

 Documentation/git-rebase.txt  |  16 +-
 builtin/rebase--helper.c      |  29 ++-
 git-rebase--interactive.sh    | 362 ++++-------------------------
 sequencer.c                   | 515 ++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                   |   8 +
 t/t3404-rebase-interactive.sh |  22 +-
 t/t3415-rebase-autosquash.sh  |  16 +-
 7 files changed, 625 insertions(+), 343 deletions(-)


base-commit: e2cb6ab84c94f147f1259260961513b40c36108a
Based-On: rebase--helper at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v2
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v2

Interdiff vs v1:

 diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
 index 67d48e68831..da79fbda5b3 100644
 --- a/Documentation/git-rebase.txt
 +++ b/Documentation/git-rebase.txt
 @@ -425,13 +425,15 @@ without an explicit `--interactive`.
  --autosquash::
  --no-autosquash::
  	When the commit log message begins with "squash! ..." (or
 -	"fixup! ..."), and there is a commit whose title begins with
 -	the same ..., automatically modify the todo list of rebase -i
 -	so that the commit marked for squashing comes right after the
 -	commit to be modified, and change the action of the moved
 -	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
 -	"fixup! " or "squash! " after the first, in case you referred to an
 -	earlier fixup/squash with `git commit --fixup/--squash`.
 +	"fixup! ..."), and there is already a commit in the todo list that
 +	matches the same `...`, automatically modify the todo list of rebase
 +	-i so that the commit marked for squashing comes right after the
 +	commit to be modified, and change the action of the moved commit
 +	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
 +	the commit subject matches, or if the `...` refers to the commit's
 +	hash. As a fall-back, partial matches of the commit subject work,
 +	too.  The recommended way to create fixup/squash commits is by using
 +	the `--fixup`/`--squash` options of linkgit:git-commit[1].
  +
  This option is only valid when the `--interactive` option is used.
  +
 diff --git a/sequencer.c b/sequencer.c
 index 5502b9006ed..2b07fb9e0ce 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -2416,10 +2416,10 @@ int sequencer_make_script(int keep_empty, FILE *out,
  	pp.output_encoding = get_log_output_encoding();
  
  	if (setup_revisions(argc, argv, &revs, NULL) > 1)
 -		return error("make_script: unhandled options");
 +		return error(_("make_script: unhandled options"));
  
  	if (prepare_revision_walk(&revs) < 0)
 -		return error("make_script: error preparing revisions");
 +		return error(_("make_script: error preparing revisions"));
  
  	while ((commit = get_revision(&revs))) {
  		strbuf_reset(&buf);
 @@ -2445,23 +2445,23 @@ int transform_todo_ids(int shorten_sha1s)
  	strbuf_reset(&todo_list.buf);
  	fd = open(todo_file, O_RDONLY);
  	if (fd < 0)
 -		return error_errno(_("Could not open '%s'"), todo_file);
 +		return error_errno(_("could not open '%s'"), todo_file);
  	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
  		close(fd);
 -		return error(_("Could not read %s."), todo_file);
 +		return error(_("could not read '%s'."), todo_file);
  	}
  	close(fd);
  
  	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
  	if (res) {
  		todo_list_release(&todo_list);
 -		return error(_("Unusable instruction sheet: %s"), todo_file);
 +		return error(_("unusable instruction sheet: '%s'"), todo_file);
  	}
  
  	out = fopen(todo_file, "w");
  	if (!out) {
  		todo_list_release(&todo_list);
 -		return error(_("Unable to open '%s' for writing"), todo_file);
 +		return error(_("unable to open '%s' for writing"), todo_file);
  	}
  	for (i = 0; i < todo_list.nr; i++) {
  		struct todo_item *item = todo_list.items + i;
 @@ -2508,8 +2508,8 @@ static enum check_level get_missing_commit_check_level(void)
  		return CHECK_WARN;
  	if (!strcasecmp("error", value))
  		return CHECK_ERROR;
 -	warning(_("Unrecognized setting $check_level for option"
 -			"rebase.missingCommitsCheck. Ignoring."));
 +	warning(_("unrecognized setting %s for option"
 +		  "rebase.missingCommitsCheck. Ignoring."), value);
  	return CHECK_IGNORE;
  }
  
 @@ -2530,12 +2530,12 @@ int check_todo_list(void)
  	strbuf_addstr(&todo_file, rebase_path_todo());
  	fd = open(todo_file.buf, O_RDONLY);
  	if (fd < 0) {
 -		res = error_errno(_("Could not open %s"), todo_file.buf);
 +		res = error_errno(_("could not open '%s'"), todo_file.buf);
  		goto leave_check;
  	}
  	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
  		close(fd);
 -		res = error(_("Could not read %s."), todo_file.buf);
 +		res = error(_("could not read '%s'."), todo_file.buf);
  		goto leave_check;
  	}
  	close(fd);
 @@ -2556,12 +2556,12 @@ int check_todo_list(void)
  	strbuf_addstr(&todo_file, ".backup");
  	fd = open(todo_file.buf, O_RDONLY);
  	if (fd < 0) {
 -		res = error_errno(_("Could not open %s"), todo_file.buf);
 +		res = error_errno(_("could not open '%s'"), todo_file.buf);
  		goto leave_check;
  	}
  	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
  		close(fd);
 -		res = error(_("Could not read %s."), todo_file.buf);
 +		res = error(_("could not read '%s'."), todo_file.buf);
  		goto leave_check;
  	}
  	close(fd);
 @@ -2610,9 +2610,10 @@ int check_todo_list(void)
  
  	if (raise_error)
  		fprintf(stderr,
 -			_("You can fix this with 'git rebase --edit-todo'.\n"
 -			"Or you can abort the rebase with 'git rebase"
 -			" --abort'.\n"));
 +			_("You can fix this with 'git rebase --edit-todo' "
 +			  "and then run 'git rebase --continue'.\n"
 +			  "Or you can abort the rebase with 'git rebase"
 +			  " --abort'.\n"));
  
  	return res;
  }
 @@ -2627,20 +2628,20 @@ int skip_unnecessary_picks(void)
  	int fd, i;
  
  	if (!read_oneliner(&buf, rebase_path_onto(), 0))
 -		return error("Could not read 'onto'");
 +		return error(_("could not read 'onto'"));
  	if (get_sha1(buf.buf, onto_oid.hash)) {
  		strbuf_release(&buf);
 -		return error("Need a HEAD to fixup");
 +		return error(_("need a HEAD to fixup"));
  	}
  	strbuf_release(&buf);
  
  	fd = open(todo_file, O_RDONLY);
  	if (fd < 0) {
 -		return error_errno(_("Could not open '%s'"), todo_file);
 +		return error_errno(_("could not open '%s'"), todo_file);
  	}
  	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
  		close(fd);
 -		return error(_("Could not read '%s'."), todo_file);
 +		return error(_("could not read '%s'."), todo_file);
  	}
  	close(fd);
  	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
 @@ -2657,7 +2658,7 @@ int skip_unnecessary_picks(void)
  			break;
  		if (parse_commit(item->commit)) {
  			todo_list_release(&todo_list);
 -			return error(_("Could not parse commit '%s'"),
 +			return error(_("could not parse commit '%s'"),
  				oid_to_hex(&item->commit->object.oid));
  		}
  		if (!item->commit->parents)
 @@ -2677,7 +2678,7 @@ int skip_unnecessary_picks(void)
  		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
  		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
  			todo_list_release(&todo_list);
 -			return error_errno(_("Could not write to '%s'"),
 +			return error_errno(_("could not write to '%s'"),
  				done_path);
  		}
  		close(fd);
 @@ -2686,12 +2687,12 @@ int skip_unnecessary_picks(void)
  		if (write_in_full(fd, todo_list.buf.buf + offset,
  				todo_list.buf.len - offset) < 0) {
  			todo_list_release(&todo_list);
 -			return error_errno(_("Could not write to '%s'"),
 +			return error_errno(_("could not write to '%s'"),
  				rebase_path_todo());
  		}
  		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
  			todo_list_release(&todo_list);
 -			return error_errno(_("Could not truncate '%s'"),
 +			return error_errno(_("could not truncate '%s'"),
  				rebase_path_todo());
  		}
  		close(fd);
 @@ -2738,10 +2739,10 @@ int rearrange_squash(void)
  
  	fd = open(todo_file, O_RDONLY);
  	if (fd < 0)
 -		return error_errno(_("Could not open %s"), todo_file);
 +		return error_errno(_("could not open '%s'"), todo_file);
  	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
  		close(fd);
 -		return error(_("Could not read %s."), todo_file);
 +		return error(_("could not read '%s'."), todo_file);
  	}
  	close(fd);
  	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
 @@ -2778,7 +2779,7 @@ int rearrange_squash(void)
  
  		if (is_fixup(item->command)) {
  			todo_list_release(&todo_list);
 -			return error(_("The script was already rearranged."));
 +			return error(_("the script was already rearranged."));
  		}
  
  		item->commit->util = item;
 @@ -2832,8 +2833,7 @@ int rearrange_squash(void)
  			else
  				next[tail[i2]] = i;
  			tail[i2] = i;
 -		}
 -		else if (!hashmap_get_from_hash(&subject2item,
 +		} else if (!hashmap_get_from_hash(&subject2item,
  						strhash(subject), subject)) {
  			FLEX_ALLOC_MEM(entry, subject, buf.buf, buf.len);
  			entry->i = i;
 @@ -2883,11 +2883,12 @@ int rearrange_squash(void)
  
  		fd = open(todo_file, O_WRONLY);
  		if (fd < 0)
 -			res |= error_errno(_("Could not open %s"), todo_file);
 +			res = error_errno(_("could not open '%s'"), todo_file);
  		else if (write(fd, buf.buf, buf.len) < 0)
 -			res |= error_errno(_("Could not read %s."), todo_file);
 +			res = error_errno(_("could not read '%s'."), todo_file);
  		else if (ftruncate(fd, buf.len) < 0)
 -			res |= error_errno(_("Could not finish %s"), todo_file);
 +			res = error_errno(_("could not finish '%s'"),
 +					   todo_file);
  		close(fd);
  		strbuf_release(&buf);
  	}

-- 
2.12.2.windows.2.406.gd14a8f8640f



* [PATCH v2 1/9] rebase -i: generate the script via rebase--helper
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
@ 2017-04-25 13:51   ` Johannes Schindelin
  2017-04-26 10:45     ` Jeff King
  2017-04-25 13:51   ` [PATCH v2 2/9] rebase -i: remove useless indentation Johannes Schindelin
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:51 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending on whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  8 +++++++-
 git-rebase--interactive.sh | 44 +++++++++++++++++++++++---------------------
 sequencer.c                | 44 ++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  3 +++
 4 files changed, 77 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index ca1ebb2fa18..821058d452d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
 	struct replay_opts opts = REPLAY_OPTS_INIT;
+	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 				CONTINUE),
 		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
 				ABORT),
+		OPT_CMDMODE(0, "make-script", &command,
+			N_("make rebase script"), MAKE_SCRIPT),
 		OPT_END()
 	};
 
@@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_continue(&opts);
 	if (command == ABORT && argc == 1)
 		return !!sequencer_remove_state(&opts);
+	if (command == MAKE_SCRIPT && argc > 1)
+		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 2c9c0165b5a..609e150d38f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -785,6 +785,7 @@ collapse_todo_ids() {
 # each log message will be re-retrieved in order to normalize the
 # autosquash arrangement
 rearrange_squash () {
+	format=$(git config --get rebase.instructionFormat)
 	# extract fixup!/squash! lines and resolve any referenced sha1's
 	while read -r pick sha1 message
 	do
@@ -1210,26 +1211,27 @@ else
 	revisions=$onto...$orig_head
 	shortrevisions=$shorthead
 fi
-format=$(git config --get rebase.instructionFormat)
-# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
-git rev-list $merges_option --format="%m%H ${format:-%s}" \
-	--reverse --left-right --topo-order \
-	$revisions ${restrict_revision+^$restrict_revision} | \
-	sed -n "s/^>//p" |
-while read -r sha1 rest
-do
-
-	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
-	then
-		comment_out="$comment_char "
-	else
-		comment_out=
-	fi
+if test t != "$preserve_merges"
+then
+	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
+else
+	format=$(git config --get rebase.instructionFormat)
+	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
+	git rev-list $merges_option --format="%m%H ${format:-%s}" \
+		--reverse --left-right --topo-order \
+		$revisions ${restrict_revision+^$restrict_revision} | \
+		sed -n "s/^>//p" |
+	while read -r sha1 rest
+	do
+
+		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
+		then
+			comment_out="$comment_char "
+		else
+			comment_out=
+		fi
 
-	if test t != "$preserve_merges"
-	then
-		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
-	else
 		if test -z "$rebase_root"
 		then
 			preserve=t
@@ -1248,8 +1250,8 @@ do
 			touch "$rewritten"/$sha1
 			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
 		fi
-	fi
-done
+	done
+fi
 
 # Watch for commits that been dropped by --cherry-pick
 if test t = "$preserve_merges"
diff --git a/sequencer.c b/sequencer.c
index 77afecaebf0..d2d5bcd9b43 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2388,3 +2388,47 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
 
 	strbuf_release(&sob);
 }
+
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv)
+{
+	char *format = "%s";
+	struct pretty_print_context pp = {0};
+	struct strbuf buf = STRBUF_INIT;
+	struct rev_info revs;
+	struct commit *commit;
+
+	init_revisions(&revs, NULL);
+	revs.verbose_header = 1;
+	revs.max_parents = 1;
+	revs.cherry_pick = 1;
+	revs.limited = 1;
+	revs.reverse = 1;
+	revs.right_only = 1;
+	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
+	revs.topo_order = 1;
+
+	revs.pretty_given = 1;
+	git_config_get_string("rebase.instructionFormat", &format);
+	get_commit_format(format, &revs);
+	pp.fmt = revs.commit_format;
+	pp.output_encoding = get_log_output_encoding();
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		return error(_("make_script: unhandled options"));
+
+	if (prepare_revision_walk(&revs) < 0)
+		return error(_("make_script: error preparing revisions"));
+
+	while ((commit = get_revision(&revs))) {
+		strbuf_reset(&buf);
+		if (!keep_empty && is_original_commit_empty(commit))
+			strbuf_addf(&buf, "%c ", comment_line_char);
+		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
+		pretty_print_commit(&pp, commit, &buf);
+		strbuf_addch(&buf, '\n');
+		fputs(buf.buf, out);
+	}
+	strbuf_release(&buf);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index f885b68395f..83f2943b7a9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
 int sequencer_rollback(struct replay_opts *opts);
 int sequencer_remove_state(struct replay_opts *opts);
 
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.406.gd14a8f8640f
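
As a rough illustration of what each generated todo line looks like, the
following sketch formats a single line the way the loop in
sequencer_make_script() above appears to: "pick <hash> <oneline>", prefixed
with the comment character when the commit is empty and empty commits are
not kept. The helper name format_todo_line and its parameters are
hypothetical; the real code uses strbuf and the pretty-print machinery
instead of plain strings.

```c
#include <stdio.h>

/*
 * Format one line of the todo script into out. Empty commits are
 * commented out unless keep_empty is set. Returns what snprintf()
 * returned, i.e. the length the full line needs.
 */
static int format_todo_line(char *out, size_t size, int keep_empty,
			    int is_empty, char comment_char,
			    const char *hex, const char *subject)
{
	if (!keep_empty && is_empty)
		return snprintf(out, size, "%c pick %s %s\n",
				comment_char, hex, subject);
	return snprintf(out, size, "pick %s %s\n", hex, subject);
}
```

For example, an empty commit with comment character '#' and keep_empty
unset yields "# pick <hash> <oneline>", so the user still sees the commit
in the editor but it is skipped unless the line is uncommented.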




* [PATCH v2 2/9] rebase -i: remove useless indentation
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-04-25 13:51   ` Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:51 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: once we use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 609e150d38f..c40b1fd1d2e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -155,13 +155,13 @@ reschedule_last_action () {
 append_todo_help () {
 	gettext "
 Commands:
- p, pick = use commit
- r, reword = use commit, but edit the commit message
- e, edit = use commit, but stop for amending
- s, squash = use commit, but meld into previous commit
- f, fixup = like \"squash\", but discard this commit's log message
- x, exec = run command (the rest of the line) using shell
- d, drop = remove commit
+p, pick = use commit
+r, reword = use commit, but edit the commit message
+e, edit = use commit, but stop for amending
+s, squash = use commit, but meld into previous commit
+f, fixup = like \"squash\", but discard this commit's log message
+x, exec = run command (the rest of the line) using shell
+d, drop = remove commit
 
 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 2/9] rebase -i: remove useless indentation Johannes Schindelin
@ 2017-04-25 13:51   ` Johannes Schindelin
  2017-04-25 13:51   ` [PATCH v2 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:51 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; in this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline and, failing to find one, reuses the previous SHA-1 as
"oneline".

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index c40b1fd1d2e..214af0372ba 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -760,7 +760,12 @@ transform_todo_ids () {
 			;;
 		*)
 			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			rest="$sha1 ${rest#*[	 ]}"
+			if test "a$rest" = "a${rest#*[	 ]}"
+			then
+				rest=$sha1
+			else
+				rest="$sha1 ${rest#*[	 ]}"
+			fi
 			;;
 		esac
 		printf '%s\n' "$command${rest:+ }$rest"
-- 
2.12.2.windows.2.406.gd14a8f8640f
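
The fix above can be sketched in C as well, which may help when comparing
against the later rebase--helper version. This is only an illustration
under assumptions: the helper name replace_id is hypothetical, and unlike
the shell code it skips all whitespace between the commit name and the
oneline rather than exactly one character.

```c
#include <stdio.h>
#include <string.h>

/*
 * rest is the part of a todo line after the command, e.g.
 * "abc123 fix typo" or just "abc123". Write new_id followed by the
 * oneline into out, but only if a oneline is actually present, so
 * that a missing oneline is not replaced by an invented one.
 */
static void replace_id(char *out, size_t size,
		       const char *rest, const char *new_id)
{
	size_t id_len = strcspn(rest, " \t");
	const char *oneline = rest + id_len;

	oneline += strspn(oneline, " \t");
	if (*oneline)
		snprintf(out, size, "%s %s", new_id, oneline);
	else
		snprintf(out, size, "%s", new_id);
}
```

Given "abc123 fix typo" the oneline is carried over unchanged; given only
"abc123" the result is just the new commit name, which is the behavior the
patch introduces for the shell version.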




* [PATCH v2 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (2 preceding siblings ...)
  2017-04-25 13:51   ` [PATCH v2 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
@ 2017-04-25 13:51   ` Johannes Schindelin
  2017-04-25 13:52   ` [PATCH v2 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:51 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   | 10 +++++++-
 git-rebase--interactive.sh |  4 ++--
 sequencer.c                | 59 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 821058d452d..9444c8d6c60 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 				ABORT),
 		OPT_CMDMODE(0, "make-script", &command,
 			N_("make rebase script"), MAKE_SCRIPT),
+		OPT_CMDMODE(0, "shorten-sha1s", &command,
+			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+		OPT_CMDMODE(0, "expand-sha1s", &command,
+			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_END()
 	};
 
@@ -42,5 +46,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_remove_state(&opts);
 	if (command == MAKE_SCRIPT && argc > 1)
 		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
+	if (command == SHORTEN_SHA1S && argc == 1)
+		return !!transform_todo_ids(1);
+	if (command == EXPAND_SHA1S && argc == 1)
+		return !!transform_todo_ids(0);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 214af0372ba..52a19e0bdb3 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -774,11 +774,11 @@ transform_todo_ids () {
 }
 
 expand_todo_ids() {
-	transform_todo_ids
+	git rebase--helper --expand-sha1s
 }
 
 collapse_todo_ids() {
-	transform_todo_ids --short
+	git rebase--helper --shorten-sha1s
 }
 
 # Rearrange the todo list that has both "pick sha1 msg" and
diff --git a/sequencer.c b/sequencer.c
index d2d5bcd9b43..d6ae6407546 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2432,3 +2432,62 @@ int sequencer_make_script(int keep_empty, FILE *out,
 	strbuf_release(&buf);
 	return 0;
 }
+
+
+int transform_todo_ids(int shorten_sha1s)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	int fd, res, i;
+	FILE *out;
+
+	strbuf_reset(&todo_list.buf);
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+
+	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
+	if (res) {
+		todo_list_release(&todo_list);
+		return error(_("unusable instruction sheet: '%s'"), todo_file);
+	}
+
+	out = fopen(todo_file, "w");
+	if (!out) {
+		todo_list_release(&todo_list);
+		return error(_("unable to open '%s' for writing"), todo_file);
+	}
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+		int bol = item->offset_in_buf;
+		const char *p = todo_list.buf.buf + bol;
+		int eol = i + 1 < todo_list.nr ?
+			todo_list.items[i + 1].offset_in_buf :
+			todo_list.buf.len;
+
+		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
+			fwrite(p, eol - bol, 1, out);
+		else {
+			int eoc = strcspn(p, " \t");
+			const char *sha1 = shorten_sha1s ?
+				short_commit_name(item->commit) :
+				oid_to_hex(&item->commit->object.oid);
+
+			if (!eoc) {
+				p += strspn(p, " \t");
+				eoc = strcspn(p, " \t");
+			}
+
+			fprintf(out, "%.*s %s %.*s\n",
+				eoc, p, sha1, item->arg_len, item->arg);
+		}
+	}
+	fclose(out);
+	todo_list_release(&todo_list);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 83f2943b7a9..47a81034e76 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
+int transform_todo_ids(int shorten_sha1s);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 5/9] t3404: relax rebase.missingCommitsCheck tests
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (3 preceding siblings ...)
  2017-04-25 13:51   ` [PATCH v2 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-04-25 13:52   ` Johannes Schindelin
  2017-04-25 13:52   ` [PATCH v2 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3404-rebase-interactive.sh | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 33d392ba112..61113be08a4 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1242,20 +1242,13 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' '
 	test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the command isn't recognized in the following line:
- - badcmd $(git rev-list --oneline -1 master~1)
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad command' '
 	rebase_setup_and_clean bad-cmd &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
@@ -1277,20 +1270,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - edit XXXXXXX False commit
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad SHA-1' '
 	rebase_setup_and_clean bad-sha &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "edit XXXXXXX False commit" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 6/9] rebase -i: check for missing commits in the rebase--helper
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (4 preceding siblings ...)
  2017-04-25 13:52   ` [PATCH v2 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2017-04-25 13:52   ` Johannes Schindelin
  2017-04-25 13:52   ` [PATCH v2 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
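
Note for readers (not part of the commit message): the missing-commits
check in this patch boils down to two passes. The edited todo list is
parsed first and every commit it mentions is marked; the backup is
parsed next, and every commit that is still unmarked is reported as
dropped. Below is a minimal, self-contained sketch of that idea. It is
not the code from the patch: `sketch_item`, `sketch_find_missing` and
`SKETCH_MAX_COMMITS` are hypothetical names, small integers stand in for
commits, and a local `seen` array stands in for `commit->util`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a todo item: a command and a commit id. */
struct sketch_item {
	const char *command;	/* e.g. "pick", "drop", "exec" */
	int commit_id;		/* < 0 when the command takes no commit */
};

#define SKETCH_MAX_COMMITS 64

/*
 * Store in `missing` the ids of commits that appear in `backup` but
 * not in `edited`, and return how many were found.  The `seen` array
 * plays the role that commit->util plays in the patch.
 */
size_t sketch_find_missing(const struct sketch_item *backup, size_t backup_nr,
			   const struct sketch_item *edited, size_t edited_nr,
			   int *missing)
{
	char seen[SKETCH_MAX_COMMITS];
	size_t i, nr = 0;

	memset(seen, 0, sizeof(seen));

	/* First pass: mark every commit still mentioned after editing. */
	for (i = 0; i < edited_nr; i++) {
		int id = edited[i].commit_id;

		if (id < 0)
			continue;
		assert(id < SKETCH_MAX_COMMITS);
		seen[id] = 1;
	}

	/* Second pass: unmarked commits in the backup were dropped. */
	for (i = 0; i < backup_nr; i++) {
		int id = backup[i].commit_id;

		if (id < 0)
			continue;
		assert(id < SKETCH_MAX_COMMITS);
		if (!seen[id]) {
			missing[nr++] = id;
			seen[id] = 1;	/* report each commit only once */
		}
	}
	return nr;
}
```

A commit that was changed to `drop` still counts as mentioned, which is
why an explicit `drop` silences the warning while deleting the line
does not.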
 builtin/rebase--helper.c   |   7 +-
 git-rebase--interactive.sh | 164 ++-------------------------------------------
 sequencer.c                | 125 ++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 137 insertions(+), 160 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 9444c8d6c60..e706eac710d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+		CHECK_TODO_LIST
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -28,6 +29,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
 		OPT_CMDMODE(0, "expand-sha1s", &command,
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+		OPT_CMDMODE(0, "check-todo-list", &command,
+			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_END()
 	};
 
@@ -50,5 +53,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(1);
 	if (command == EXPAND_SHA1S && argc == 1)
 		return !!transform_todo_ids(0);
+	if (command == CHECK_TODO_LIST && argc == 1)
+		return !!check_todo_list();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 52a19e0bdb3..1649506e1e4 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -890,96 +890,6 @@ add_exec_commands () {
 	mv "$1.new" "$1"
 }
 
-# Check if the SHA-1 passed as an argument is a
-# correct one, if not then print $2 in "$todo".badsha
-# $1: the SHA-1 to test
-# $2: the line number of the input
-# $3: the input filename
-check_commit_sha () {
-	badsha=0
-	if test -z "$1"
-	then
-		badsha=1
-	else
-		sha1_verif="$(git rev-parse --verify --quiet $1^{commit})"
-		if test -z "$sha1_verif"
-		then
-			badsha=1
-		fi
-	fi
-
-	if test $badsha -ne 0
-	then
-		line="$(sed -n -e "${2}p" "$3")"
-		warn "$(eval_gettext "\
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - \$line")"
-		warn
-	fi
-
-	return $badsha
-}
-
-# prints the bad commits and bad commands
-# from the todolist in stdin
-check_bad_cmd_and_sha () {
-	retval=0
-	lineno=0
-	while read -r command rest
-	do
-		lineno=$(( $lineno + 1 ))
-		case $command in
-		"$comment_char"*|''|noop|x|exec)
-			# Doesn't expect a SHA-1
-			;;
-		"$cr")
-			# Work around CR left by "read" (e.g. with Git for
-			# Windows' Bash).
-			;;
-		pick|p|drop|d|reword|r|edit|e|squash|s|fixup|f)
-			if ! check_commit_sha "${rest%%[ 	]*}" "$lineno" "$1"
-			then
-				retval=1
-			fi
-			;;
-		*)
-			line="$(sed -n -e "${lineno}p" "$1")"
-			warn "$(eval_gettext "\
-Warning: the command isn't recognized in the following line:
- - \$line")"
-			warn
-			retval=1
-			;;
-		esac
-	done <"$1"
-	return $retval
-}
-
-# Print the list of the SHA-1 of the commits
-# from stdin to stdout
-todo_list_to_sha_list () {
-	git stripspace --strip-comments |
-	while read -r command sha1 rest
-	do
-		case $command in
-		"$comment_char"*|''|noop|x|"exec")
-			;;
-		*)
-			long_sha=$(git rev-list --no-walk "$sha1" 2>/dev/null)
-			printf "%s\n" "$long_sha"
-			;;
-		esac
-	done
-}
-
-# Use warn for each line in stdin
-warn_lines () {
-	while read -r line
-	do
-		warn " - $line"
-	done
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
 	GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -994,74 +904,6 @@ get_missing_commit_check_level () {
 	printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# Check if the user dropped some commits by mistake
-# Behaviour determined by rebase.missingCommitsCheck.
-# Check if there is an unrecognized command or a
-# bad SHA-1 in a command.
-check_todo_list () {
-	raise_error=f
-
-	check_level=$(get_missing_commit_check_level)
-
-	case "$check_level" in
-	warn|error)
-		# Get the SHA-1 of the commits
-		todo_list_to_sha_list <"$todo".backup >"$todo".oldsha1
-		todo_list_to_sha_list <"$todo" >"$todo".newsha1
-
-		# Sort the SHA-1 and compare them
-		sort -u "$todo".oldsha1 >"$todo".oldsha1+
-		mv "$todo".oldsha1+ "$todo".oldsha1
-		sort -u "$todo".newsha1 >"$todo".newsha1+
-		mv "$todo".newsha1+ "$todo".newsha1
-		comm -2 -3 "$todo".oldsha1 "$todo".newsha1 >"$todo".miss
-
-		# Warn about missing commits
-		if test -s "$todo".miss
-		then
-			test "$check_level" = error && raise_error=t
-
-			warn "$(gettext "\
-Warning: some commits may have been dropped accidentally.
-Dropped commits (newer to older):")"
-
-			# Make the list user-friendly and display
-			opt="--no-walk=sorted --format=oneline --abbrev-commit --stdin"
-			git rev-list $opt <"$todo".miss | warn_lines
-
-			warn "$(gettext "\
-To avoid this message, use \"drop\" to explicitly remove a commit.
-
-Use 'git config rebase.missingCommitsCheck' to change the level of warnings.
-The possible behaviours are: ignore, warn, error.")"
-			warn
-		fi
-		;;
-	ignore)
-		;;
-	*)
-		warn "$(eval_gettext "Unrecognized setting \$check_level for option rebase.missingCommitsCheck. Ignoring.")"
-		;;
-	esac
-
-	if ! check_bad_cmd_and_sha "$todo"
-	then
-		raise_error=t
-	fi
-
-	if test $raise_error = t
-	then
-		# Checkout before the first commit of the
-		# rebase: this way git rebase --continue
-		# will work correctly as it expects HEAD to be
-		# placed before the commit of the next action
-		checkout_onto
-
-		warn "$(gettext "You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.")"
-		die "$(gettext "Or you can abort the rebase with 'git rebase --abort'.")"
-	fi
-}
-
 # The whole contents of this file is run by dot-sourcing it from
 # inside a shell function.  It used to be that "return"s we see
 # below were not inside any function, and expected to return
@@ -1322,7 +1164,11 @@ git_sequence_editor "$todo" ||
 has_action "$todo" ||
 	return 2
 
-check_todo_list
+git rebase--helper --check-todo-list || {
+	ret=$?
+	checkout_onto
+	exit $ret
+}
 
 expand_todo_ids
 
diff --git a/sequencer.c b/sequencer.c
index d6ae6407546..3a935fa4cbc 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2491,3 +2491,128 @@ int transform_todo_ids(int shorten_sha1s)
 	todo_list_release(&todo_list);
 	return 0;
 }
+
+enum check_level {
+	CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
+};
+
+static enum check_level get_missing_commit_check_level(void)
+{
+	const char *value;
+
+	if (git_config_get_value("rebase.missingcommitscheck", &value) ||
+			!strcasecmp("ignore", value))
+		return CHECK_IGNORE;
+	if (!strcasecmp("warn", value))
+		return CHECK_WARN;
+	if (!strcasecmp("error", value))
+		return CHECK_ERROR;
+	warning(_("unrecognized setting %s for option "
+		  "rebase.missingCommitsCheck. Ignoring."), value);
+	return CHECK_IGNORE;
+}
+
+/*
+ * Check if the user dropped some commits by mistake
+ * Behaviour determined by rebase.missingCommitsCheck.
+ * Check if there is an unrecognized command or a
+ * bad SHA-1 in a command.
+ */
+int check_todo_list(void)
+{
+	enum check_level check_level = get_missing_commit_check_level();
+	struct strbuf todo_file = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct commit_list *missing = NULL;
+	int raise_error = 0, res = 0, fd, i;
+
+	strbuf_addstr(&todo_file, rebase_path_todo());
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	raise_error = res =
+		parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	if (check_level == CHECK_IGNORE)
+		goto leave_check;
+
+	/* Get the SHA-1 of the commits */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit)
+			commit->util = todo_list.items + i;
+	}
+
+	todo_list_release(&todo_list);
+	strbuf_addstr(&todo_file, ".backup");
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	strbuf_release(&todo_file);
+	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	/* Find commits that are missing after editing */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit && !commit->util) {
+			commit_list_insert(commit, &missing);
+			commit->util = todo_list.items + i;
+		}
+	}
+
+	/* Warn about missing commits */
+	if (!missing)
+		goto leave_check;
+
+	if (check_level == CHECK_ERROR)
+		raise_error = res = 1;
+
+	fprintf(stderr,
+		_("Warning: some commits may have been dropped accidentally.\n"
+		"Dropped commits (newer to older):\n"));
+
+	/* Make the list user-friendly and display */
+	while (missing) {
+		struct commit *commit = pop_commit(&missing);
+		struct todo_item *item = commit->util;
+
+		fprintf(stderr, " - %s %.*s\n", short_commit_name(commit),
+			item->arg_len, item->arg);
+	}
+	free_commit_list(missing);
+
+	fprintf(stderr, _("To avoid this message, use \"drop\" to "
+		"explicitly remove a commit.\n\n"
+		"Use 'git config rebase.missingCommitsCheck' to change "
+		"the level of warnings.\n"
+		"The possible behaviours are: ignore, warn, error.\n\n"));
+
+leave_check:
+	strbuf_release(&todo_file);
+	todo_list_release(&todo_list);
+
+	if (raise_error)
+		fprintf(stderr,
+			_("You can fix this with 'git rebase --edit-todo' "
+			  "and then run 'git rebase --continue'.\n"
+			  "Or you can abort the rebase with 'git rebase"
+			  " --abort'.\n"));
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 47a81034e76..4978a61b83b 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -49,6 +49,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
 int transform_todo_ids(int shorten_sha1s);
+int check_todo_list(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 7/9] rebase -i: skip unnecessary picks using the rebase--helper
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (5 preceding siblings ...)
  2017-04-25 13:52   ` [PATCH v2 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2017-04-25 13:52   ` Johannes Schindelin
  2017-04-26 10:55     ` Jeff King
  2017-04-25 13:52   ` [PATCH v2 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punted instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
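
Note for readers (not part of the commit message): a pick is
unnecessary when the commit's parent is the commit we are currently
sitting on, so the pick would only recreate the same commit. The loop
below is a simplified sketch of that walk, not the code from the patch.
The names `sketch_todo`, `sketch_command` and `sketch_skip_picks` are
hypothetical, commit ids are plain strings, and every commit is assumed
to have at most one parent (the patch additionally stops at merge
commits).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_command { SKETCH_PICK, SKETCH_EDIT, SKETCH_COMMENT };

struct sketch_todo {
	enum sketch_command command;
	const char *id;		/* commit named by the line, NULL for comments */
	const char *parent;	/* its single parent, NULL for a root commit */
};

/*
 * Return the number of leading todo lines that can be skipped and
 * update *onto to the last commit that was fast-forwarded over.
 */
size_t sketch_skip_picks(const struct sketch_todo *todo, size_t nr,
			 const char **onto)
{
	size_t i;

	assert(onto && *onto);
	for (i = 0; i < nr; i++) {
		const struct sketch_todo *item = todo + i;

		if (item->command == SKETCH_COMMENT)
			continue;	/* comments are carried along */
		if (item->command != SKETCH_PICK)
			break;		/* anything else must be executed */
		if (!item->parent)
			break;		/* root commit: punt */
		if (strcmp(item->parent, *onto))
			break;		/* parent changed: a real pick is needed */
		*onto = item->id;	/* the pick would be a no-op */
	}
	return i;
}
```

The returned count corresponds to the lines that the patch moves from
the todo file to the done file, and the updated `*onto` corresponds to
the object name the helper prints for the shell script to pick up.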
 builtin/rebase--helper.c   |  6 +++-
 git-rebase--interactive.sh | 41 ++-------------------
 sequencer.c                | 90 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  1 +
 4 files changed, 99 insertions(+), 39 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index e706eac710d..de3ccd9bfbc 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -31,6 +31,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_CMDMODE(0, "check-todo-list", &command,
 			N_("check the todo list"), CHECK_TODO_LIST),
+		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
+			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
 		OPT_END()
 	};
 
@@ -55,5 +57,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(0);
 	if (command == CHECK_TODO_LIST && argc == 1)
 		return !!check_todo_list();
+	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
+		return !!skip_unnecessary_picks();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 1649506e1e4..931bc09e0cf 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -713,43 +713,6 @@ do_rest () {
 	done
 }
 
-# skip picking commits whose parents are unchanged
-skip_unnecessary_picks () {
-	fd=3
-	while read -r command rest
-	do
-		# fd=3 means we skip the command
-		case "$fd,$command" in
-		3,pick|3,p)
-			# pick a commit whose parent is current $onto -> skip
-			sha1=${rest%% *}
-			case "$(git rev-parse --verify --quiet "$sha1"^)" in
-			"$onto"*)
-				onto=$sha1
-				;;
-			*)
-				fd=1
-				;;
-			esac
-			;;
-		3,"$comment_char"*|3,)
-			# copy comments
-			;;
-		*)
-			fd=1
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest" >&$fd
-	done <"$todo" >"$todo.new" 3>>"$done" &&
-	mv -f "$todo".new "$todo" &&
-	case "$(peek_next_command)" in
-	squash|s|fixup|f)
-		record_in_rewritten "$onto"
-		;;
-	esac ||
-		die "$(gettext "Could not skip unnecessary pick commands")"
-}
-
 transform_todo_ids () {
 	while read -r command rest
 	do
@@ -1172,7 +1135,9 @@ git rebase--helper --check-todo-list || {
 
 expand_todo_ids
 
-test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
+test -d "$rewritten" || test -n "$force_rebase" ||
+onto="$(git rebase--helper --skip-unnecessary-picks)" ||
+die "Could not skip unnecessary pick commands"
 
 checkout_onto
 if test -z "$rebase_root" && test ! -d "$rewritten"
diff --git a/sequencer.c b/sequencer.c
index 3a935fa4cbc..bbbc98c9116 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2616,3 +2616,93 @@ int check_todo_list(void)
 
 	return res;
 }
+
+/* skip picking commits whose parents are unchanged */
+int skip_unnecessary_picks(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct strbuf buf = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct object_id onto_oid, *oid = &onto_oid, *parent_oid;
+	int fd, i;
+
+	if (!read_oneliner(&buf, rebase_path_onto(), 0))
+		return error(_("could not read 'onto'"));
+	if (get_sha1(buf.buf, onto_oid.hash)) {
+		strbuf_release(&buf);
+		return error(_("need a HEAD to fixup"));
+	}
+	strbuf_release(&buf);
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0) {
+		return error_errno(_("could not open '%s'"), todo_file);
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+
+		if (item->command >= TODO_NOOP)
+			continue;
+		if (item->command != TODO_PICK)
+			break;
+		if (parse_commit(item->commit)) {
+			todo_list_release(&todo_list);
+			return error(_("could not parse commit '%s'"),
+				oid_to_hex(&item->commit->object.oid));
+		}
+		if (!item->commit->parents)
+			break; /* root commit */
+		if (item->commit->parents->next)
+			break; /* merge commit */
+		parent_oid = &item->commit->parents->item->object.oid;
+		if (hashcmp(parent_oid->hash, oid->hash))
+			break;
+		oid = &item->commit->object.oid;
+	}
+	if (i > 0) {
+		int offset = i < todo_list.nr ?
+			todo_list.items[i].offset_in_buf : todo_list.buf.len;
+		const char *done_path = rebase_path_done();
+
+		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
+		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("could not write to '%s'"),
+				done_path);
+		}
+		close(fd);
+
+		fd = open(rebase_path_todo(), O_WRONLY, 0666);
+		if (write_in_full(fd, todo_list.buf.buf + offset,
+				todo_list.buf.len - offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("could not write to '%s'"),
+				rebase_path_todo());
+		}
+		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
+			todo_list_release(&todo_list);
+			return error_errno(_("could not truncate '%s'"),
+				rebase_path_todo());
+		}
+		close(fd);
+
+		todo_list.current = i;
+		if (is_fixup(peek_command(&todo_list, 0)))
+			record_in_rewritten(oid, peek_command(&todo_list, 0));
+	}
+
+	todo_list_release(&todo_list);
+	printf("%s\n", oid_to_hex(oid));
+
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 4978a61b83b..28e1fc1e9bb 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -50,6 +50,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
+int skip_unnecessary_picks(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 8/9] t3415: test fixup with wrapped oneline
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (6 preceding siblings ...)
  2017-04-25 13:52   ` [PATCH v2 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2017-04-25 13:52   ` Johannes Schindelin
  2017-04-25 13:52   ` [PATCH v2 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
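
Note for readers (not part of the commit message): "unwrapping" here
means joining the lines of the first paragraph of the commit message
with single spaces, so that "To\nfixup" becomes "To fixup". A minimal
sketch of that transformation follows. The function name
`sketch_unwrap_subject` is hypothetical, and the sketch makes no claim
to match Git's own subject formatting in every corner case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy the first paragraph of `message` into `out` (of size `size`),
 * replacing each line break inside it with a single space.
 * Returns the length of the unwrapped subject.
 */
size_t sketch_unwrap_subject(const char *message, char *out, size_t size)
{
	size_t len = 0;

	assert(size > 0);
	while (*message == '\n')
		message++;	/* skip leading blank lines */
	while (*message && len + 1 < size) {
		if (*message == '\n') {
			if (message[1] == '\n' || !message[1])
				break;	/* blank line or end: subject is over */
			out[len++] = ' ';
		} else {
			out[len++] = *message;
		}
		message++;
	}
	out[len] = '\0';
	return len;
}
```

With this, the subject of the first commit in the new test and the text
after "fixup! " in the second commit compare equal, which is what the
autosquash matching relies on.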
 t/t3415-rebase-autosquash.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 48346f1cc0c..9fd629a6e21 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -304,4 +304,18 @@ test_expect_success 'extra spaces after fixup!' '
 	test $base = $parent
 '
 
+test_expect_success 'wrapped original subject' '
+	if test -d .git/rebase-merge; then git rebase --abort; fi &&
+	base=$(git rev-parse HEAD) &&
+	echo "wrapped subject" >wrapped &&
+	git add wrapped &&
+	test_tick &&
+	git commit --allow-empty -m "$(printf "To\nfixup")" &&
+	test_tick &&
+	git commit --allow-empty -m "fixup! To fixup" &&
+	git rebase -i --autosquash --keep-empty HEAD~2 &&
+	parent=$(git rev-parse HEAD^) &&
+	test $base = $parent
+'
+
 test_done
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v2 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (7 preceding siblings ...)
  2017-04-25 13:52   ` [PATCH v2 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
@ 2017-04-25 13:52   ` Johannes Schindelin
  2017-04-26  3:32   ` [PATCH v2 0/9] The final building block for a faster rebase -i Junio C Hamano
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
  10 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-25 13:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley

This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case. Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

While at it, clarify how the fixup/squash feature works in `git rebase
-i`'s man page.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
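
Note for readers (not part of the commit message): the matching order
described above is "exact subject first, prefix as a fall-back". The
sketch below illustrates only that order. It is not the code from the
patch: `sketch_strip_prefixes` and `sketch_find_target` are hypothetical
names, a plain linear scan stands in for the hash map lookup (so the
sketch is quadratic even in the common case), matching by commit hash is
left out, and only items earlier in the list are considered as targets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Strip every leading "fixup! " or "squash! " from `subject`. */
const char *sketch_strip_prefixes(const char *subject)
{
	for (;;) {
		if (!strncmp(subject, "fixup! ", 7))
			subject += 7;
		else if (!strncmp(subject, "squash! ", 8))
			subject += 8;
		else
			return subject;
	}
}

/*
 * Find the item among subjects[0..nr-1] that the fixup/squash
 * commit at index `fixup` should follow; -1 if there is none.
 * Only earlier items are considered.
 */
int sketch_find_target(const char *const *subjects, int nr, int fixup)
{
	const char *wanted;
	int i;

	assert(fixup >= 0 && fixup < nr);
	wanted = sketch_strip_prefixes(subjects[fixup]);
	if (wanted == subjects[fixup])
		return -1;	/* not a fixup!/squash! commit */

	/* Common case: the whole subject matches. */
	for (i = 0; i < fixup; i++)
		if (!strcmp(subjects[i], wanted))
			return i;

	/* Fall-back: `wanted` is merely a prefix of a subject. */
	for (i = 0; i < fixup; i++)
		if (!strncmp(subjects[i], wanted, strlen(wanted)))
			return i;
	return -1;
}
```

In the patch, the first loop is what the hash map replaces with a
single lookup; the second loop is the part that remains quadratic in
the worst case.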
 Documentation/git-rebase.txt |  16 ++--
 builtin/rebase--helper.c     |   6 +-
 git-rebase--interactive.sh   |  90 +-------------------
 sequencer.c                  | 197 +++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 6 files changed, 214 insertions(+), 98 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 67d48e68831..da79fbda5b3 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -425,13 +425,15 @@ without an explicit `--interactive`.
 --autosquash::
 --no-autosquash::
 	When the commit log message begins with "squash! ..." (or
-	"fixup! ..."), and there is a commit whose title begins with
-	the same ..., automatically modify the todo list of rebase -i
-	so that the commit marked for squashing comes right after the
-	commit to be modified, and change the action of the moved
-	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
-	"fixup! " or "squash! " after the first, in case you referred to an
-	earlier fixup/squash with `git commit --fixup/--squash`.
+	"fixup! ..."), and there is already a commit in the todo list that
+	matches the same `...`, automatically modify the todo list of rebase
+	-i so that the commit marked for squashing comes right after the
+	commit to be modified, and change the action of the moved commit
+	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
+	the commit subject matches, or if the `...` refers to the commit's
+	hash. As a fall-back, partial matches of the commit subject work,
+	too.  The recommended way to create fixup/squash commits is by using
+	the `--fixup`/`--squash` options of linkgit:git-commit[1].
 +
 This option is only valid when the `--interactive` option is used.
 +
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index de3ccd9bfbc..e6591f01112 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
 			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+		OPT_CMDMODE(0, "rearrange-squash", &command,
+			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
 		OPT_END()
 	};
 
@@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!check_todo_list();
 	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
 		return !!skip_unnecessary_picks();
+	if (command == REARRANGE_SQUASH && argc == 1)
+		return !!rearrange_squash();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 931bc09e0cf..d39fe4f5fb7 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -744,94 +744,6 @@ collapse_todo_ids() {
 	git rebase--helper --shorten-sha1s
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-	format=$(git config --get rebase.instructionFormat)
-	# extract fixup!/squash! lines and resolve any referenced sha1's
-	while read -r pick sha1 message
-	do
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		case "$message" in
-		"squash! "*|"fixup! "*)
-			action="${message%%!*}"
-			rest=$message
-			prefix=
-			# skip all squash! or fixup! (but save for later)
-			while :
-			do
-				case "$rest" in
-				"squash! "*|"fixup! "*)
-					prefix="$prefix${rest%%!*},"
-					rest="${rest#*! }"
-					;;
-				*)
-					break
-					;;
-				esac
-			done
-			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
-			# if it's a single word, try to resolve to a full sha1 and
-			# emit a second copy. This allows us to match on both message
-			# and on sha1 prefix
-			if test "${rest#* }" = "$rest"; then
-				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
-				if test -n "$fullsha"; then
-					# prefix the action to uniquely identify this line as
-					# intended for full sha1 match
-					echo "$sha1 +$action $prefix $fullsha"
-				fi
-			fi
-		esac
-	done >"$1.sq" <"$1"
-	test -s "$1.sq" || return
-
-	used=
-	while read -r pick sha1 message
-	do
-		case " $used" in
-		*" $sha1 "*) continue ;;
-		esac
-		printf '%s\n' "$pick $sha1 $message"
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		used="$used$sha1 "
-		while read -r squash action msg_prefix msg_content
-		do
-			case " $used" in
-			*" $squash "*) continue ;;
-			esac
-			emit=0
-			case "$action" in
-			+*)
-				action="${action#+}"
-				# full sha1 prefix test
-				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
-			*)
-				# message prefix test
-				case "$message" in "$msg_content"*) emit=1;; esac ;;
-			esac
-			if test $emit = 1; then
-				if test -n "${format}"
-				then
-					msg_content=$(git log -n 1 --format="${format}" ${squash})
-				else
-					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
-				fi
-				printf '%s\n' "$action $squash $msg_content"
-				used="$used$squash "
-			fi
-		done <"$1.sq"
-	done >"$1.rearranged" <"$1"
-	cat "$1.rearranged" >"$1"
-	rm -f "$1.sq" "$1.rearranged"
-}
-
 # Add commands after a pick or after a squash/fixup serie
 # in the todo list.
 add_exec_commands () {
@@ -1091,7 +1003,7 @@ then
 fi
 
 test -s "$todo" || echo noop >> "$todo"
-test -n "$autosquash" && rearrange_squash "$todo"
+test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
 test -n "$cmd" && add_exec_commands "$todo"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
diff --git a/sequencer.c b/sequencer.c
index bbbc98c9116..2b07fb9e0ce 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -19,6 +19,7 @@
 #include "trailer.h"
 #include "log-tree.h"
 #include "wt-status.h"
+#include "hashmap.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2706,3 +2707,199 @@ int skip_unnecessary_picks(void)
 
 	return 0;
 }
+
+struct subject2item_entry {
+	struct hashmap_entry entry;
+	int i;
+	char subject[FLEX_ARRAY];
+};
+
+static int subject2item_cmp(const struct subject2item_entry *a,
+	const struct subject2item_entry *b, const void *key)
+{
+	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
+}
+
+/*
+ * Rearrange the todo list that has both "pick sha1 msg" and "pick sha1
+ * fixup!/squash! msg" in it so that the latter is put immediately after the
+ * former, and change "pick" to "fixup"/"squash".
+ *
+ * Note that if the config has specified a custom instruction format, each log
+ * message will have to be retrieved from the commit (as the oneline in the
+ * script cannot be trusted) in order to normalize the autosquash arrangement.
+ */
+int rearrange_squash(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct hashmap subject2item;
+	int res = 0, rearranged = 0, *next, *tail, fd, i;
+	char **subjects;
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	/*
+	 * The hashmap maps onelines to the respective todo list index.
+	 *
+	 * If any items need to be rearranged, the next[i] value will indicate
+	 * which item was moved directly after the i'th.
+	 *
+	 * In that case, tail[i] will indicate the index of the latest item to
+	 * be moved to appear after the i'th.
+	 */
+	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
+		     todo_list.nr);
+	ALLOC_ARRAY(next, todo_list.nr);
+	ALLOC_ARRAY(tail, todo_list.nr);
+	ALLOC_ARRAY(subjects, todo_list.nr);
+	for (i = 0; i < todo_list.nr; i++) {
+		struct strbuf buf = STRBUF_INIT;
+		struct todo_item *item = todo_list.items + i;
+		const char *commit_buffer, *subject, *p;
+		int i2 = -1;
+		struct subject2item_entry *entry;
+
+		next[i] = tail[i] = -1;
+		if (item->command >= TODO_EXEC) {
+			subjects[i] = NULL;
+			continue;
+		}
+
+		if (is_fixup(item->command)) {
+			todo_list_release(&todo_list);
+			return error(_("the script was already rearranged."));
+		}
+
+		item->commit->util = item;
+
+		parse_commit(item->commit);
+		commit_buffer = get_commit_buffer(item->commit, NULL);
+		find_commit_subject(commit_buffer, &subject);
+		format_subject(&buf, subject, " ");
+		subject = subjects[i] = buf.buf;
+		unuse_commit_buffer(item->commit, commit_buffer);
+		if ((skip_prefix(subject, "fixup! ", &p) ||
+		     skip_prefix(subject, "squash! ", &p))) {
+			struct commit *commit2;
+
+			for (;;) {
+				while (isspace(*p))
+					p++;
+				if (!skip_prefix(p, "fixup! ", &p) &&
+				    !skip_prefix(p, "squash! ", &p))
+					break;
+			}
+
+			if ((entry = hashmap_get_from_hash(&subject2item,
+							   strhash(p), p)))
+				/* found by title */
+				i2 = entry->i;
+			else if (!strchr(p, ' ') &&
+				 (commit2 =
+				  lookup_commit_reference_by_name(p)) &&
+				 commit2->util)
+				/* found by commit name */
+				i2 = (struct todo_item *)commit2->util
+					- todo_list.items;
+			else {
+				/* copy can be a prefix of the commit subject */
+				for (i2 = 0; i2 < i; i2++)
+					if (subjects[i2] &&
+					    starts_with(subjects[i2], p))
+						break;
+				if (i2 == i)
+					i2 = -1;
+			}
+		}
+		if (i2 >= 0) {
+			rearranged = 1;
+			todo_list.items[i].command =
+				starts_with(subject, "fixup!") ?
+				TODO_FIXUP : TODO_SQUASH;
+			if (next[i2] < 0)
+				next[i2] = i;
+			else
+				next[tail[i2]] = i;
+			tail[i2] = i;
+		} else if (!hashmap_get_from_hash(&subject2item,
+						strhash(subject), subject)) {
+			FLEX_ALLOC_MEM(entry, subject, buf.buf, buf.len);
+			entry->i = i;
+			hashmap_entry_init(entry, strhash(entry->subject));
+			hashmap_put(&subject2item, entry);
+		}
+		strbuf_detach(&buf, NULL);
+	}
+
+	if (rearranged) {
+		struct strbuf buf = STRBUF_INIT;
+		char *format = NULL;
+
+		git_config_get_string("rebase.instructionFormat", &format);
+		for (i = 0; i < todo_list.nr; i++) {
+			enum todo_command command = todo_list.items[i].command;
+			int cur = i;
+
+			/*
+			 * Initially, all commands are 'pick's. If it is a
+			 * fixup or a squash now, we have rearranged it.
+			 */
+			if (is_fixup(command))
+				continue;
+
+			while (cur >= 0) {
+				int offset = todo_list.items[cur].offset_in_buf;
+				int end_offset = cur + 1 < todo_list.nr ?
+					todo_list.items[cur + 1].offset_in_buf :
+					todo_list.buf.len;
+				char *bol = todo_list.buf.buf + offset;
+				char *eol = todo_list.buf.buf + end_offset;
+
+				/* replace 'pick', by 'fixup' or 'squash' */
+				command = todo_list.items[cur].command;
+				if (is_fixup(command)) {
+					strbuf_addstr(&buf,
+						todo_command_info[command].str);
+					bol += strcspn(bol, " \t");
+				}
+
+				strbuf_add(&buf, bol, eol - bol);
+
+				cur = next[cur];
+			}
+		}
+
+		fd = open(todo_file, O_WRONLY);
+		if (fd < 0)
+			res = error_errno(_("could not open '%s'"), todo_file);
+		else if (write(fd, buf.buf, buf.len) < 0)
+			res = error_errno(_("could not write to '%s'"), todo_file);
+		else if (ftruncate(fd, buf.len) < 0)
+			res = error_errno(_("could not finish '%s'"),
+					   todo_file);
+		close(fd);
+		strbuf_release(&buf);
+	}
+
+	free(next);
+	free(tail);
+	for (i = 0; i < todo_list.nr; i++)
+		free(subjects[i]);
+	free(subjects);
+	hashmap_free(&subject2item, 1);
+	todo_list_release(&todo_list);
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 28e1fc1e9bb..1c94bec7622 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -51,6 +51,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
+int rearrange_squash(void);
 
 extern const char sign_off_header[];
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 9fd629a6e21..b9e26008a79 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -278,7 +278,7 @@ set_backup_editor () {
 	test_set_editor "$PWD/backup-editor.sh"
 }
 
-test_expect_failure 'autosquash with multiple empty patches' '
+test_expect_success 'autosquash with multiple empty patches' '
 	test_tick &&
 	git commit --allow-empty -m "empty" &&
 	test_tick &&
-- 
2.12.2.windows.2.406.gd14a8f8640f
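
The next[]/tail[] bookkeeping that rearrange_squash() uses to queue fixup/squash items behind their targets can be looked at in isolation. What follows is a minimal sketch, not code from the patch: chain_init(), chain_append() and chain_emit() are hypothetical names, items are plain array indices rather than todo list entries, and the `moved` array stands in for the is_fixup() check.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helpers, not part of sequencer.c: next[i] is the index of
 * the item emitted directly after item i (or -1), and tail[i] is the last
 * item currently chained after item i (or -1).
 */
static void chain_init(int *next, int *tail, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		next[i] = tail[i] = -1;
}

/* Queue item `i` to be emitted after `target`, behind earlier followers. */
static void chain_append(int *next, int *tail, int target, int i)
{
	if (next[target] < 0)
		next[target] = i;
	else
		next[tail[target]] = i;
	tail[target] = i;
}

/*
 * Write the rearranged order into `out`. Items that were moved
 * (moved[i] != 0) are skipped at their original position because they
 * are emitted as part of their target's chain. Returns the number of
 * indices written.
 */
static size_t chain_emit(const int *next, const int *moved, size_t nr,
			 int *out)
{
	size_t i, n = 0;
	int cur;

	for (i = 0; i < nr; i++) {
		if (moved[i])
			continue;
		for (cur = (int)i; cur >= 0; cur = next[cur])
			out[n++] = cur;
	}
	return n;
}
```

With five items where items 2 and 4 both target item 0, appending 2 and then 4 to item 0's chain makes chain_emit() produce the order 0, 2, 4, 1, 3. Keeping tail[] next to next[] lets each append run in constant time instead of walking the chain.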

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* Re: [PATCH v2 0/9] The final building block for a faster rebase -i
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (8 preceding siblings ...)
  2017-04-25 13:52   ` [PATCH v2 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-04-26  3:32   ` Junio C Hamano
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
  10 siblings, 0 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-04-26  3:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> This patch series reimplements the expensive pre- and post-processing of
> the todo script in C.
>
> And it concludes the work I did to accelerate rebase -i.

I noticed (this is merely "I felt"; I didn't measure) that recent
"rebase -i" sessions I do on a Linux box are already plenty faster
than before, and it is good to finally see the end of the long road.

Thanks.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v2 1/9] rebase -i: generate the script via rebase--helper
  2017-04-25 13:51   ` [PATCH v2 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-04-26 10:45     ` Jeff King
  2017-04-26 11:34       ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff King @ 2017-04-26 10:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Philip Oakley

On Tue, Apr 25, 2017 at 03:51:49PM +0200, Johannes Schindelin wrote:

> --- a/sequencer.c
> +++ b/sequencer.c
> [...]
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = "%s";

I'm surprised the compiler doesn't complain about assigning a string
literal to a non-const pointer. It makes me worried that we would call
free() on it later. We don't, but that means...

> +	git_config_get_string("rebase.instructionFormat", &format);

...that this assignment to "format" leaks.

So perhaps you'd want to xstrdup the literal, and then make sure the
result is freed? Or alternatively use an extra level of indirection (a
to_free pointer to store the config value, and then a const pointer for
"format").
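
The second alternative (a to_free pointer plus a const pointer) could look roughly like the sketch below. It is only an illustration under stated assumptions: lookup_setting() is a hypothetical stand-in for the config lookup (it allocates a copy of the value when one is configured), and effective_format() is a made-up wrapper so that the pattern can be exercised on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a config lookup: if a value is configured,
 * store a newly allocated copy in *dest and return 0; otherwise leave
 * *dest untouched and return 1.
 */
static int lookup_setting(const char *configured, char **dest)
{
	size_t len;
	char *copy;

	if (!configured)
		return 1;
	len = strlen(configured) + 1;
	copy = malloc(len);
	if (!copy)
		return 1;
	memcpy(copy, configured, len);
	*dest = copy;
	return 0;
}

/* Copy the effective format into `out`, falling back to "%s". */
static void effective_format(const char *configured, char *out, size_t size)
{
	const char *format = "%s";	/* string literal: never freed */
	char *to_free = NULL;		/* owns the configured value, if any */

	if (!lookup_setting(configured, &to_free))
		format = to_free;

	snprintf(out, size, "%s", format);

	free(to_free);			/* free(NULL) is a no-op */
}
```

The point of the split is that `format` is only ever read, so it may point at a literal, while `to_free` is the single owner of any allocated value and is released unconditionally.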

-Peff

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v2 7/9] rebase -i: skip unnecessary picks using the rebase--helper
  2017-04-25 13:52   ` [PATCH v2 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2017-04-26 10:55     ` Jeff King
  2017-04-26 11:31       ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff King @ 2017-04-26 10:55 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Philip Oakley

On Tue, Apr 25, 2017 at 03:52:10PM +0200, Johannes Schindelin wrote:

> diff --git a/sequencer.c b/sequencer.c
> index 3a935fa4cbc..bbbc98c9116 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2616,3 +2616,93 @@ int check_todo_list(void)
>  
>  	return res;
>  }
> +
> +/* skip picking commits whose parents are unchanged */
> +int skip_unnecessary_picks(void)

Coverity warns of some descriptor leaks in this function (and in
rearrange_squash). I think you get those emails, so I won't repeat the
details here. But while looking at them I did notice something it
didn't mention:

> +	if (i > 0) {
> +		int offset = i < todo_list.nr ?
> +			todo_list.items[i].offset_in_buf : todo_list.buf.len;
> +		const char *done_path = rebase_path_done();
> +
> +		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
> +		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
> +			todo_list_release(&todo_list);
> +			return error_errno(_("could not write to '%s'"),
> +				done_path);
> +		}
> +		close(fd);

This should probably check the result of open(). I know write_in_full()
will fail if fd is -1, but we'd rather show the user the errno from
open(), not EBADF.

Technically the free() calls from todo_list_release() can also munge
errno before you print it. You might want to just call error_errno()
first, then do the cleanup (including the missing close()).
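
The suggested ordering (check open(), report the error while errno is still fresh, then clean up) can be sketched with plain POSIX calls. This is an illustration only: append_to_file() is a hypothetical helper, it reports with fprintf()/strerror() rather than error_errno(), and it assumes the caller hands over ownership of a malloc()ed buffer.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: append `len` bytes to `path` and release
 * `owned_buf` in every case. On failure the message is printed first,
 * while errno still belongs to the failing call; close() and free()
 * run only afterwards because they may overwrite errno.
 */
static int append_to_file(const char *path, char *owned_buf, size_t len)
{
	const char *p = owned_buf;
	size_t left = len;
	int fd = open(path, O_CREAT | O_WRONLY | O_APPEND, 0666);

	if (fd < 0) {
		fprintf(stderr, "could not open '%s' for writing: %s\n",
			path, strerror(errno));
		free(owned_buf);
		return -1;
	}
	while (left) {
		ssize_t n = write(fd, p, left);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0) {
			if (!n)
				errno = ENOSPC;
			fprintf(stderr, "could not write to '%s': %s\n",
				path, strerror(errno));
			close(fd);
			free(owned_buf);
			return -1;
		}
		p += n;
		left -= (size_t)n;
	}
	close(fd);
	free(owned_buf);
	return 0;
}
```

Checking the descriptor right after open() means a missing directory is reported as such instead of surfacing later as EBADF from the write.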

> +
> +		fd = open(rebase_path_todo(), O_WRONLY, 0666);
> +		if (write_in_full(fd, todo_list.buf.buf + offset,
> +				todo_list.buf.len - offset) < 0) {
> +			todo_list_release(&todo_list);
> +			return error_errno(_("could not write to '%s'"),
> +				rebase_path_todo());
> +		}

Ditto here.

-Peff

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v2 7/9] rebase -i: skip unnecessary picks using the rebase--helper
  2017-04-26 10:55     ` Jeff King
@ 2017-04-26 11:31       ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:31 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Philip Oakley

Hi Peff,

On Wed, 26 Apr 2017, Jeff King wrote:

> On Tue, Apr 25, 2017 at 03:52:10PM +0200, Johannes Schindelin wrote:
> 
> > diff --git a/sequencer.c b/sequencer.c
> > index 3a935fa4cbc..bbbc98c9116 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -2616,3 +2616,93 @@ int check_todo_list(void)
> >  
> >  	return res;
> >  }
> > +
> > +/* skip picking commits whose parents are unchanged */
> > +int skip_unnecessary_picks(void)
> 
> Coverity warns of some descriptor leaks in this function (and in
> rearrange_squash). I think you get those emails, so I won't repeat the
> details here.

I do. The leaks in rearrange_squash() seem to be false positives (I will
have to have another look later, I spent way too many hours poring over
those Coverity reports this week).

The leaks in skip_unnecessary_picks() are real, though, and I have a patch
to fix them this way:

-- snip --
Subject: [PATCH] sequencer: plug resource leak when skipping unnecessary picks

The resource leak only happens in case of an error writing or truncating
the file, therefore it seems less critical, but we should still fix it
nonetheless.

Discovered by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index e25a4e1180c..9c765e8850a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2678,11 +2678,13 @@ int skip_unnecessary_picks(void)
 		if (write_in_full(fd, todo_list.buf.buf + offset,
 				todo_list.buf.len - offset) < 0) {
 			todo_list_release(&todo_list);
+			close(fd);
 			return error_errno(_("could not write to '%s'"),
 				rebase_path_todo());
 		}
 		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
 			todo_list_release(&todo_list);
+			close(fd);
 			return error_errno(_("could not truncate '%s'"),
 				rebase_path_todo());
 		}
-- snap --

> But I while looking at them I did notice something it didn't mention:
> 
> > +	if (i > 0) {
> > +		int offset = i < todo_list.nr ?
> > +			todo_list.items[i].offset_in_buf : todo_list.buf.len;
> > +		const char *done_path = rebase_path_done();
> > +
> > +		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
> > +		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
> > +			todo_list_release(&todo_list);
> > +			return error_errno(_("could not write to '%s'"),
> > +				done_path);
> > +		}
> > +		close(fd);
> 
> This should probably check the result of open().

Indeed.

> I know write_in_full() will fail if fd is -1, but we'd rather show the
> user the errno from open(), not EBADF.

I guess Coverity follows that code path and determines that it is handled.

If only it would also follow the code paths of the FLEX_ARRAYs and figure
out that we play games with struct definitions whose last entries are
technically incorrect.
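
For readers unfamiliar with the idiom, the "games" refer to structs whose last member is an array of unspecified size, allocated together with the struct. The sketch below uses the C99 flexible array member directly; it does not show how FLEX_ARRAY or FLEX_ALLOC_MEM are defined, and struct named_entry and named_entry_new() are hypothetical names loosely modeled on subject2item_entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct: the string is stored inline after the fixed part. */
struct named_entry {
	int index;
	char name[];	/* C99 flexible array member */
};

/* Allocate the struct and a copy of `name` in one block. */
static struct named_entry *named_entry_new(int index, const char *name)
{
	size_t len = strlen(name);
	struct named_entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->index = index;
	memcpy(e->name, name, len + 1);	/* includes the trailing NUL */
	return e;
}
```

A single free() releases both the struct and its string. The catch is that the declared type says nothing about how many bytes follow it, which is presumably the kind of thing a static analyzer has trouble following.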

> Technically the free() calls from todo_list_release() can also munge
> errno before you print it. You might want to just call error_errno()
> first, then do the cleanup (including the missing close()).

Ah, you're right. I'll have to rework that patch I mentioned above.

> > +		fd = open(rebase_path_todo(), O_WRONLY, 0666);
> > +		if (write_in_full(fd, todo_list.buf.buf + offset,
> > +				todo_list.buf.len - offset) < 0) {
> > +			todo_list_release(&todo_list);
> > +			return error_errno(_("could not write to '%s'"),
> > +				rebase_path_todo());
> > +		}
> 
> Ditto here.

Right.

Ciao,
Dscho

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* Re: [PATCH v2 1/9] rebase -i: generate the script via rebase--helper
  2017-04-26 10:45     ` Jeff King
@ 2017-04-26 11:34       ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Philip Oakley

Hi Peff,

On Wed, 26 Apr 2017, Jeff King wrote:

> On Tue, Apr 25, 2017 at 03:51:49PM +0200, Johannes Schindelin wrote:
> 
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > [...]
> > +int sequencer_make_script(int keep_empty, FILE *out,
> > +		int argc, const char **argv)
> > +{
> > +	char *format = "%s";
> 
> I'm surprised the compiler doesn't complain about assigning a string
> literal to a non-const pointer. It makes me worried that we would call
> free() on it later. We don't, but that means...
> 
> > +	git_config_get_string("rebase.instructionFormat", &format);
> 
> ...that this assignment to "format" leaks.
> 
> So perhaps you'd want to xstrdup the literal, and then make sure the
> result is freed? Or alternatively use an extra level of indirection (a
> to_free pointer to store the config value, and then a const pointer for
> "format").

Good suggestion. Will be part of v3,
Dscho

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH v3 0/9] The final building block for a faster rebase -i
  2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
                     ` (9 preceding siblings ...)
  2017-04-26  3:32   ` [PATCH v2 0/9] The final building block for a faster rebase -i Junio C Hamano
@ 2017-04-26 11:59   ` Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
                       ` (9 more replies)
  10 siblings, 10 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

This patch series reimplements the expensive pre- and post-processing of
the todo script in C.

And it concludes the work I did to accelerate rebase -i.

Changes since v2:

- rearranged error_errno() calls to come before any subsequent free()
  and close()

- now call close(fd) in case of error to avoid resource leaks

- removed unused `format` variable holding the value of
  rebase.instructionFormat from rearrange_squash()

- modified rearrange_squash() to make it easier to reason about
  subjects[i] taking custody of a strbuf's buffer (this should enable
  Coverity to see that there is no resource leak here)


Johannes Schindelin (9):
  rebase -i: generate the script via rebase--helper
  rebase -i: remove useless indentation
  rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  t3404: relax rebase.missingCommitsCheck tests
  rebase -i: check for missing commits in the rebase--helper
  rebase -i: skip unnecessary picks using the rebase--helper
  t3415: test fixup with wrapped oneline
  rebase -i: rearrange fixup/squash lines using the rebase--helper

 Documentation/git-rebase.txt  |  16 +-
 builtin/rebase--helper.c      |  29 ++-
 git-rebase--interactive.sh    | 362 ++++------------------------
 sequencer.c                   | 531 ++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                   |   8 +
 t/t3404-rebase-interactive.sh |  22 +-
 t/t3415-rebase-autosquash.sh  |  16 +-
 7 files changed, 641 insertions(+), 343 deletions(-)


base-commit: e2cb6ab84c94f147f1259260961513b40c36108a
Based-On: rebase--helper at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v3
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v3

Interdiff vs v2:

 diff --git a/sequencer.c b/sequencer.c
 index 2b07fb9e0ce..7ac1792311e 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -2393,7 +2393,7 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
  int sequencer_make_script(int keep_empty, FILE *out,
  		int argc, const char **argv)
  {
 -	char *format = "%s";
 +	char *format = xstrdup("%s");
  	struct pretty_print_context pp = {0};
  	struct strbuf buf = STRBUF_INIT;
  	struct rev_info revs;
 @@ -2412,6 +2412,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
  	revs.pretty_given = 1;
  	git_config_get_string("rebase.instructionFormat", &format);
  	get_commit_format(format, &revs);
 +	free(format);
  	pp.fmt = revs.commit_format;
  	pp.output_encoding = get_log_output_encoding();
  
 @@ -2676,24 +2677,41 @@ int skip_unnecessary_picks(void)
  		const char *done_path = rebase_path_done();
  
  		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
 +		if (fd < 0) {
 +			error_errno(_("could not open '%s' for writing"),
 +				    done_path);
 +			todo_list_release(&todo_list);
 +			return -1;
 +		}
  		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
 +			error_errno(_("could not write to '%s'"), done_path);
  			todo_list_release(&todo_list);
 -			return error_errno(_("could not write to '%s'"),
 -				done_path);
 +			close(fd);
 +			return -1;
  		}
  		close(fd);
  
  		fd = open(rebase_path_todo(), O_WRONLY, 0666);
 +		if (fd < 0) {
 +			error_errno(_("could not open '%s' for writing"),
 +				    rebase_path_todo());
 +			todo_list_release(&todo_list);
 +			return -1;
 +		}
  		if (write_in_full(fd, todo_list.buf.buf + offset,
  				todo_list.buf.len - offset) < 0) {
 +			error_errno(_("could not write to '%s'"),
 +				    rebase_path_todo());
 +			close(fd);
  			todo_list_release(&todo_list);
 -			return error_errno(_("could not write to '%s'"),
 -				rebase_path_todo());
 +			return -1;
  		}
  		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
 +			error_errno(_("could not truncate '%s'"),
 +				    rebase_path_todo());
  			todo_list_release(&todo_list);
 -			return error_errno(_("could not truncate '%s'"),
 -				rebase_path_todo());
 +			close(fd);
 +			return -1;
  		}
  		close(fd);
  
 @@ -2768,6 +2786,7 @@ int rearrange_squash(void)
  		struct strbuf buf = STRBUF_INIT;
  		struct todo_item *item = todo_list.items + i;
  		const char *commit_buffer, *subject, *p;
 +		size_t subject_len;
  		int i2 = -1;
  		struct subject2item_entry *entry;
  
 @@ -2788,7 +2807,7 @@ int rearrange_squash(void)
  		commit_buffer = get_commit_buffer(item->commit, NULL);
  		find_commit_subject(commit_buffer, &subject);
  		format_subject(&buf, subject, " ");
 -		subject = subjects[i] = buf.buf;
 +		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
  		unuse_commit_buffer(item->commit, commit_buffer);
  		if ((skip_prefix(subject, "fixup! ", &p) ||
  		     skip_prefix(subject, "squash! ", &p))) {
 @@ -2835,19 +2854,16 @@ int rearrange_squash(void)
  			tail[i2] = i;
  		} else if (!hashmap_get_from_hash(&subject2item,
  						strhash(subject), subject)) {
 -			FLEX_ALLOC_MEM(entry, subject, buf.buf, buf.len);
 +			FLEX_ALLOC_MEM(entry, subject, subject, subject_len);
  			entry->i = i;
  			hashmap_entry_init(entry, strhash(entry->subject));
  			hashmap_put(&subject2item, entry);
  		}
 -		strbuf_detach(&buf, NULL);
  	}
  
  	if (rearranged) {
  		struct strbuf buf = STRBUF_INIT;
 -		char *format = NULL;
  
 -		git_config_get_string("rebase.instructionFormat", &format);
  		for (i = 0; i < todo_list.nr; i++) {
  			enum todo_command command = todo_list.items[i].command;
  			int cur = i;

-- 
2.12.2.windows.2.406.gd14a8f8640f


^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-27  4:31       ` Junio C Hamano
  2017-04-28 10:08       ` Phillip Wood
  2017-04-26 11:59     ` [PATCH v3 2/9] rebase -i: remove useless indentation Johannes Schindelin
                       ` (8 subsequent siblings)
  9 siblings, 2 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending on whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  8 +++++++-
 git-rebase--interactive.sh | 44 +++++++++++++++++++++++---------------------
 sequencer.c                | 45 +++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  3 +++
 4 files changed, 78 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index ca1ebb2fa18..821058d452d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
 	struct replay_opts opts = REPLAY_OPTS_INIT;
+	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 				CONTINUE),
 		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
 				ABORT),
+		OPT_CMDMODE(0, "make-script", &command,
+			N_("make rebase script"), MAKE_SCRIPT),
 		OPT_END()
 	};
 
@@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_continue(&opts);
 	if (command == ABORT && argc == 1)
 		return !!sequencer_remove_state(&opts);
+	if (command == MAKE_SCRIPT && argc > 1)
+		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 2c9c0165b5a..609e150d38f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -785,6 +785,7 @@ collapse_todo_ids() {
 # each log message will be re-retrieved in order to normalize the
 # autosquash arrangement
 rearrange_squash () {
+	format=$(git config --get rebase.instructionFormat)
 	# extract fixup!/squash! lines and resolve any referenced sha1's
 	while read -r pick sha1 message
 	do
@@ -1210,26 +1211,27 @@ else
 	revisions=$onto...$orig_head
 	shortrevisions=$shorthead
 fi
-format=$(git config --get rebase.instructionFormat)
-# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
-git rev-list $merges_option --format="%m%H ${format:-%s}" \
-	--reverse --left-right --topo-order \
-	$revisions ${restrict_revision+^$restrict_revision} | \
-	sed -n "s/^>//p" |
-while read -r sha1 rest
-do
-
-	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
-	then
-		comment_out="$comment_char "
-	else
-		comment_out=
-	fi
+if test t != "$preserve_merges"
+then
+	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
+else
+	format=$(git config --get rebase.instructionFormat)
+	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
+	git rev-list $merges_option --format="%m%H ${format:-%s}" \
+		--reverse --left-right --topo-order \
+		$revisions ${restrict_revision+^$restrict_revision} | \
+		sed -n "s/^>//p" |
+	while read -r sha1 rest
+	do
+
+		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
+		then
+			comment_out="$comment_char "
+		else
+			comment_out=
+		fi
 
-	if test t != "$preserve_merges"
-	then
-		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
-	else
 		if test -z "$rebase_root"
 		then
 			preserve=t
@@ -1248,8 +1250,8 @@ do
 			touch "$rewritten"/$sha1
 			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
 		fi
-	fi
-done
+	done
+fi
 
 # Watch for commits that been dropped by --cherry-pick
 if test t = "$preserve_merges"
diff --git a/sequencer.c b/sequencer.c
index 77afecaebf0..e858a976279 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
 
 	strbuf_release(&sob);
 }
+
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv)
+{
+	char *format = xstrdup("%s");
+	struct pretty_print_context pp = {0};
+	struct strbuf buf = STRBUF_INIT;
+	struct rev_info revs;
+	struct commit *commit;
+
+	init_revisions(&revs, NULL);
+	revs.verbose_header = 1;
+	revs.max_parents = 1;
+	revs.cherry_pick = 1;
+	revs.limited = 1;
+	revs.reverse = 1;
+	revs.right_only = 1;
+	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
+	revs.topo_order = 1;
+
+	revs.pretty_given = 1;
+	git_config_get_string("rebase.instructionFormat", &format);
+	get_commit_format(format, &revs);
+	free(format);
+	pp.fmt = revs.commit_format;
+	pp.output_encoding = get_log_output_encoding();
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		return error(_("make_script: unhandled options"));
+
+	if (prepare_revision_walk(&revs) < 0)
+		return error(_("make_script: error preparing revisions"));
+
+	while ((commit = get_revision(&revs))) {
+		strbuf_reset(&buf);
+		if (!keep_empty && is_original_commit_empty(commit))
+			strbuf_addf(&buf, "%c ", comment_line_char);
+		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
+		pretty_print_commit(&pp, commit, &buf);
+		strbuf_addch(&buf, '\n');
+		fputs(buf.buf, out);
+	}
+	strbuf_release(&buf);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index f885b68395f..83f2943b7a9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
 int sequencer_rollback(struct replay_opts *opts);
 int sequencer_remove_state(struct replay_opts *opts);
 
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.406.gd14a8f8640f



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v3 2/9] rebase -i: remove useless indentation
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
                       ` (7 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: when we use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 609e150d38f..c40b1fd1d2e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -155,13 +155,13 @@ reschedule_last_action () {
 append_todo_help () {
 	gettext "
 Commands:
- p, pick = use commit
- r, reword = use commit, but edit the commit message
- e, edit = use commit, but stop for amending
- s, squash = use commit, but meld into previous commit
- f, fixup = like \"squash\", but discard this commit's log message
- x, exec = run command (the rest of the line) using shell
- d, drop = remove commit
+p, pick = use commit
+r, reword = use commit, but edit the commit message
+e, edit = use commit, but stop for amending
+s, squash = use commit, but meld into previous commit
+f, fixup = like \"squash\", but discard this commit's log message
+x, exec = run command (the rest of the line) using shell
+d, drop = remove commit
 
 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
-- 
2.12.2.windows.2.406.gd14a8f8640f



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v3 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 2/9] rebase -i: remove useless indentation Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-26 11:59     ` [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
                       ` (6 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is entirely possible to generate a todo script by means other than
rebase -i and then let rebase -i run with it; in this case, the
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline and, failing to find one, reuses the preceding SHA-1 as
the "oneline".
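
A minimal sketch of the underlying shell behavior, assuming a POSIX
shell: a `${var#pattern}` expansion leaves the value untouched when the
pattern does not match, so a line without any blank hands the ID back as
if it were the oneline. The helper name `strip_id` and the sample values
are made up for illustration, and the pattern is simplified to spaces
only (the real script also accepts tabs).

```shell
# Hypothetical remainders of a todo line: one with a oneline, one without.
with_oneline="deadbeef Fix the frobnicator"
without_oneline="deadbeef"

# Print what is left after stripping everything up to the first space.
strip_id () {
	printf '%s\n' "${1#* }"
}

strip_id "$with_oneline"     # prints the oneline only
strip_id "$without_oneline"  # no space to match: the ID itself comes back
```

Comparing the stripped value against the original, as the fix does, is
one way to detect the "no oneline" case.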

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index c40b1fd1d2e..214af0372ba 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -760,7 +760,12 @@ transform_todo_ids () {
 			;;
 		*)
 			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			rest="$sha1 ${rest#*[	 ]}"
+			if test "a$rest" = "a${rest#*[	 ]}"
+			then
+				rest=$sha1
+			else
+				rest="$sha1 ${rest#*[	 ]}"
+			fi
 			;;
 		esac
 		printf '%s\n' "$command${rest:+ }$rest"
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (2 preceding siblings ...)
  2017-04-26 11:59     ` [PATCH v3 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-27  5:00       ` Junio C Hamano
  2017-04-26 11:59     ` [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).
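
As a rough sketch of the per-line rewrite (not the actual
implementation): each line is split into command, ID and the remaining
oneline, and only the ID is replaced. The helpers `shorten_id` and
`rewrite_line` are hypothetical; in particular, `shorten_id` merely
truncates to 7 characters here, whereas the real code asks Git for a
unique abbreviation.

```shell
# Stand-in for the ID lookup (assumption: plain truncation to 7 characters).
shorten_id () {
	printf '%.7s' "$1"
}

# Rewrite one todo line, keeping the command and the oneline (if any).
rewrite_line () {
	read -r command id rest <<EOF
$1
EOF
	printf '%s\n' "$command $(shorten_id "$id")${rest:+ }$rest"
}

rewrite_line "pick 0123456789abcdef Some subject"
```

Doing this for all lines inside one process is what saves the
per-line spawn.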

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   | 10 +++++++-
 git-rebase--interactive.sh |  4 ++--
 sequencer.c                | 59 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 821058d452d..9444c8d6c60 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 				ABORT),
 		OPT_CMDMODE(0, "make-script", &command,
 			N_("make rebase script"), MAKE_SCRIPT),
+		OPT_CMDMODE(0, "shorten-sha1s", &command,
+			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+		OPT_CMDMODE(0, "expand-sha1s", &command,
+			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_END()
 	};
 
@@ -42,5 +46,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_remove_state(&opts);
 	if (command == MAKE_SCRIPT && argc > 1)
 		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
+	if (command == SHORTEN_SHA1S && argc == 1)
+		return !!transform_todo_ids(1);
+	if (command == EXPAND_SHA1S && argc == 1)
+		return !!transform_todo_ids(0);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 214af0372ba..52a19e0bdb3 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -774,11 +774,11 @@ transform_todo_ids () {
 }
 
 expand_todo_ids() {
-	transform_todo_ids
+	git rebase--helper --expand-sha1s
 }
 
 collapse_todo_ids() {
-	transform_todo_ids --short
+	git rebase--helper --shorten-sha1s
 }
 
 # Rearrange the todo list that has both "pick sha1 msg" and
diff --git a/sequencer.c b/sequencer.c
index e858a976279..4b7f88b338f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2433,3 +2433,62 @@ int sequencer_make_script(int keep_empty, FILE *out,
 	strbuf_release(&buf);
 	return 0;
 }
+
+
+int transform_todo_ids(int shorten_sha1s)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	int fd, res, i;
+	FILE *out;
+
+	strbuf_reset(&todo_list.buf);
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+
+	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
+	if (res) {
+		todo_list_release(&todo_list);
+		return error(_("unusable instruction sheet: '%s'"), todo_file);
+	}
+
+	out = fopen(todo_file, "w");
+	if (!out) {
+		todo_list_release(&todo_list);
+		return error(_("unable to open '%s' for writing"), todo_file);
+	}
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+		int bol = item->offset_in_buf;
+		const char *p = todo_list.buf.buf + bol;
+		int eol = i + 1 < todo_list.nr ?
+			todo_list.items[i + 1].offset_in_buf :
+			todo_list.buf.len;
+
+		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
+			fwrite(p, eol - bol, 1, out);
+		else {
+			int eoc = strcspn(p, " \t");
+			const char *sha1 = shorten_sha1s ?
+				short_commit_name(item->commit) :
+				oid_to_hex(&item->commit->object.oid);
+
+			if (!eoc) {
+				p += strspn(p, " \t");
+				eoc = strcspn(p, " \t");
+			}
+
+			fprintf(out, "%.*s %s %.*s\n",
+				eoc, p, sha1, item->arg_len, item->arg);
+		}
+	}
+	fclose(out);
+	todo_list_release(&todo_list);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 83f2943b7a9..47a81034e76 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
+int transform_todo_ids(int shorten_sha1s);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (3 preceding siblings ...)
  2017-04-26 11:59     ` [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-27  5:05       ` Junio C Hamano
  2017-04-26 11:59     ` [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
                       ` (4 subsequent siblings)
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

These tests were overly strict about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In a following patch, we will reimplement the missing commits check in
the sequencer, with slightly different wording.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.
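
To illustrate the idea outside the test suite, here is a small sketch
using plain `grep` instead of the test helpers. The sample output and
the helper name `mentions` are hypothetical; the point is only that the
check keeps passing when the surrounding wording changes.

```shell
# Hypothetical stderr output whose first line differs from the old wording.
actual_output="warning: unrecognized command in this line:
 - badcmd 0123abc Some subject
You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'."

# Succeed if the output contains a line matching the given pattern.
mentions () {
	printf '%s\n' "$actual_output" | grep -q -- "$1"
}

mentions "badcmd 0123abc" && echo "offending line is reported"
mentions "You can fix this with .git rebase --edit-todo.." &&
	echo "recovery hint is present"
```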

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3404-rebase-interactive.sh | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 33d392ba112..61113be08a4 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1242,20 +1242,13 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' '
 	test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the command isn't recognized in the following line:
- - badcmd $(git rev-list --oneline -1 master~1)
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad command' '
 	rebase_setup_and_clean bad-cmd &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
@@ -1277,20 +1270,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - edit XXXXXXX False commit
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad SHA-1' '
 	rebase_setup_and_clean bad-sha &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "edit XXXXXXX False commit" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (4 preceding siblings ...)
  2017-04-26 11:59     ` [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2017-04-26 11:59     ` Johannes Schindelin
  2017-04-27  5:32       ` Junio C Hamano
  2017-04-26 12:00     ` [PATCH v3 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
                       ` (3 subsequent siblings)
  9 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 11:59 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

In particular on Windows, where shell scripts are even more expensive
than on Mac OS X or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
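
At its core, the missing-commits check is a set difference: IDs present
in the backup of the todo list but absent after editing. A sketch of
that idea in a single awk process, assuming two files with one ID per
line (the function name `missing_ids` and this file layout are
assumptions for illustration, not what the builtin does internally):

```shell
# Print the lines of file $1 (original IDs) that are absent from file $2
# (IDs left after editing).
missing_ids () {
	awk -v edited="$2" '
		FILENAME == edited { seen[$0] = 1; next }  # remember edited IDs
		!($0 in seen)                              # print IDs not seen
	' "$2" "$1"
}
```

Called as `missing_ids todo.backup-ids todo-ids` (hypothetical file
names), it prints every ID that may have been dropped by accident.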

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   7 +-
 git-rebase--interactive.sh | 164 ++-------------------------------------------
 sequencer.c                | 125 ++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 137 insertions(+), 160 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 9444c8d6c60..e706eac710d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+		CHECK_TODO_LIST
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -28,6 +29,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
 		OPT_CMDMODE(0, "expand-sha1s", &command,
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+		OPT_CMDMODE(0, "check-todo-list", &command,
+			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_END()
 	};
 
@@ -50,5 +53,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(1);
 	if (command == EXPAND_SHA1S && argc == 1)
 		return !!transform_todo_ids(0);
+	if (command == CHECK_TODO_LIST && argc == 1)
+		return !!check_todo_list();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 52a19e0bdb3..1649506e1e4 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -890,96 +890,6 @@ add_exec_commands () {
 	mv "$1.new" "$1"
 }
 
-# Check if the SHA-1 passed as an argument is a
-# correct one, if not then print $2 in "$todo".badsha
-# $1: the SHA-1 to test
-# $2: the line number of the input
-# $3: the input filename
-check_commit_sha () {
-	badsha=0
-	if test -z "$1"
-	then
-		badsha=1
-	else
-		sha1_verif="$(git rev-parse --verify --quiet $1^{commit})"
-		if test -z "$sha1_verif"
-		then
-			badsha=1
-		fi
-	fi
-
-	if test $badsha -ne 0
-	then
-		line="$(sed -n -e "${2}p" "$3")"
-		warn "$(eval_gettext "\
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - \$line")"
-		warn
-	fi
-
-	return $badsha
-}
-
-# prints the bad commits and bad commands
-# from the todolist in stdin
-check_bad_cmd_and_sha () {
-	retval=0
-	lineno=0
-	while read -r command rest
-	do
-		lineno=$(( $lineno + 1 ))
-		case $command in
-		"$comment_char"*|''|noop|x|exec)
-			# Doesn't expect a SHA-1
-			;;
-		"$cr")
-			# Work around CR left by "read" (e.g. with Git for
-			# Windows' Bash).
-			;;
-		pick|p|drop|d|reword|r|edit|e|squash|s|fixup|f)
-			if ! check_commit_sha "${rest%%[ 	]*}" "$lineno" "$1"
-			then
-				retval=1
-			fi
-			;;
-		*)
-			line="$(sed -n -e "${lineno}p" "$1")"
-			warn "$(eval_gettext "\
-Warning: the command isn't recognized in the following line:
- - \$line")"
-			warn
-			retval=1
-			;;
-		esac
-	done <"$1"
-	return $retval
-}
-
-# Print the list of the SHA-1 of the commits
-# from stdin to stdout
-todo_list_to_sha_list () {
-	git stripspace --strip-comments |
-	while read -r command sha1 rest
-	do
-		case $command in
-		"$comment_char"*|''|noop|x|"exec")
-			;;
-		*)
-			long_sha=$(git rev-list --no-walk "$sha1" 2>/dev/null)
-			printf "%s\n" "$long_sha"
-			;;
-		esac
-	done
-}
-
-# Use warn for each line in stdin
-warn_lines () {
-	while read -r line
-	do
-		warn " - $line"
-	done
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
 	GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -994,74 +904,6 @@ get_missing_commit_check_level () {
 	printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# Check if the user dropped some commits by mistake
-# Behaviour determined by rebase.missingCommitsCheck.
-# Check if there is an unrecognized command or a
-# bad SHA-1 in a command.
-check_todo_list () {
-	raise_error=f
-
-	check_level=$(get_missing_commit_check_level)
-
-	case "$check_level" in
-	warn|error)
-		# Get the SHA-1 of the commits
-		todo_list_to_sha_list <"$todo".backup >"$todo".oldsha1
-		todo_list_to_sha_list <"$todo" >"$todo".newsha1
-
-		# Sort the SHA-1 and compare them
-		sort -u "$todo".oldsha1 >"$todo".oldsha1+
-		mv "$todo".oldsha1+ "$todo".oldsha1
-		sort -u "$todo".newsha1 >"$todo".newsha1+
-		mv "$todo".newsha1+ "$todo".newsha1
-		comm -2 -3 "$todo".oldsha1 "$todo".newsha1 >"$todo".miss
-
-		# Warn about missing commits
-		if test -s "$todo".miss
-		then
-			test "$check_level" = error && raise_error=t
-
-			warn "$(gettext "\
-Warning: some commits may have been dropped accidentally.
-Dropped commits (newer to older):")"
-
-			# Make the list user-friendly and display
-			opt="--no-walk=sorted --format=oneline --abbrev-commit --stdin"
-			git rev-list $opt <"$todo".miss | warn_lines
-
-			warn "$(gettext "\
-To avoid this message, use \"drop\" to explicitly remove a commit.
-
-Use 'git config rebase.missingCommitsCheck' to change the level of warnings.
-The possible behaviours are: ignore, warn, error.")"
-			warn
-		fi
-		;;
-	ignore)
-		;;
-	*)
-		warn "$(eval_gettext "Unrecognized setting \$check_level for option rebase.missingCommitsCheck. Ignoring.")"
-		;;
-	esac
-
-	if ! check_bad_cmd_and_sha "$todo"
-	then
-		raise_error=t
-	fi
-
-	if test $raise_error = t
-	then
-		# Checkout before the first commit of the
-		# rebase: this way git rebase --continue
-		# will work correctly as it expects HEAD to be
-		# placed before the commit of the next action
-		checkout_onto
-
-		warn "$(gettext "You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.")"
-		die "$(gettext "Or you can abort the rebase with 'git rebase --abort'.")"
-	fi
-}
-
 # The whole contents of this file is run by dot-sourcing it from
 # inside a shell function.  It used to be that "return"s we see
 # below were not inside any function, and expected to return
@@ -1322,7 +1164,11 @@ git_sequence_editor "$todo" ||
 has_action "$todo" ||
 	return 2
 
-check_todo_list
+git rebase--helper --check-todo-list || {
+	ret=$?
+	checkout_onto
+	exit $ret
+}
 
 expand_todo_ids
 
diff --git a/sequencer.c b/sequencer.c
index 4b7f88b338f..fb3915ee39e 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2492,3 +2492,128 @@ int transform_todo_ids(int shorten_sha1s)
 	todo_list_release(&todo_list);
 	return 0;
 }
+
+enum check_level {
+	CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
+};
+
+static enum check_level get_missing_commit_check_level(void)
+{
+	const char *value;
+
+	if (git_config_get_value("rebase.missingcommitscheck", &value) ||
+			!strcasecmp("ignore", value))
+		return CHECK_IGNORE;
+	if (!strcasecmp("warn", value))
+		return CHECK_WARN;
+	if (!strcasecmp("error", value))
+		return CHECK_ERROR;
+	warning(_("unrecognized setting %s for option "
+		  "rebase.missingCommitsCheck. Ignoring."), value);
+	return CHECK_IGNORE;
+}
+
+/*
+ * Check if the user dropped some commits by mistake
+ * Behaviour determined by rebase.missingCommitsCheck.
+ * Check if there is an unrecognized command or a
+ * bad SHA-1 in a command.
+ */
+int check_todo_list(void)
+{
+	enum check_level check_level = get_missing_commit_check_level();
+	struct strbuf todo_file = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct commit_list *missing = NULL;
+	int raise_error = 0, res = 0, fd, i;
+
+	strbuf_addstr(&todo_file, rebase_path_todo());
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	raise_error = res =
+		parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	if (check_level == CHECK_IGNORE)
+		goto leave_check;
+
+	/* Get the SHA-1 of the commits */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit)
+			commit->util = todo_list.items + i;
+	}
+
+	todo_list_release(&todo_list);
+	strbuf_addstr(&todo_file, ".backup");
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	strbuf_release(&todo_file);
+	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	/* Find commits that are missing after editing */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit && !commit->util) {
+			commit_list_insert(commit, &missing);
+			commit->util = todo_list.items + i;
+		}
+	}
+
+	/* Warn about missing commits */
+	if (!missing)
+		goto leave_check;
+
+	if (check_level == CHECK_ERROR)
+		raise_error = res = 1;
+
+	fprintf(stderr,
+		_("Warning: some commits may have been dropped accidentally.\n"
+		"Dropped commits (newer to older):\n"));
+
+	/* Make the list user-friendly and display */
+	while (missing) {
+		struct commit *commit = pop_commit(&missing);
+		struct todo_item *item = commit->util;
+
+		fprintf(stderr, " - %s %.*s\n", short_commit_name(commit),
+			item->arg_len, item->arg);
+	}
+	free_commit_list(missing);
+
+	fprintf(stderr, _("To avoid this message, use \"drop\" to "
+		"explicitly remove a commit.\n\n"
+		"Use 'git config rebase.missingCommitsCheck' to change "
+		"the level of warnings.\n"
+		"The possible behaviours are: ignore, warn, error.\n\n"));
+
+leave_check:
+	strbuf_release(&todo_file);
+	todo_list_release(&todo_list);
+
+	if (raise_error)
+		fprintf(stderr,
+			_("You can fix this with 'git rebase --edit-todo' "
+			  "and then run 'git rebase --continue'.\n"
+			  "Or you can abort the rebase with 'git rebase"
+			  " --abort'.\n"));
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 47a81034e76..4978a61b83b 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -49,6 +49,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
 int transform_todo_ids(int shorten_sha1s);
+int check_todo_list(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 7/9] rebase -i: skip unnecessary picks using the rebase--helper
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (5 preceding siblings ...)
  2017-04-26 11:59     ` [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2017-04-26 12:00     ` Johannes Schindelin
  2017-04-26 12:00     ` [PATCH v3 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 12:00 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

In particular on Windows, where shell scripts are even more expensive
than on Mac OS X or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punted instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.
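
The skipping rule can be sketched as follows, assuming a simplified
input of "<command> <id> <parent-id>" lines (a hypothetical format; the
real code looks up each commit's parent itself). The function name
`advance_onto` is likewise made up.

```shell
# $1: current onto; stdin: "<command> <id> <parent-id>" lines.
# Prints the new onto after skipping leading picks that would be no-ops.
advance_onto () {
	onto=$1
	while read -r command id parent
	do
		# Only a pick whose parent is the current onto can be skipped;
		# a root commit has no parent and therefore stops the loop.
		test "$command" = pick && test "$parent" = "$onto" || break
		onto=$id
	done
	printf '%s\n' "$onto"
}

printf '%s\n' "pick b a" "pick c b" "edit d c" | advance_onto a
```

In this example, the first two picks are skipped and the new onto is
`c`; the `edit` line stops the scan.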

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   6 ++-
 git-rebase--interactive.sh |  41 ++---------------
 sequencer.c                | 107 +++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 116 insertions(+), 39 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index e706eac710d..de3ccd9bfbc 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -31,6 +31,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_CMDMODE(0, "check-todo-list", &command,
 			N_("check the todo list"), CHECK_TODO_LIST),
+		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
+			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
 		OPT_END()
 	};
 
@@ -55,5 +57,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(0);
 	if (command == CHECK_TODO_LIST && argc == 1)
 		return !!check_todo_list();
+	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
+		return !!skip_unnecessary_picks();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 1649506e1e4..931bc09e0cf 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -713,43 +713,6 @@ do_rest () {
 	done
 }
 
-# skip picking commits whose parents are unchanged
-skip_unnecessary_picks () {
-	fd=3
-	while read -r command rest
-	do
-		# fd=3 means we skip the command
-		case "$fd,$command" in
-		3,pick|3,p)
-			# pick a commit whose parent is current $onto -> skip
-			sha1=${rest%% *}
-			case "$(git rev-parse --verify --quiet "$sha1"^)" in
-			"$onto"*)
-				onto=$sha1
-				;;
-			*)
-				fd=1
-				;;
-			esac
-			;;
-		3,"$comment_char"*|3,)
-			# copy comments
-			;;
-		*)
-			fd=1
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest" >&$fd
-	done <"$todo" >"$todo.new" 3>>"$done" &&
-	mv -f "$todo".new "$todo" &&
-	case "$(peek_next_command)" in
-	squash|s|fixup|f)
-		record_in_rewritten "$onto"
-		;;
-	esac ||
-		die "$(gettext "Could not skip unnecessary pick commands")"
-}
-
 transform_todo_ids () {
 	while read -r command rest
 	do
@@ -1172,7 +1135,9 @@ git rebase--helper --check-todo-list || {
 
 expand_todo_ids
 
-test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
+test -d "$rewritten" || test -n "$force_rebase" ||
+onto="$(git rebase--helper --skip-unnecessary-picks)" ||
+die "Could not skip unnecessary pick commands"
 
 checkout_onto
 if test -z "$rebase_root" && test ! -d "$rewritten"
diff --git a/sequencer.c b/sequencer.c
index fb3915ee39e..b51faa0120f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2617,3 +2617,110 @@ int check_todo_list(void)
 
 	return res;
 }
+
+/* skip picking commits whose parents are unchanged */
+int skip_unnecessary_picks(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct strbuf buf = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct object_id onto_oid, *oid = &onto_oid, *parent_oid;
+	int fd, i;
+
+	if (!read_oneliner(&buf, rebase_path_onto(), 0))
+		return error(_("could not read 'onto'"));
+	if (get_sha1(buf.buf, onto_oid.hash)) {
+		strbuf_release(&buf);
+		return error(_("need a HEAD to fixup"));
+	}
+	strbuf_release(&buf);
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0) {
+		return error_errno(_("could not open '%s'"), todo_file);
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+
+		if (item->command >= TODO_NOOP)
+			continue;
+		if (item->command != TODO_PICK)
+			break;
+		if (parse_commit(item->commit)) {
+			todo_list_release(&todo_list);
+			return error(_("could not parse commit '%s'"),
+				oid_to_hex(&item->commit->object.oid));
+		}
+		if (!item->commit->parents)
+			break; /* root commit */
+		if (item->commit->parents->next)
+			break; /* merge commit */
+		parent_oid = &item->commit->parents->item->object.oid;
+		if (hashcmp(parent_oid->hash, oid->hash))
+			break;
+		oid = &item->commit->object.oid;
+	}
+	if (i > 0) {
+		int offset = i < todo_list.nr ?
+			todo_list.items[i].offset_in_buf : todo_list.buf.len;
+		const char *done_path = rebase_path_done();
+
+		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    done_path);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
+			error_errno(_("could not write to '%s'"), done_path);
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		fd = open(rebase_path_todo(), O_WRONLY, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf + offset,
+				todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not write to '%s'"),
+				    rebase_path_todo());
+			close(fd);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not truncate '%s'"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		todo_list.current = i;
+		if (is_fixup(peek_command(&todo_list, 0)))
+			record_in_rewritten(oid, peek_command(&todo_list, 0));
+	}
+
+	todo_list_release(&todo_list);
+	printf("%s\n", oid_to_hex(oid));
+
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 4978a61b83b..28e1fc1e9bb 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -50,6 +50,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
+int skip_unnecessary_picks(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 8/9] t3415: test fixup with wrapped oneline
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (6 preceding siblings ...)
  2017-04-26 12:00     ` [PATCH v3 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2017-04-26 12:00     ` Johannes Schindelin
  2017-04-26 12:00     ` [PATCH v3 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 12:00 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 48346f1cc0c..9fd629a6e21 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -304,4 +304,18 @@ test_expect_success 'extra spaces after fixup!' '
 	test $base = $parent
 '
 
+test_expect_success 'wrapped original subject' '
+	if test -d .git/rebase-merge; then git rebase --abort; fi &&
+	base=$(git rev-parse HEAD) &&
+	echo "wrapped subject" >wrapped &&
+	git add wrapped &&
+	test_tick &&
+	git commit --allow-empty -m "$(printf "To\nfixup")" &&
+	test_tick &&
+	git commit --allow-empty -m "fixup! To fixup" &&
+	git rebase -i --autosquash --keep-empty HEAD~2 &&
+	parent=$(git rev-parse HEAD^) &&
+	test $base = $parent
+'
+
 test_done
-- 
2.12.2.windows.2.406.gd14a8f8640f




* [PATCH v3 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (7 preceding siblings ...)
  2017-04-26 12:00     ` [PATCH v3 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
@ 2017-04-26 12:00     ` Johannes Schindelin
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
  9 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-26 12:00 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case. Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

While at it, clarify how the fixup/squash feature works in `git rebase
-i`'s man page.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-rebase.txt |  16 ++--
 builtin/rebase--helper.c     |   6 +-
 git-rebase--interactive.sh   |  90 +-------------------
 sequencer.c                  | 195 +++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 6 files changed, 212 insertions(+), 98 deletions(-)
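
The matching order and the next[]/tail[] chaining used below can be
sketched in a small standalone program fragment. This is only an
illustration under stated assumptions: it is not the code in this patch,
the names strip_markers(), find_target() and rearrange() are
hypothetical, the hash map lookup is replaced by a plain linear scan for
brevity, and matching by commit hash is left out entirely.

```c
#include <string.h>

#define MAX_ITEMS 64 /* sketch only: callers must pass nr <= MAX_ITEMS */

/* Skip any number of leading "fixup! " / "squash! " markers. */
static const char *strip_markers(const char *s)
{
	for (;;) {
		if (!strncmp(s, "fixup! ", 7))
			s += 7;
		else if (!strncmp(s, "squash! ", 8))
			s += 8;
		else
			return s;
	}
}

/*
 * Return the index of the earlier item that item i should follow,
 * or -1. An exact subject match wins; a prefix match is the fall-back.
 * Items that were themselves moved are never chosen as targets.
 */
static int find_target(const char **subjects, const int *moved, int i)
{
	const char *want = strip_markers(subjects[i]);
	int j;

	if (want == subjects[i] || !*want)
		return -1; /* not a fixup!/squash! commit */
	for (j = 0; j < i; j++)
		if (!moved[j] && !strcmp(subjects[j], want))
			return j;
	for (j = 0; j < i; j++)
		if (!moved[j] && !strncmp(subjects[j], want, strlen(want)))
			return j;
	return -1;
}

/* Fill order[] with the rearranged item indices; return their count. */
static int rearrange(const char **subjects, int nr, int *order)
{
	int next[MAX_ITEMS], tail[MAX_ITEMS], moved[MAX_ITEMS];
	int i, cur, n = 0;

	for (i = 0; i < nr; i++) {
		int t;

		next[i] = tail[i] = -1;
		moved[i] = 0;
		t = find_target(subjects, moved, i);
		if (t < 0)
			continue;
		moved[i] = 1;
		if (next[t] < 0)
			next[t] = i;       /* first item chained after t */
		else
			next[tail[t]] = i; /* append to the end of t's chain */
		tail[t] = i;
	}
	for (i = 0; i < nr; i++) {
		if (moved[i])
			continue; /* emitted as part of its target's chain */
		for (cur = i; cur >= 0; cur = next[cur])
			order[n++] = cur;
	}
	return n;
}
```

Keeping tail[] next to next[] is what makes appending to a chain a
constant-time operation; only the prefix fall-back scans earlier items.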

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 67d48e68831..da79fbda5b3 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -425,13 +425,15 @@ without an explicit `--interactive`.
 --autosquash::
 --no-autosquash::
 	When the commit log message begins with "squash! ..." (or
-	"fixup! ..."), and there is a commit whose title begins with
-	the same ..., automatically modify the todo list of rebase -i
-	so that the commit marked for squashing comes right after the
-	commit to be modified, and change the action of the moved
-	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
-	"fixup! " or "squash! " after the first, in case you referred to an
-	earlier fixup/squash with `git commit --fixup/--squash`.
+	"fixup! ..."), and there is already a commit in the todo list that
+	matches the same `...`, automatically modify the todo list of rebase
+	-i so that the commit marked for squashing comes right after the
+	commit to be modified, and change the action of the moved commit
+	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
+	the commit subject matches, or if the `...` refers to the commit's
+	hash. As a fall-back, partial matches of the commit subject work,
+	too.  The recommended way to create fixup/squash commits is by using
+	the `--fixup`/`--squash` options of linkgit:git-commit[1].
 +
 This option is only valid when the `--interactive` option is used.
 +
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index de3ccd9bfbc..e6591f01112 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
 			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+		OPT_CMDMODE(0, "rearrange-squash", &command,
+			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
 		OPT_END()
 	};
 
@@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!check_todo_list();
 	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
 		return !!skip_unnecessary_picks();
+	if (command == REARRANGE_SQUASH && argc == 1)
+		return !!rearrange_squash();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 931bc09e0cf..d39fe4f5fb7 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -744,94 +744,6 @@ collapse_todo_ids() {
 	git rebase--helper --shorten-sha1s
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-	format=$(git config --get rebase.instructionFormat)
-	# extract fixup!/squash! lines and resolve any referenced sha1's
-	while read -r pick sha1 message
-	do
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		case "$message" in
-		"squash! "*|"fixup! "*)
-			action="${message%%!*}"
-			rest=$message
-			prefix=
-			# skip all squash! or fixup! (but save for later)
-			while :
-			do
-				case "$rest" in
-				"squash! "*|"fixup! "*)
-					prefix="$prefix${rest%%!*},"
-					rest="${rest#*! }"
-					;;
-				*)
-					break
-					;;
-				esac
-			done
-			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
-			# if it's a single word, try to resolve to a full sha1 and
-			# emit a second copy. This allows us to match on both message
-			# and on sha1 prefix
-			if test "${rest#* }" = "$rest"; then
-				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
-				if test -n "$fullsha"; then
-					# prefix the action to uniquely identify this line as
-					# intended for full sha1 match
-					echo "$sha1 +$action $prefix $fullsha"
-				fi
-			fi
-		esac
-	done >"$1.sq" <"$1"
-	test -s "$1.sq" || return
-
-	used=
-	while read -r pick sha1 message
-	do
-		case " $used" in
-		*" $sha1 "*) continue ;;
-		esac
-		printf '%s\n' "$pick $sha1 $message"
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		used="$used$sha1 "
-		while read -r squash action msg_prefix msg_content
-		do
-			case " $used" in
-			*" $squash "*) continue ;;
-			esac
-			emit=0
-			case "$action" in
-			+*)
-				action="${action#+}"
-				# full sha1 prefix test
-				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
-			*)
-				# message prefix test
-				case "$message" in "$msg_content"*) emit=1;; esac ;;
-			esac
-			if test $emit = 1; then
-				if test -n "${format}"
-				then
-					msg_content=$(git log -n 1 --format="${format}" ${squash})
-				else
-					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
-				fi
-				printf '%s\n' "$action $squash $msg_content"
-				used="$used$squash "
-			fi
-		done <"$1.sq"
-	done >"$1.rearranged" <"$1"
-	cat "$1.rearranged" >"$1"
-	rm -f "$1.sq" "$1.rearranged"
-}
-
 # Add commands after a pick or after a squash/fixup serie
 # in the todo list.
 add_exec_commands () {
@@ -1091,7 +1003,7 @@ then
 fi
 
 test -s "$todo" || echo noop >> "$todo"
-test -n "$autosquash" && rearrange_squash "$todo"
+test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
 test -n "$cmd" && add_exec_commands "$todo"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
diff --git a/sequencer.c b/sequencer.c
index b51faa0120f..7ac1792311e 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -19,6 +19,7 @@
 #include "trailer.h"
 #include "log-tree.h"
 #include "wt-status.h"
+#include "hashmap.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2724,3 +2725,197 @@ int skip_unnecessary_picks(void)
 
 	return 0;
 }
+
+struct subject2item_entry {
+	struct hashmap_entry entry;
+	int i;
+	char subject[FLEX_ARRAY];
+};
+
+static int subject2item_cmp(const struct subject2item_entry *a,
+	const struct subject2item_entry *b, const void *key)
+{
+	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
+}
+
+/*
+ * Rearrange the todo list that has both "pick sha1 msg" and "pick sha1
+ * fixup!/squash! msg" in it so that the latter is put immediately after the
+ * former, and change "pick" to "fixup"/"squash".
+ *
+ * Note that if the config has specified a custom instruction format, each log
+ * message will have to be retrieved from the commit (as the oneline in the
+ * script cannot be trusted) in order to normalize the autosquash arrangement.
+ */
+int rearrange_squash(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct hashmap subject2item;
+	int res = 0, rearranged = 0, *next, *tail, fd, i;
+	char **subjects;
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	/*
+	 * The hashmap maps onelines to the respective todo list index.
+	 *
+	 * If any items need to be rearranged, the next[i] value will indicate
+	 * which item was moved directly after the i'th.
+	 *
+	 * In that case, tail[i] will indicate the index of the latest item to
+	 * be moved to appear after the i'th.
+	 */
+	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
+		     todo_list.nr);
+	ALLOC_ARRAY(next, todo_list.nr);
+	ALLOC_ARRAY(tail, todo_list.nr);
+	ALLOC_ARRAY(subjects, todo_list.nr);
+	for (i = 0; i < todo_list.nr; i++) {
+		struct strbuf buf = STRBUF_INIT;
+		struct todo_item *item = todo_list.items + i;
+		const char *commit_buffer, *subject, *p;
+		size_t subject_len;
+		int i2 = -1;
+		struct subject2item_entry *entry;
+
+		next[i] = tail[i] = -1;
+		if (item->command >= TODO_EXEC) {
+			subjects[i] = NULL;
+			continue;
+		}
+
+		if (is_fixup(item->command)) {
+			todo_list_release(&todo_list);
+			return error(_("the script was already rearranged."));
+		}
+
+		item->commit->util = item;
+
+		parse_commit(item->commit);
+		commit_buffer = get_commit_buffer(item->commit, NULL);
+		find_commit_subject(commit_buffer, &subject);
+		format_subject(&buf, subject, " ");
+		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
+		unuse_commit_buffer(item->commit, commit_buffer);
+		if ((skip_prefix(subject, "fixup! ", &p) ||
+		     skip_prefix(subject, "squash! ", &p))) {
+			struct commit *commit2;
+
+			for (;;) {
+				while (isspace(*p))
+					p++;
+				if (!skip_prefix(p, "fixup! ", &p) &&
+				    !skip_prefix(p, "squash! ", &p))
+					break;
+			}
+
+			if ((entry = hashmap_get_from_hash(&subject2item,
+							   strhash(p), p)))
+				/* found by title */
+				i2 = entry->i;
+			else if (!strchr(p, ' ') &&
+				 (commit2 =
+				  lookup_commit_reference_by_name(p)) &&
+				 commit2->util)
+				/* found by commit name */
+				i2 = (struct todo_item *)commit2->util
+					- todo_list.items;
+			else {
+				/* copy can be a prefix of the commit subject */
+				for (i2 = 0; i2 < i; i2++)
+					if (subjects[i2] &&
+					    starts_with(subjects[i2], p))
+						break;
+				if (i2 == i)
+					i2 = -1;
+			}
+		}
+		if (i2 >= 0) {
+			rearranged = 1;
+			todo_list.items[i].command =
+				starts_with(subject, "fixup!") ?
+				TODO_FIXUP : TODO_SQUASH;
+			if (next[i2] < 0)
+				next[i2] = i;
+			else
+				next[tail[i2]] = i;
+			tail[i2] = i;
+		} else if (!hashmap_get_from_hash(&subject2item,
+						strhash(subject), subject)) {
+			FLEX_ALLOC_MEM(entry, subject, subject, subject_len);
+			entry->i = i;
+			hashmap_entry_init(entry, strhash(entry->subject));
+			hashmap_put(&subject2item, entry);
+		}
+	}
+
+	if (rearranged) {
+		struct strbuf buf = STRBUF_INIT;
+
+		for (i = 0; i < todo_list.nr; i++) {
+			enum todo_command command = todo_list.items[i].command;
+			int cur = i;
+
+			/*
+			 * Initially, all commands are 'pick's. If it is a
+			 * fixup or a squash now, we have rearranged it.
+			 */
+			if (is_fixup(command))
+				continue;
+
+			while (cur >= 0) {
+				int offset = todo_list.items[cur].offset_in_buf;
+				int end_offset = cur + 1 < todo_list.nr ?
+					todo_list.items[cur + 1].offset_in_buf :
+					todo_list.buf.len;
+				char *bol = todo_list.buf.buf + offset;
+				char *eol = todo_list.buf.buf + end_offset;
+
+				/* replace 'pick', by 'fixup' or 'squash' */
+				command = todo_list.items[cur].command;
+				if (is_fixup(command)) {
+					strbuf_addstr(&buf,
+						todo_command_info[command].str);
+					bol += strcspn(bol, " \t");
+				}
+
+				strbuf_add(&buf, bol, eol - bol);
+
+				cur = next[cur];
+			}
+		}
+
+		fd = open(todo_file, O_WRONLY);
+		if (fd < 0)
+			res = error_errno(_("could not open '%s'"), todo_file);
+		else if (write(fd, buf.buf, buf.len) < 0)
+			res = error_errno(_("could not write '%s'."), todo_file);
+		else if (ftruncate(fd, buf.len) < 0)
+			res = error_errno(_("could not finish '%s'"),
+					   todo_file);
+		close(fd);
+		strbuf_release(&buf);
+	}
+
+	free(next);
+	free(tail);
+	for (i = 0; i < todo_list.nr; i++)
+		free(subjects[i]);
+	free(subjects);
+	hashmap_free(&subject2item, 1);
+	todo_list_release(&todo_list);
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 28e1fc1e9bb..1c94bec7622 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -51,6 +51,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
+int rearrange_squash(void);
 
 extern const char sign_off_header[];
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 9fd629a6e21..b9e26008a79 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -278,7 +278,7 @@ set_backup_editor () {
 	test_set_editor "$PWD/backup-editor.sh"
 }
 
-test_expect_failure 'autosquash with multiple empty patches' '
+test_expect_success 'autosquash with multiple empty patches' '
 	test_tick &&
 	git commit --allow-empty -m "empty" &&
 	test_tick &&
-- 
2.12.2.windows.2.406.gd14a8f8640f


* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-04-27  4:31       ` Junio C Hamano
  2017-04-27 14:18         ` Johannes Schindelin
  2017-04-28 10:08       ` Phillip Wood
  1 sibling, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-04-27  4:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/sequencer.c b/sequencer.c
> index 77afecaebf0..e858a976279 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>  
>  	strbuf_release(&sob);
>  }
> +
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = xstrdup("%s");
> +	struct pretty_print_context pp = {0};
> +	struct strbuf buf = STRBUF_INIT;
> +	struct rev_info revs;
> +	struct commit *commit;
> +
> +	init_revisions(&revs, NULL);
> +	revs.verbose_header = 1;
> +	revs.max_parents = 1;
> +	revs.cherry_pick = 1;
> +	revs.limited = 1;
> +	revs.reverse = 1;
> +	revs.right_only = 1;
> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +	revs.topo_order = 1;
> +
> +	revs.pretty_given = 1;
> +	git_config_get_string("rebase.instructionFormat", &format);
> +	get_commit_format(format, &revs);
> +	free(format);
> +	pp.fmt = revs.commit_format;
> +	pp.output_encoding = get_log_output_encoding();

All of the above feels like inviting unnecessary future breakages by
knowing too much about the implementation the current version of
revision.c happens to use.  A more careful implementation would be
to allocate our own av[] and prepare "--reverse", "--left-right",
"--cherry-pick", etc. to be parsed by setup_revisions() call we see
below.  The parsing is not an expensive part of the operation
anyway, and that way we have one less thing to worry about.
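
As a rough illustration of that suggestion, the argument vector could be
assembled along these lines. This is a standalone sketch, not git code:
prepend_fixed_opts() is a hypothetical helper, and the option strings
were picked to mirror the fields set in the quoted code; whether they
are an exact equivalent is an assumption. The result would then be
handed to the existing option parser instead of poking at the struct.

```c
#include <stdlib.h>

/* Options the script generator always wants, spelled as on the command line. */
static const char *const fixed_opts[] = {
	"--reverse", "--right-only", "--cherry-pick", "--topo-order",
};
#define NR_FIXED (sizeof(fixed_opts) / sizeof(fixed_opts[0]))

/*
 * Build a NULL-terminated vector: argv[0], the fixed options, then the
 * caller's remaining arguments. Assumes argc >= 0. The caller frees the
 * returned array (the strings themselves are not copied).
 */
const char **prepend_fixed_opts(int argc, const char **argv, int *new_argc)
{
	size_t n = 0, i;
	const char **av = malloc((argc + NR_FIXED + 2) * sizeof(*av));

	if (!av)
		return NULL;
	av[n++] = argc > 0 ? argv[0] : "make-script";
	for (i = 0; i < NR_FIXED; i++)
		av[n++] = fixed_opts[i];
	for (i = 1; i < (size_t)argc; i++)
		av[n++] = argv[i];
	av[n] = NULL;
	*new_argc = (int)n;
	return av;
}
```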

> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> +		return error(_("make_script: unhandled options"));
> +
> +	if (prepare_revision_walk(&revs) < 0)
> +		return error(_("make_script: error preparing revisions"));
> +
> +	while ((commit = get_revision(&revs))) {
> +		strbuf_reset(&buf);
> +		if (!keep_empty && is_original_commit_empty(commit))
> +			strbuf_addf(&buf, "%c ", comment_line_char);

Presumably callers of this function (which does not exist yet at
this step) are expected to have done the configuration dance to
prepare comment_line_char to whatever the end-user specified?

> +		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
> +		pretty_print_commit(&pp, commit, &buf);
> +		strbuf_addch(&buf, '\n');
> +		fputs(buf.buf, out);
> +	}
> +	strbuf_release(&buf);
> +	return 0;
> +}

Other than that, this looks reasonable.

> diff --git a/sequencer.h b/sequencer.h
> index f885b68395f..83f2943b7a9 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
>  int sequencer_rollback(struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);


* Re: [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-26 11:59     ` [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-04-27  5:00       ` Junio C Hamano
  2017-04-27  6:47         ` Junio C Hamano
  2017-04-27 21:44         ` Johannes Schindelin
  0 siblings, 2 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-04-27  5:00 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 214af0372ba..52a19e0bdb3 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -774,11 +774,11 @@ transform_todo_ids () {
>  }
>  
>  expand_todo_ids() {
> -	transform_todo_ids
> +	git rebase--helper --expand-sha1s
>  }
>  
>  collapse_todo_ids() {
> -	transform_todo_ids --short
> +	git rebase--helper --shorten-sha1s
>  }

Obviously correct ;-)  But doesn't this make transform_todo_ids ()
helper unused and removable?

> +int transform_todo_ids(int shorten_sha1s)
> +{
> +	const char *todo_file = rebase_path_todo();
> +	struct todo_list todo_list = TODO_LIST_INIT;
> +	int fd, res, i;
> +	FILE *out;
> +
> +	strbuf_reset(&todo_list.buf);
> +	fd = open(todo_file, O_RDONLY);
> +	if (fd < 0)
> +		return error_errno(_("could not open '%s'"), todo_file);
> +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> +		close(fd);
> +		return error(_("could not read '%s'."), todo_file);
> +	}
> +	close(fd);
> +
> +	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
> +	if (res) {
> +		todo_list_release(&todo_list);
> +		return error(_("unusable instruction sheet: '%s'"), todo_file);
> +	}
> +
> +	out = fopen(todo_file, "w");

The usual "open lockfile, write to it and then rename" dance is not
necessary for the purpose of preventing other people from reading
this file while we are writing to it.  But if we fail inside this
function before we fclose(3) "out", the user will lose the todo
list.  It probably is not a big deal, though.
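
For reference, the "write elsewhere, then rename" pattern mentioned
above looks roughly like this in plain C. It is a sketch only: it does
not use git's lockfile API, replace_file_contents() and the ".new"
suffix are hypothetical, and it relies on rename() replacing an existing
file, which holds on POSIX systems but not necessarily elsewhere.

```c
#include <stdio.h>

/*
 * Write data to "<path>.new" and rename it over path only after the
 * write succeeded, so a failure midway leaves the old file intact.
 * Returns 0 on success, -1 on error.
 */
int replace_file_contents(const char *path, const char *data, size_t len)
{
	char tmp[4096];
	FILE *out;
	int n = snprintf(tmp, sizeof(tmp), "%s.new", path);

	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1; /* path too long for this sketch */
	out = fopen(tmp, "w");
	if (!out)
		return -1;
	if (fwrite(data, 1, len, out) != len) {
		fclose(out);
		remove(tmp);
		return -1;
	}
	if (fclose(out) != 0 || rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```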

> +	if (!out) {
> +		todo_list_release(&todo_list);
> +		return error(_("unable to open '%s' for writing"), todo_file);
> +	}
> +	for (i = 0; i < todo_list.nr; i++) {
> +		struct todo_item *item = todo_list.items + i;
> +		int bol = item->offset_in_buf;
> +		const char *p = todo_list.buf.buf + bol;
> +		int eol = i + 1 < todo_list.nr ?
> +			todo_list.items[i + 1].offset_in_buf :
> +			todo_list.buf.len;
> +
> +		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
> +			fwrite(p, eol - bol, 1, out);
> +		else {
> +			int eoc = strcspn(p, " \t");
> +			const char *sha1 = shorten_sha1s ?
> +				short_commit_name(item->commit) :
> +				oid_to_hex(&item->commit->object.oid);
> +
> +			if (!eoc) {
> +				p += strspn(p, " \t");
> +				eoc = strcspn(p, " \t");
> +			}

It would be much easier to follow the logic if "int eoc" above were
a mere declaration without initialization and the "skip to the
whitespace" step were done immediately before this if() statement.
It's not as if the initialized value of eoc is needed there because
it participates in the computation of sha1, and having the
assignment followed by the "oops, the line begins with whitespace"
recovery done here is confusing.

Wouldn't it be simpler to do:

	else {
		int eoc;
		const char *sha1 = ...
		p += strspn(p, " \t"); /* skip optional indent */
		eoc = strcspn(p, " \t"); /* grab the command word */

without conditional?
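
A standalone version of that unconditional form, for anyone who wants to
try it outside of git (command_word() is a hypothetical name, not
something in the patch):

```c
#include <string.h>

/*
 * Skip an optional indent, point *word at the command word and return
 * the word's length. No special case is needed for unindented lines,
 * because strspn() simply returns 0 for them.
 */
size_t command_word(const char *line, const char **word)
{
	line += strspn(line, " \t"); /* skip optional indent */
	*word = line;
	return strcspn(line, " \t"); /* length of the command word */
}
```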

> +			fprintf(out, "%.*s %s %.*s\n",
> +				eoc, p, sha1, item->arg_len, item->arg);
> +		}
> +	}
> +	fclose(out);
> +	todo_list_release(&todo_list);
> +	return 0;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index 83f2943b7a9..47a81034e76 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
>  int sequencer_make_script(int keep_empty, FILE *out,
>  		int argc, const char **argv);
>  
> +int transform_todo_ids(int shorten_sha1s);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);


* Re: [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests
  2017-04-26 11:59     ` [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2017-04-27  5:05       ` Junio C Hamano
  2017-04-27 22:01         ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-04-27  5:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> These tests were a bit anal about the *exact* warning/error message
> printed by git rebase. But those messages are intended for the *end
> user*, therefore it does not make sense to test so rigidly for the
> *exact* wording.
>
> In the following, we will reimplement the missing commits check in
> the sequencer, with slightly different words.

Up to this point I thought your strategy was to mimic the original
as closely as possible, and changes to update (and/or improve) the
end user experience (like rewording messages) are left as a separate
step.  Changes to the test can be used to demonstrate the improved
end-user experience that way, and I found it a good way to structure
your series.  Is there a technical reason why you deviate from that
pattern here?  

Just being curious, as I do not particularly mind seeing things done
differently (especially if there is a good reason).

> -cat >expect <<EOF
> -Warning: the command isn't recognized in the following line:
> - - badcmd $(git rev-list --oneline -1 master~1)
> -
> -You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
> -Or you can abort the rebase with 'git rebase --abort'.
> -EOF
> -
>  test_expect_success 'static check of bad command' '
>  	rebase_setup_and_clean bad-cmd &&
>  	set_fake_editor &&
>  	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
>  		git rebase -i --root 2>actual &&
> -	test_i18ncmp expect actual &&
> +	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
> +	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
>  	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
>  	git rebase --continue &&
>  	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
> @@ -1277,20 +1270,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
>  	test E = $(git cat-file commit HEAD | sed -ne \$p)
>  '
>  
> -cat >expect <<EOF
> -Warning: the SHA-1 is missing or isn't a commit in the following line:
> - - edit XXXXXXX False commit
> -
> -You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
> -Or you can abort the rebase with 'git rebase --abort'.
> -EOF
> -
>  test_expect_success 'static check of bad SHA-1' '
>  	rebase_setup_and_clean bad-sha &&
>  	set_fake_editor &&
>  	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
>  		git rebase -i --root 2>actual &&
> -	test_i18ncmp expect actual &&
> +	test_i18ngrep "edit XXXXXXX False commit" actual &&
> +	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
>  	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
>  	git rebase --continue &&
>  	test E = $(git cat-file commit HEAD | sed -ne \$p)


* Re: [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper
  2017-04-26 11:59     ` [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2017-04-27  5:32       ` Junio C Hamano
  2017-04-28 15:10         ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-04-27  5:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> -check_todo_list
> +git rebase--helper --check-todo-list || {
> +	ret=$?
> +	checkout_onto
> +	exit $ret
> +}

I find this a better division of labor between "check_todo_list" and
its caller.  Compared to the original that did the "recover and exit
with failure" inside the helper, this is much easier to see what is
going on.

> +/*
> + * Check if the user dropped some commits by mistake
> + * Behaviour determined by rebase.missingCommitsCheck.
> + * Check if there is an unrecognized command or a
> + * bad SHA-1 in a command.
> + */
> +int check_todo_list(void)
> +{
> +	enum check_level check_level = get_missing_commit_check_level();
> +	struct strbuf todo_file = STRBUF_INIT;
> +	struct todo_list todo_list = TODO_LIST_INIT;
> +	struct commit_list *missing = NULL;
> +	int raise_error = 0, res = 0, fd, i;
> +
> +	strbuf_addstr(&todo_file, rebase_path_todo());
> +	fd = open(todo_file.buf, O_RDONLY);
> +	if (fd < 0) {
> +		res = error_errno(_("could not open '%s'"), todo_file.buf);
> +		goto leave_check;
> +	}
> +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> +		close(fd);
> +		res = error(_("could not read '%s'."), todo_file.buf);
> +		goto leave_check;
> +	}
> +	close(fd);
> +	raise_error = res =
> +		parse_insn_buffer(todo_list.buf.buf, &todo_list);
> +
> +	if (check_level == CHECK_IGNORE)
> +		goto leave_check;

OK, so even if it is set to ignore, an unreadable todo list will be
shown with a loud error message that tells the user to use --edit-todo.

What should happen when it is not set to ignore and we found the
todo list unacceptable, I wonder?  Let's read on.

> +	/* Get the SHA-1 of the commits */
> +	for (i = 0; i < todo_list.nr; i++) {
> +		struct commit *commit = todo_list.items[i].commit;
> +		if (commit)
> +			commit->util = todo_list.items + i;
> +	}

It does not look like this loop is "Get(ting) the SHA-1 of the commits"
to me, though.  It is setting up ->util to be usable as a back-pointer
into the list.

> +
> +	todo_list_release(&todo_list);

But then the todo-list is released?  The util fields we have set, if
any, in the previous loop are now dangling, no?

> +	strbuf_addstr(&todo_file, ".backup");
> +	fd = open(todo_file.buf, O_RDONLY);
> +	if (fd < 0) {
> +		res = error_errno(_("could not open '%s'"), todo_file.buf);
> +		goto leave_check;
> +	}
> +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> +		close(fd);
> +		res = error(_("could not read '%s'."), todo_file.buf);
> +		goto leave_check;
> +	}
> +	close(fd);
> +	strbuf_release(&todo_file);
> +	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);

Then we read from .backup; failure to do so does not result in the
"you need to --edit-todo" warning.

> +	/* Find commits that are missing after editing */
> +	for (i = 0; i < todo_list.nr; i++) {
> +		struct commit *commit = todo_list.items[i].commit;
> +		if (commit && !commit->util) {
> +			commit_list_insert(commit, &missing);
> +			commit->util = todo_list.items + i;
> +		}
> +	}

And we check the commits mentioned in the backup; any commit whose
util is not marked in the previous loop is noticed and thrown into
the missing list.

The loop we later have does "while (missing)" and does not look at
commit->util for commits that are *not* missing, i.e. the ones that
are marked in the previous loop, so it does not matter that their
util fields hold dangling pointers.  In that sense, it may not be
buggy, but it is misleading.  The only thing these two loops care
about is that the commits found in the earlier loop get their util
field set to non-NULL, so instead of using "todo_list.items+i",
perhaps doing this

	if (commit)
		commit->util = (void *)1; /* mark as seen */

in the earlier loop instead would be much less confusing.
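
The two-pass idea with a plain "seen" mark can be sketched outside of
git like this. Everything here is hypothetical and for illustration
only: struct commit_stub stands in for the commit object, lookup() for
the object lookup, and the dropped entries are reported in backup order
rather than newer-to-older as the patch does.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a commit object: an id plus a "seen" mark. */
struct commit_stub {
	const char *id;
	int seen;
};

static struct commit_stub *lookup(struct commit_stub *pool, size_t nr,
				  const char *id)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(pool[i].id, id))
			return pool + i;
	return NULL;
}

/*
 * Pass 1 marks every commit still present in the edited list.
 * Pass 2 walks the backup list and collects the unmarked ones.
 * dropped[] must have room for nr_backup entries.
 */
size_t find_dropped(struct commit_stub *pool, size_t nr_pool,
		    const char **edited, size_t nr_edited,
		    const char **backup, size_t nr_backup,
		    const char **dropped)
{
	size_t i, n = 0;

	for (i = 0; i < nr_edited; i++) {
		struct commit_stub *c = lookup(pool, nr_pool, edited[i]);

		if (c)
			c->seen = 1; /* mark as seen */
	}
	for (i = 0; i < nr_backup; i++) {
		struct commit_stub *c = lookup(pool, nr_pool, backup[i]);

		if (c && !c->seen) {
			dropped[n++] = c->id;
			c->seen = 1; /* report each dropped commit once */
		}
	}
	return n;
}
```

Nothing in the first pass points back into the edited list, so releasing
that list before the second pass leaves no dangling pointers behind.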

> +	/* Warn about missing commits */
> +	if (!missing)
> +		goto leave_check;

If there is no missing one, we may still return error about
unacceptable backup file.  But if we read backup fine and didn't
find anything missing, we'll return silently and with success.  OK.

> +	if (check_level == CHECK_ERROR)
> +		raise_error = res = 1;

Otherwise, we found missing ones and we want to report here.

The reason why I started reading this function aloud was because I
found two variables (raise_error and res) somewhat confusing.  I
think what the code does makes sense, but I still find the way the
code expresses the logic with these two variables confusing.
Hopefully somebody else can offer possible improvements, as I cannot
offhand think of a better way than what is currently in this patch
myself.

> +	fprintf(stderr,
> +		_("Warning: some commits may have been dropped accidentally.\n"
> +		"Dropped commits (newer to older):\n"));
> +
> +	/* Make the list user-friendly and display */
> +	while (missing) {
> +		struct commit *commit = pop_commit(&missing);
> +		struct todo_item *item = commit->util;
> +
> +		fprintf(stderr, " - %s %.*s\n", short_commit_name(commit),
> +			item->arg_len, item->arg);
> +	}
> +	free_commit_list(missing);
> +
> +	fprintf(stderr, _("To avoid this message, use \"drop\" to "
> +		"explicitly remove a commit.\n\n"
> +		"Use 'git config rebase.missingCommitsCheck' to change "
> +		"the level of warnings.\n"
> +		"The possible behaviours are: ignore, warn, error.\n\n"));
> +
> +leave_check:
> +	strbuf_release(&todo_file);
> +	todo_list_release(&todo_list);
> +
> +	if (raise_error)
> +		fprintf(stderr,
> +			_("You can fix this with 'git rebase --edit-todo' "
> +			  "and then run 'git rebase --continue'.\n"
> +			  "Or you can abort the rebase with 'git rebase"
> +			  " --abort'.\n"));
> +
> +	return res;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index 47a81034e76..4978a61b83b 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -49,6 +49,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
>  		int argc, const char **argv);
>  
>  int transform_todo_ids(int shorten_sha1s);
> +int check_todo_list(void);
>  
>  extern const char sign_off_header[];


* Re: [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-27  5:00       ` Junio C Hamano
@ 2017-04-27  6:47         ` Junio C Hamano
  2017-04-27 21:44         ` Johannes Schindelin
  1 sibling, 0 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-04-27  6:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>
>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
>> index 214af0372ba..52a19e0bdb3 100644
>> --- a/git-rebase--interactive.sh
>> +++ b/git-rebase--interactive.sh
>> @@ -774,11 +774,11 @@ transform_todo_ids () {
>>  }
>>  
>>  expand_todo_ids() {
>> -	transform_todo_ids
>> +	git rebase--helper --expand-sha1s
>>  }
>>  
>>  collapse_todo_ids() {
>> -	transform_todo_ids --short
>> +	git rebase--helper --shorten-sha1s
>>  }
>
> Obviously correct ;-)  But doesn't this make transform_todo_ids ()
> helper unused and removable?

Ehh, in case it was not clear, I meant the helper function in shell,
not the one you added below to C code.
>
>> +int transform_todo_ids(int shorten_sha1s)
>> +{



* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-27  4:31       ` Junio C Hamano
@ 2017-04-27 14:18         ` Johannes Schindelin
  2017-04-28  0:13           ` Junio C Hamano
  0 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-27 14:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Wed, 26 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > diff --git a/sequencer.c b/sequencer.c
> > index 77afecaebf0..e858a976279 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
> >  
> >  	strbuf_release(&sob);
> >  }
> > +
> > +int sequencer_make_script(int keep_empty, FILE *out,
> > +		int argc, const char **argv)
> > +{
> > +	char *format = xstrdup("%s");
> > +	struct pretty_print_context pp = {0};
> > +	struct strbuf buf = STRBUF_INIT;
> > +	struct rev_info revs;
> > +	struct commit *commit;
> > +
> > +	init_revisions(&revs, NULL);
> > +	revs.verbose_header = 1;
> > +	revs.max_parents = 1;
> > +	revs.cherry_pick = 1;
> > +	revs.limited = 1;
> > +	revs.reverse = 1;
> > +	revs.right_only = 1;
> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> > +	revs.topo_order = 1;
> > +
> > +	revs.pretty_given = 1;
> > +	git_config_get_string("rebase.instructionFormat", &format);
> > +	get_commit_format(format, &revs);
> > +	free(format);
> > +	pp.fmt = revs.commit_format;
> > +	pp.output_encoding = get_log_output_encoding();
> 
> All of the above feels like inviting unnecessary future breakages by
> knowing too much about the implementation the current version of
> revision.c happens to use.

You mean that the `--reverse` option gets translated into the `reverse`
bit, and the other settings?

:-)

> A more careful implementation would be to allocate our own av[] and
> prepare "--reverse", "--left-right", "--cherry-pick", etc. to be parsed
> by setup_revisions() call we see below.

Oh, so you were not joking.

Part of why I think we should stay away from shell scripts has nothing to
do with performance (which would already be worth it) nor portability
issues (which also would already be worth it) nor requiring contributors to
know more than C (which also would already be worth it), but static
typing.

What you are asking is to do away with the strong, static typing (which
would show a breakage pretty quickly if that part of revision.c's API was
changed, therefore I think your concern is a little curious) in favor of
loose typing which would demonstrate breakages only upon use.

That is the exact opposite direction of where I want to go.

> The parsing is not an expensive part of the operation anyway,

... but why, oh why make things more complicated than they need to be? The
revision API is an API, yes, an internal one, but an API, for crying out
loud.

> and that way we do not have to worry about one less thing.

Not that I don't mind no double or triple negations, but no, not this one.

> > +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> > +		return error(_("make_script: unhandled options"));
> > +
> > +	if (prepare_revision_walk(&revs) < 0)
> > +		return error(_("make_script: error preparing revisions"));
> > +
> > +	while ((commit = get_revision(&revs))) {
> > +		strbuf_reset(&buf);
> > +		if (!keep_empty && is_original_commit_empty(commit))
> > +			strbuf_addf(&buf, "%c ", comment_line_char);
> 
> Presumably callers of this function (which does not exist yet at
> this step) are expected to have done the configuration dance to
> prepare comment_line_char to whatever the end-user specified?

Yes. Just like they have to take care of discovering the .git/ directory.

I guess I kind of fail to see your point. Of course the configuration has
to be read at this point... This is an internal API function that has the
same contract as all the other internal API functions: you have to set up
and configure everything needed to run the API function beforehand.

But maybe what you really wanted to ask is: How do we know that
comment_line_char is initialized correctly at this point?

If that is the question, I understand your puzzlement, and it is easy to
dispel: comment_line_char is configured as part of
git_default_core_config(), and initialized to '#' before Git even starts
to run.

So we're safe here, as long as the default config handling runs. The
intended user is obviously the rebase--helper, which runs git_config()
even before parsing the options.

Meaning: the code is safe.

Ciao,
Dscho


* Re: [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-27  5:00       ` Junio C Hamano
  2017-04-27  6:47         ` Junio C Hamano
@ 2017-04-27 21:44         ` Johannes Schindelin
  2017-04-28  0:15           ` Junio C Hamano
  1 sibling, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-27 21:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Wed, 26 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> > index 214af0372ba..52a19e0bdb3 100644
> > --- a/git-rebase--interactive.sh
> > +++ b/git-rebase--interactive.sh
> > @@ -774,11 +774,11 @@ transform_todo_ids () {
> >  }
> >  
> >  expand_todo_ids() {
> > -	transform_todo_ids
> > +	git rebase--helper --expand-sha1s
> >  }
> >  
> >  collapse_todo_ids() {
> > -	transform_todo_ids --short
> > +	git rebase--helper --shorten-sha1s
> >  }
> 
> Obviously correct ;-)  But doesn't this make transform_todo_ids ()
> helper unused and removable?

But of course it is now unused! Will fix.

> > +int transform_todo_ids(int shorten_sha1s)
> > +{
> > +	const char *todo_file = rebase_path_todo();
> > +	struct todo_list todo_list = TODO_LIST_INIT;
> > +	int fd, res, i;
> > +	FILE *out;
> > +
> > +	strbuf_reset(&todo_list.buf);
> > +	fd = open(todo_file, O_RDONLY);
> > +	if (fd < 0)
> > +		return error_errno(_("could not open '%s'"), todo_file);
> > +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> > +		close(fd);
> > +		return error(_("could not read '%s'."), todo_file);
> > +	}
> > +	close(fd);
> > +
> > +	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
> > +	if (res) {
> > +		todo_list_release(&todo_list);
> > +		return error(_("unusable instruction sheet: '%s'"), todo_file);
> > +	}
> > +
> > +	out = fopen(todo_file, "w");
> 
> The usual "open lockfile, write to it and then rename" dance is not
> necessary for the purpose of preventing other people from reading
> this file while we are writing to it.  But if we fail inside this
> function before we fclose(3) "out", the user will lose the todo
> list.  It probably is not a big deal, though.

I guess you're right. It is bug-for-bug equivalent to the previous shell
function, though.

> > +	if (!out) {
> > +		todo_list_release(&todo_list);
> > +		return error(_("unable to open '%s' for writing"), todo_file);
> > +	}
> > +	for (i = 0; i < todo_list.nr; i++) {
> > +		struct todo_item *item = todo_list.items + i;
> > +		int bol = item->offset_in_buf;
> > +		const char *p = todo_list.buf.buf + bol;
> > +		int eol = i + 1 < todo_list.nr ?
> > +			todo_list.items[i + 1].offset_in_buf :
> > +			todo_list.buf.len;
> > +
> > +		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
> > +			fwrite(p, eol - bol, 1, out);
> > +		else {
> > +			int eoc = strcspn(p, " \t");
> > +			const char *sha1 = shorten_sha1s ?
> > +				short_commit_name(item->commit) :
> > +				oid_to_hex(&item->commit->object.oid);
> > +
> > +			if (!eoc) {
> > +				p += strspn(p, " \t");
> > +				eoc = strcspn(p, " \t");
> > +			}
> 
> It would be much easier to follow the logic if "int eoc" above were
> a mere declaration without initialization and "skip to the
> whitespaces" is done immediately before this if() statement.  It's
> not like the initialized value of eoc is needed there because it
> participates in the computation of sha1, and also having the
> assignment followed by "oops, the line begins with a whitespace"
> recovery that is done here.
> 
> Wouldn't it be simpler to do:
> 
> 	else {
> 		int eoc;
> 		const char *sha1 = ...
> 		p += strspn(p, " \t"); /* skip optional indent */
> 		eoc = strcspn(p, " \t"); /* grab the command word */
> 
> without conditional?

Sure, will fix.
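
For illustration, the unconditional variant suggested above could look
like the following minimal sketch. find_command_word() is a hypothetical
stand-alone helper, not a function in sequencer.c, and it also stops at a
newline (an assumption added here so that a bare command such as "noop"
is handled):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Locate the command word of one todo line, skipping an optional indent.
 * Returns a pointer to the first character of the word and stores the
 * word's length in *len.
 */
const char *find_command_word(const char *line, size_t *len)
{
	assert(line && len);
	line += strspn(line, " \t");	/* skip optional indent */
	*len = strcspn(line, " \t\n");	/* grab the command word */
	return line;
}
```

Because strspn() returns 0 when there is no leading whitespace, the
skip is harmless for unindented lines and no conditional is needed.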

Ciao,
Dscho



* Re: [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests
  2017-04-27  5:05       ` Junio C Hamano
@ 2017-04-27 22:01         ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-27 22:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Wed, 26 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > These tests were a bit anal about the *exact* warning/error message
> > printed by git rebase. But those messages are intended for the *end
> > user*, therefore it does not make sense to test so rigidly for the
> > *exact* wording.
> >
> > In the following, we will reimplement the missing commits check in
> > the sequencer, with slightly different words.
> 
> Up to this point I thought your strategy was to mimic the original
> as closely as possible, and changes to update (and/or improve) the
> end user experience (like rewording messages) are left as a separate
> step.  Changes to the test can be used to demonstrate the improved
> end-user experience that way, and I found it a good way to structure
> your series.  Is there a technical reason why you deviate from that
> pattern here?  
> 
> Just being curious, as I do not particularly mind seeing things done
> differently (especially if there is a good reason).

Yes, I remember the qualms I had about this patch. It was simply too
cumbersome to keep the exact error message, as it would have meant to
deviate purposefully from the way we do things in C.

In our C code, we error out using the error() function, with translatable
messages. That is what the todo list parsing code in sequencer.c does:

	error: invalid line 4: badcmd 0547e3f1350d D

It even prints out the line number, which I think is a nice touch.

Compare that to the non-standard way the rebase -i *shell* code reported
the failure:

	Warning: the command isn't recognized in the following line:
	 - badcmd 0547e3f1350d D

First of all, it reported it as a "Warning". Which is wrong, as it is a
fatal error, and it was always treated as such.

Second, it uses a capitalized prefix, which not even our warning()
function does.

Third, it uses two lines, and then indents the offending line with a
leading dash (as if we were in the middle of an enumeration).

I would have added that the previous message also imitated Lore in that it
uses a contraction, but I was shocked to learn through a simple git grep
that plenty of our error messages use that style, too.

I could have simply re-written the error() as an `fprintf(stderr,
"Warning: ...");` but that was just too inconsistent for my taste.

Hence I settled for relaxing the error message, which I had already done
much earlier so that running the test with a `set -x` inserted somewhere
into git-rebase--interactive.sh would still work.

Ciao,
Dscho


* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-27 14:18         ` Johannes Schindelin
@ 2017-04-28  0:13           ` Junio C Hamano
  2017-04-28  2:36             ` Junio C Hamano
  2017-04-28 15:13             ` Johannes Schindelin
  0 siblings, 2 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-04-28  0:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Wed, 26 Apr 2017, Junio C Hamano wrote:
>
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> 
>> > diff --git a/sequencer.c b/sequencer.c
>> > index 77afecaebf0..e858a976279 100644
>> > --- a/sequencer.c
>> > +++ b/sequencer.c
>> > @@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>> >  
>> >  	strbuf_release(&sob);
>> >  }
>> > +
>> > +int sequencer_make_script(int keep_empty, FILE *out,
>> > +		int argc, const char **argv)
>> > +{
>> > +	char *format = xstrdup("%s");
>> > +	struct pretty_print_context pp = {0};
>> > +	struct strbuf buf = STRBUF_INIT;
>> > +	struct rev_info revs;
>> > +	struct commit *commit;
>> > +
>> > +	init_revisions(&revs, NULL);
>> > +	revs.verbose_header = 1;
>> > +	revs.max_parents = 1;
>> > +	revs.cherry_pick = 1;
>> > +	revs.limited = 1;
>> > +	revs.reverse = 1;
>> > +	revs.right_only = 1;
>> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
>> > +	revs.topo_order = 1;
>> > +
>> > +	revs.pretty_given = 1;
>> > +	git_config_get_string("rebase.instructionFormat", &format);
>> > +	get_commit_format(format, &revs);
>> > +	free(format);
>> > +	pp.fmt = revs.commit_format;
>> > +	pp.output_encoding = get_log_output_encoding();
>> 
>> All of the above feels like inviting unnecessary future breakages by
>> knowing too much about the implementation the current version of
>> revision.c happens to use.
>
> You mean that the `--reverse` option gets translated into the `reverse`
> bit, and the other settings?

Yes.  The "pretty_given" trick is one example that the underlying
implementation can change over time.  If you wrote this patch before
66b2ed09 ("Fix "log" family not to be too agressive about showing
notes", 2010-01-20) happened, you wouldn't have known to flip this
bit on to emulate the command line parsing of "--pretty" and
friends, and you would have required the author of that change to
know that you have this cut & pasted duplicated code here when the
commit is primarily about updating revision.c

So I am very serious when I say that this is adding an unnecessary
maintenance burden.



* Re: [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-27 21:44         ` Johannes Schindelin
@ 2017-04-28  0:15           ` Junio C Hamano
  2017-04-28 15:15             ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-04-28  0:15 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> > +	out = fopen(todo_file, "w");
>> 
>> The usual "open lockfile, write to it and then rename" dance is not
>> necessary for the purpose of preventing other people from reading
>> this file while we are writing to it.  But if we fail inside this
>> function before we fclose(3) "out", the user will lose the todo
>> list.  It probably is not a big deal, though.
>
> I guess you're right. It is bug-for-bug equivalent to the previous shell
> function, though.

I think the scripted version uses the "write to $todo.new and mv
$todo.new to $todo" pattern so you'd at least have something to go
back to when the loop fails.
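
A minimal sketch of that "write to $todo.new, then rename" pattern in
plain C, assuming only the standard library. write_file_via_rename() is
a hypothetical name, and the fixed-size path buffer is a simplification.
On POSIX systems rename() replaces an existing target; other platforms
may refuse to overwrite, so this is not a portable replacement for what
git itself would use:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Replace the file at `path` with `contents` by writing "<path>.new"
 * first and renaming it into place only after the write succeeded.
 * Returns 0 on success, -1 on failure; on failure the old file is
 * left untouched.
 */
int write_file_via_rename(const char *path, const char *contents)
{
	char tmp[4096];
	size_t len;
	FILE *out;
	int n;

	assert(path && contents);
	len = strlen(contents);
	n = snprintf(tmp, sizeof(tmp), "%s.new", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	out = fopen(tmp, "w");
	if (!out)
		return -1;
	if (fwrite(contents, 1, len, out) != len) {
		fclose(out);
		remove(tmp);
		return -1;
	}
	if (fclose(out) != 0 || rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```

If anything fails before the rename, the original todo list is still
on disk, which is the property under discussion here.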


* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28  0:13           ` Junio C Hamano
@ 2017-04-28  2:36             ` Junio C Hamano
  2017-04-28 15:13             ` Johannes Schindelin
  1 sibling, 0 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-04-28  2:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Yes.  The "pretty_given" trick is one example that the underlying
> implementation can change over time.  If you wrote this patch before
> 66b2ed09 ("Fix "log" family not to be too agressive about showing
> notes", 2010-01-20) happened, you wouldn't have known to flip this
> bit on to emulate the command line parsing of "--pretty" and
> friends, and you would have required the author of that change to
> know that you have this cut & pasted duplicated code here when the
> commit is primarily about updating revision.c
>
> So I am very serious when I say that this is adding an unnecessary
> maintenance burden.

I _am_ sympathetic to your wish to have the compiler catch a
misspelt "revs.verboes_header = 1".  A misspelt "--formta=..." 
would not be caught until the execution time.

But the compiler's static name checking helps only one time while
you are writing _this_ patch, and it does not help at all to protect
this duplicated code from future breakages.  The way "rev-list" and
friends _internally_ implement "--format=..."  or any other options
used by the rev-list command whose behaviour you are recreating here
can (and will) change in the future, just like it already did change
in early 2010.  We didn't have pretty_given field for several years
after "--pretty" etc. that currently set the field were originally
introduced.

In an ideal world, we would probably have specific methods to
manipulate "struct rev_info" and set_format_string() method, which
would be called when the command line parser is reacting to
"--format=..." in setup_revisions(), may encapsulate the
implementation detail of setting verbose_header and pretty_given
fields in addition to calling get_commit_format() method on the
rev_info object, and your new code may be calling that method,
without having to know the implementation detail.

We do not live in that ideal world, and it is _not_ the theme of
your topic to bring us closer to the ideal world.  Under that
constraint, a future-proof way to set up the revision machinery is
to have setup_revisions() parse an av[] array.  What will be done to
your copy of revs will stay compatible with what rev-list would do
after the implementation detail of setup_revisions() changes that
way.  It is true that a misspelt "--formta=..."  would not be caught
until the execution time, but once the code in this part is written,
it is less likely to get broken by a change coming from needs by
other parts of the system (e.g. the addition of pretty_given came
not because we wanted to enhance how --format or --pretty worked; it
came because we wanted to make sure they are not affected by changes
to another option).
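
As a rough sketch of what preparing such an av[] could look like: the
helper below prepends a fixed set of rev-list options to the caller's
arguments. build_rev_argv() is a hypothetical name, and the option list
is an assumed mapping of the fields the patch sets by hand (max_parents,
cherry_pick, reverse, right_only, topo_order), not something taken from
the patch. The resulting vector would then be handed to
setup_revisions(), which is not called here:

```c
#include <stdlib.h>

/* Assumed command-line equivalents of the hand-set rev_info fields. */
static const char *const fixed_opts[] = {
	"--no-merges", "--cherry-pick", "--reverse",
	"--right-only", "--topo-order",
};

/*
 * Build a NULL-terminated argument vector: argv[0] (or a placeholder),
 * the fixed options, then the caller's remaining arguments.
 * Returns a malloc'd array the caller must free(), or NULL on failure.
 */
const char **build_rev_argv(int argc, const char **argv, int *out_argc)
{
	int fixed_nr = (int)(sizeof(fixed_opts) / sizeof(fixed_opts[0]));
	int extra_nr = argc > 1 ? argc - 1 : 0;
	int total = 1 + fixed_nr + extra_nr;
	const char **av = malloc(((size_t)total + 1) * sizeof(*av));
	int n = 0, i;

	if (!av)
		return NULL;
	av[n++] = argc > 0 ? argv[0] : "make_script";	/* program-name slot */
	for (i = 0; i < fixed_nr; i++)
		av[n++] = fixed_opts[i];
	for (i = 1; i < argc; i++)
		av[n++] = argv[i];
	av[n] = NULL;
	*out_argc = n;
	return av;
}
```

The trade-off is the one described above: a typo in one of these
strings is only noticed at run time, but the code no longer depends on
which rev_info fields each option happens to set.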

So after being forced by your response to rethink about it, I feel
even firmer about this than I felt when I sent my first review
comment.

Thanks.





* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
  2017-04-27  4:31       ` Junio C Hamano
@ 2017-04-28 10:08       ` Phillip Wood
  2017-04-28 19:22         ` Johannes Schindelin
  2017-05-01  0:49         ` Junio C Hamano
  1 sibling, 2 replies; 100+ messages in thread
From: Phillip Wood @ 2017-04-28 10:08 UTC (permalink / raw)
  To: Johannes Schindelin, git; +Cc: Junio C Hamano, Philip Oakley, Jeff King

On 26/04/17 12:59, Johannes Schindelin wrote:
> The first step of an interactive rebase is to generate the so-called "todo
> script", to be stored in the state directory as "git-rebase-todo" and to
> be edited by the user.
> 
> Originally, we adjusted the output of `git log <options>` using a simple
> sed script. Over the course of the years, the code became more
> complicated. We now use shell scripting to edit the output of `git log`
> conditionally, depending whether to keep "empty" commits (i.e. commits
> that do not change any files).
> 
> On platforms where shell scripting is not native, this can be a serious
> drag. And it opens the door for incompatibilities between platforms when
> it comes to shell scripting or to Unix-y commands.
> 
> Let's just re-implement the todo script generation in plain C, using the
> revision machinery directly.
> 
> This is substantially faster, improving the speed relative to the
> shell script version of the interactive rebase from 2x to 3x on Windows.
> 
> Note that the rearrange_squash() function in git-rebase--interactive
> relied on the fact that we set the "format" variable to the config setting
> rebase.instructionFormat. Relying on a side effect like this is no good,
> hence we explicitly perform that assignment (possibly again) in
> rearrange_squash().
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/rebase--helper.c   |  8 +++++++-
>  git-rebase--interactive.sh | 44 +++++++++++++++++++++++---------------------
>  sequencer.c                | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                |  3 +++
>  4 files changed, 78 insertions(+), 22 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 77afecaebf0..e858a976279 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>  
>  	strbuf_release(&sob);
>  }
> +
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = xstrdup("%s");
> +	struct pretty_print_context pp = {0};
> +	struct strbuf buf = STRBUF_INIT;
> +	struct rev_info revs;
> +	struct commit *commit;
> +
> +	init_revisions(&revs, NULL);
> +	revs.verbose_header = 1;
> +	revs.max_parents = 1;
> +	revs.cherry_pick = 1;
> +	revs.limited = 1;
> +	revs.reverse = 1;
> +	revs.right_only = 1;
> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +	revs.topo_order = 1;
> +
> +	revs.pretty_given = 1;
> +	git_config_get_string("rebase.instructionFormat", &format);

Firstly thanks for all your work on speeding up rebase -i, it definitely
feels faster.

This changes the behaviour of
git -c rebase.instructionFormat= rebase -i
The shell version treats the rebase.instructionFormat being unset or set
to the empty string as equivalent. This version generates a todo list
with lines like 'pick <abbrev sha1>' rather than 'pick <abbrev sha1>
<subject>'

I only picked this up because I have a script that does 'git -c
rebase.instructionFormat= rebase -i' with a custom sequence editor. I
can easily add '%s' in the appropriate place but I thought I'd point it
out in case other people are affected by the change.
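
One way the old behaviour could be kept is to treat an empty setting
like an unset one before the format is used. The following is only a
sketch with a hypothetical helper name, not code from the patch:

```c
/*
 * Return the format to use for todo lines: the configured value, or
 * the default "%s" (the commit subject) when the setting is unset or
 * set to the empty string.
 */
const char *effective_instruction_format(const char *configured)
{
	if (!configured || !*configured)
		return "%s";
	return configured;
}
```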

Please CC me in any replies as I'm not subscribed to this list

Best Wishes

Phillip

> +	get_commit_format(format, &revs);
> +	free(format);
> +	pp.fmt = revs.commit_format;
> +	pp.output_encoding = get_log_output_encoding();
> +
> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> +		return error(_("make_script: unhandled options"));
> +
> +	if (prepare_revision_walk(&revs) < 0)
> +		return error(_("make_script: error preparing revisions"));
> +
> +	while ((commit = get_revision(&revs))) {
> +		strbuf_reset(&buf);
> +		if (!keep_empty && is_original_commit_empty(commit))
> +			strbuf_addf(&buf, "%c ", comment_line_char);
> +		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
> +		pretty_print_commit(&pp, commit, &buf);
> +		strbuf_addch(&buf, '\n');
> +		fputs(buf.buf, out);
> +	}
> +	strbuf_release(&buf);
> +	return 0;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index f885b68395f..83f2943b7a9 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
>  int sequencer_rollback(struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
> 



* Re: [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper
  2017-04-27  5:32       ` Junio C Hamano
@ 2017-04-28 15:10         ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 15:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,


On Wed, 26 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > -check_todo_list
> > +git rebase--helper --check-todo-list || {
> > +	ret=$?
> > +	checkout_onto
> > +	exit $ret
> > +}
> 
> I find this a better division of labor between "check_todo_list" and
> its caller.  Compared to the original that did the "recover and exit
> with failure" inside the helper, this is much easier to see what is
> going on.

Yes. My first attempt did not even checkout <onto>, and it was
surprisingly difficult to pin that one down. I would never have expected
check_todo_list to have that side effect.

> > +/*
> > + * Check if the user dropped some commits by mistake
> > + * Behaviour determined by rebase.missingCommitsCheck.
> > + * Check if there is an unrecognized command or a
> > + * bad SHA-1 in a command.
> > + */
> > +int check_todo_list(void)
> > +{
> > +	enum check_level check_level = get_missing_commit_check_level();
> > +	struct strbuf todo_file = STRBUF_INIT;
> > +	struct todo_list todo_list = TODO_LIST_INIT;
> > +	struct commit_list *missing = NULL;
> > +	int raise_error = 0, res = 0, fd, i;
> > +
> > +	strbuf_addstr(&todo_file, rebase_path_todo());
> > +	fd = open(todo_file.buf, O_RDONLY);
> > +	if (fd < 0) {
> > +		res = error_errno(_("could not open '%s'"), todo_file.buf);
> > +		goto leave_check;
> > +	}
> > +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> > +		close(fd);
> > +		res = error(_("could not read '%s'."), todo_file.buf);
> > +		goto leave_check;
> > +	}
> > +	close(fd);
> > +	raise_error = res =
> > +		parse_insn_buffer(todo_list.buf.buf, &todo_list);
> > +
> > +	if (check_level == CHECK_IGNORE)
> > +		goto leave_check;
> 
> OK, so even if it is set to ignore, unreadable todo list will be shown
> with a loud error message that tells the user to use --edit-todo.
> 
> What should happen when it is not set to ignore and we found the
> todo list unacceptable, I wonder?

Whoops. In case of a parse error, it does not make sense to check, does
it. Fixed.

> > +	/* Get the SHA-1 of the commits */
> > +	for (i = 0; i < todo_list.nr; i++) {
> > +		struct commit *commit = todo_list.items[i].commit;
> > +		if (commit)
> > +			commit->util = todo_list.items + i;
> > +	}
> 
> It does not look like this loop is "Get(ting) the SHA-1 of the commits"
> to me, though.  It is setting up ->util to be usable as a back-pointer
> into the list.

Right, and that is not even necessary. It is even incorrect, as we release
the todo_list and read git-rebase-todo.backup into the same data
structure, possibly reallocating the array, therefore the pointers may
become stale. So I went with your suggestion further down to use (void *)1
instead.

Also, the comment is actively wrong, I agree. I changed it to

	/* Mark the commits in git-rebase-todo as seen */

> > +	todo_list_release(&todo_list);
> 
> But then the todo-list is released?  The util field we have set, if
> any, in the previous loop are now dangling, no?

Right.

> > +	strbuf_addstr(&todo_file, ".backup");
> > +	fd = open(todo_file.buf, O_RDONLY);
> > +	if (fd < 0) {
> > +		res = error_errno(_("could not open '%s'"), todo_file.buf);
> > +		goto leave_check;
> > +	}
> > +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> > +		close(fd);
> > +		res = error(_("could not read '%s'."), todo_file.buf);
> > +		goto leave_check;
> > +	}
> > +	close(fd);
> > +	strbuf_release(&todo_file);
> > +	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
> 
> Then we read from .backup; failure to do so does not result in the
> "you need to --edit-todo" warning.

Correct. At this point, we could even add

	if (res)
		die("BUG: cannot read '%s'", todo_file.buf);

(moving the strbuf_release(&todo_file) below, of course), as the .backup
file is not intended to be edited by the user, i.e. it is the original
todo which should *never* be unparseable.

> > +	/* Find commits that are missing after editing */
> > +	for (i = 0; i < todo_list.nr; i++) {
> > +		struct commit *commit = todo_list.items[i].commit;
> > +		if (commit && !commit->util) {
> > +			commit_list_insert(commit, &missing);
> > +			commit->util = todo_list.items + i;
> > +		}
> > +	}
> 
> And we check the commits mentioned in the backup; any commit whose
> util is not marked in the previous loop is noticed and thrown into
> the missing list.
> 
> The loop we later have does "while (missing)" and does not look at
> commit->util for commits that are *not* missing, i.e. the ones that
> are marked in the previous loop, so it does not matter that their
> util field have dangling pointers.  In that sense, it may not be
> buggy, but it is misleading.  The only thing these two loops care
> about is that the commits found in the earlier loop get their util
> field set to non-NULL, so instead of using "todo_list.items+i",
> perhaps doing this
> 
> 	if (commit)
> 		commit->util = (void *)1; /* mark as seen */
> 
> in the earlier loop instead would be much less confusing.

... and doing the same in this loop. I agree, that's exactly what I
changed it to.
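
To make the two-pass idea concrete, here is a self-contained sketch.
struct toy_commit is a simplified stand-in for git's struct commit, and
find_missing() is a hypothetical name; the real code works on the
parsed todo lists and a commit_list instead of plain arrays:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct commit; only the util field matters. */
struct toy_commit {
	const char *name;
	void *util;
};

/*
 * Pass 1: mark every commit still present in the edited list.
 * Pass 2: collect commits from the original list that were not marked.
 * Returns the number of missing commits written to `missing`, which
 * must have room for original_nr entries.
 */
size_t find_missing(struct toy_commit **edited, size_t edited_nr,
		    struct toy_commit **original, size_t original_nr,
		    struct toy_commit **missing)
{
	size_t i, nr = 0;

	assert(missing);
	for (i = 0; i < edited_nr; i++)
		edited[i]->util = (void *)1;	/* mark as seen */

	for (i = 0; i < original_nr; i++) {
		if (!original[i]->util) {
			missing[nr++] = original[i];
			original[i]->util = (void *)1;	/* report only once */
		}
	}
	return nr;
}
```

Since util only ever holds a marker value, nothing points into the
first todo list after it has been released.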

> > +	/* Warn about missing commits */
> > +	if (!missing)
> > +		goto leave_check;
> 
> If there is no missing one, we may still return error about
> unacceptable backup file.  But if we read backup fine and didn't
> find anything missing, we'll return silently and with success.  OK.
> 
> > +	if (check_level == CHECK_ERROR)
> > +		raise_error = res = 1;
> 
> Otherwise, we found missing ones and we want to report here.
> 
> The reason why I started reading this function aloud was because I
> found two variables (raise_error and res) somewhat confusing.  I
> think what the code does makes sense, but I still find the way how
> the code expresses the logic with these two variables confusing.
> Perhaps somebody else can hopefully offer possible improvements, as
> I do not offhand think of a way better than what is currently in
> this patch myself.

I renamed the `raise_error` variable to `advise_to_edit_todo`. The `res`
does not need renaming, methinks, as it is used everywhere else in that
file to indicate the return value.
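
To spell out how the two variables relate, here is a rough sketch of
the decision logic only. It assumes a three-level check setting (the
name CHECK_WARN is my assumption for the level between CHECK_IGNORE and
CHECK_ERROR), it leaves out the parsing of the backup file, and
`decide()` and `struct check_result` are hypothetical names.

```c
enum check_level { CHECK_IGNORE, CHECK_WARN, CHECK_ERROR };

struct check_result {
	int res;                 /* return value: non-zero means failure */
	int advise_to_edit_todo; /* whether to print the --edit-todo hint */
};

/*
 * parse_failed: the edited todo list could not be parsed
 * have_missing: commits from the backup are absent from the edited list
 */
struct check_result decide(int parse_failed, int have_missing,
			   enum check_level level)
{
	struct check_result r = { !!parse_failed, !!parse_failed };

	/* A parse error is final; so is "nothing to check" or "nothing missing". */
	if (parse_failed || level == CHECK_IGNORE || !have_missing)
		return r;

	/* Missing commits only turn into a failure at the strictest level. */
	if (level == CHECK_ERROR)
		r.res = r.advise_to_edit_todo = 1;
	return r;
}
```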

Ciao,
Dscho


* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28  0:13           ` Junio C Hamano
  2017-04-28  2:36             ` Junio C Hamano
@ 2017-04-28 15:13             ` Johannes Schindelin
  2017-05-01  3:11               ` Junio C Hamano
  1 sibling, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 15:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Thu, 27 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Wed, 26 Apr 2017, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> >> 
> >> > diff --git a/sequencer.c b/sequencer.c
> >> > index 77afecaebf0..e858a976279 100644
> >> > --- a/sequencer.c
> >> > +++ b/sequencer.c
> >> > @@ -2388,3 +2388,48 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
> >> >  
> >> >  	strbuf_release(&sob);
> >> >  }
> >> > +
> >> > +int sequencer_make_script(int keep_empty, FILE *out,
> >> > +		int argc, const char **argv)
> >> > +{
> >> > +	char *format = xstrdup("%s");
> >> > +	struct pretty_print_context pp = {0};
> >> > +	struct strbuf buf = STRBUF_INIT;
> >> > +	struct rev_info revs;
> >> > +	struct commit *commit;
> >> > +
> >> > +	init_revisions(&revs, NULL);
> >> > +	revs.verbose_header = 1;
> >> > +	revs.max_parents = 1;
> >> > +	revs.cherry_pick = 1;
> >> > +	revs.limited = 1;
> >> > +	revs.reverse = 1;
> >> > +	revs.right_only = 1;
> >> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> >> > +	revs.topo_order = 1;
> >> > +
> >> > +	revs.pretty_given = 1;
> >> > +	git_config_get_string("rebase.instructionFormat", &format);
> >> > +	get_commit_format(format, &revs);
> >> > +	free(format);
> >> > +	pp.fmt = revs.commit_format;
> >> > +	pp.output_encoding = get_log_output_encoding();
> >> 
> >> All of the above feels like inviting unnecessary future breakages by
> >> knowing too much about the implementation the current version of
> >> revision.c happens to use.
> >
> > You mean that the `--reverse` option gets translated into the `reverse`
> > bit, and the other settings?
> 
> Yes.  The "pretty_given" trick is one example that the underlying
> implementation can change over time.  If you wrote this patch before
> 66b2ed09 ("Fix "log" family not to be too agressive about showing
> notes", 2010-01-20) happened, you wouldn't have known to flip this
> bit on to emulate the command line parsing of "--pretty" and
> friends, and you would have required the author of that change to
> know that you have this cut & pasted duplicated code here when the
> commit is primarily about updating revision.c
> 
> So I am very serious when I say that this is adding an unnecessary
> maintenance burden.

In that case, I would strongly advise to consider redesigning the API. It
is just no good to ask for a change in stringent code that would delay
compile errors to runtime errors, that's just poor form.

And if the API allows settings that do something unintentional without at
least a runtime warning, the API is no good.

Ciao,
Dscho


* Re: [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-28  0:15           ` Junio C Hamano
@ 2017-04-28 15:15             ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 15:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Thu, 27 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> > +	out = fopen(todo_file, "w");
> >> 
> >> The usual "open lockfile, write to it and then rename" dance is not
> >> necessary for the purpose of preventing other people from reading
> >> this file while we are writing to it.  But if we fail inside this
> >> function before we fclose(3) "out", the user will lose the todo
> >> list.  It probably is not a big deal, though.
> >
> > I guess you're right. It is bug-for-bug equivalent to the previous shell
> > function, though.
> 
> I think the scripted version uses the "write to $todo.new and mv
> $todo.new to $todo" pattern so you'd at least have something to go
> > back to when the loop fails.

My mistake.
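
For reference, the "write to $todo.new, then mv" pattern could look
roughly like this in plain C. It is a sketch under assumptions, not
what the patch does: `write_then_rename()` is a hypothetical helper
(git has its own lockfile API for this kind of thing), and it relies on
POSIX rename() semantics, where an existing target is replaced.

```c
#include <stdio.h>

/*
 * Write `content` to "<path>.new" and rename it over `path` only after
 * the write and the close both succeeded. Returns 0 on success, -1 on
 * failure; on failure the file at `path` is left untouched.
 */
int write_then_rename(const char *path, const char *content)
{
	char tmp[4096];
	FILE *out;
	int n = snprintf(tmp, sizeof(tmp), "%s.new", path);

	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1; /* path too long for the temporary name */

	out = fopen(tmp, "w");
	if (!out)
		return -1;
	if (fputs(content, out) == EOF) {
		fclose(out);
		remove(tmp);
		return -1;
	}
	if (fclose(out) != 0) { /* buffered data may fail to flush here */
		remove(tmp);
		return -1;
	}
	if (rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```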

Sorry,
Dscho


* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28 10:08       ` Phillip Wood
@ 2017-04-28 19:22         ` Johannes Schindelin
  2017-05-01 10:06           ` Phillip Wood
  2017-05-01  0:49         ` Junio C Hamano
  1 sibling, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 19:22 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Junio C Hamano, Philip Oakley, Jeff King

Hi Phillip,

On Fri, 28 Apr 2017, Phillip Wood wrote:

> On 26/04/17 12:59, Johannes Schindelin wrote:
>
> > The first step of an interactive rebase is to generate the so-called
> > "todo script", to be stored in the state directory as
> > "git-rebase-todo" and to be edited by the user.
> > 
> > Originally, we adjusted the output of `git log <options>` using a
> > simple sed script. Over the course of the years, the code became more
> > complicated. We now use shell scripting to edit the output of `git
> > log` conditionally, depending whether to keep "empty" commits (i.e.
> > commits that do not change any files).
> > 
> > On platforms where shell scripting is not native, this can be a
> > serious drag. And it opens the door for incompatibilities between
> > platforms when it comes to shell scripting or to Unix-y commands.
> > 
> > Let's just re-implement the todo script generation in plain C, using
> > the revision machinery directly.
> > 
> > This is substantially faster, improving the speed relative to the
> > shell script version of the interactive rebase from 2x to 3x on
> > Windows.
> 
> This changes the behaviour of git -c rebase.instructionFormat= rebase -i
> The shell version treats the rebase.instructionFormat being unset or set
> to the empty string as equivalent. This version generates a todo list
> with lines like 'pick <abbrev sha1>' rather than 'pick <abbrev sha1>
> <subject>'
> 
> I only picked this up because I have a script that does 'git -c
> rebase.instructionFormat= rebase -i' with a custom sequence editor. I
> can easily add '%s' in the appropriate place but I thought I'd point it
> out in case other people are affected by the change.

While I would argue that the C version is more correct, it would be
backwards-incompatible.

So I changed it.
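
The backwards-compatible rule boils down to treating an empty value
like an unset one. As a stand-alone sketch (the helper name
`effective_instruction_format()` is made up for illustration, and a
NULL argument stands for "not configured"):

```c
#include <stddef.h>

/*
 * Return the format to use for the todo list onelines: an unset (NULL)
 * or empty setting falls back to "%s", i.e. the commit subject.
 */
const char *effective_instruction_format(const char *configured)
{
	if (!configured || !*configured)
		return "%s";
	return configured;
}
```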

BTW in the future you could help me a *lot* by providing a patch that adds
a test case to our test suite that not only demonstrates what exactly goes
wrong, but also will help prevent future regressions.

Ciao,
Johannes


* [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
                       ` (8 preceding siblings ...)
  2017-04-26 12:00     ` [PATCH v3 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-04-28 21:30     ` Johannes Schindelin
  2017-04-28 21:31       ` [PATCH v4 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
                         ` (11 more replies)
  9 siblings, 12 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

This patch series reimplements the expensive pre- and post-processing of
the todo script in C.

And it concludes the work I did to accelerate rebase -i.

Changes since v3:

- removed the no-longer-used transform_todo_ids shell function

- simplified transform_todo_ids()'s command parsing

- fixed two comments in check_todo_list(), renamed the unclear
  `raise_error` variable to `advise_to_edit_todo`, build the message
  about missing commits directly (without the detour to building a
  commit_list) and instead of assigning an unused pointer to commit->util
  the code now uses (void *)1.

- return early from check_todo_list() when parsing failed, even if the
  check level is something other than CHECK_IGNORE

- the todo list is again generated in the same way as before when
  rebase.instructionFormat is empty: an empty value is interpreted as
  if it had not been set

- added a test for empty rebase.instructionFormat settings
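
As an illustration of the simplified command parsing, the left-trim and
length computation can be tried out on their own. This is a minimal
sketch: `find_command()` is a hypothetical helper, and it assumes a
NUL-terminated line without a trailing newline.

```c
#include <string.h>

/*
 * Skip leading blanks in `line` and report how long the command word
 * is. Returns a pointer to the first character of the command.
 */
const char *find_command(const char *line, size_t *len)
{
	const char *p = line + strspn(line, " \t"); /* left-trim command */

	*len = strcspn(p, " \t"); /* command ends at the next blank or NUL */
	return p;
}
```

With that, printf("%.*s", (int)len, p) prints just the command word,
whether or not the line was indented.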


Johannes Schindelin (10):
  t3415: verify that an empty instructionFormat is handled as before
  rebase -i: generate the script via rebase--helper
  rebase -i: remove useless indentation
  rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  t3404: relax rebase.missingCommitsCheck tests
  rebase -i: check for missing commits in the rebase--helper
  rebase -i: skip unnecessary picks using the rebase--helper
  t3415: test fixup with wrapped oneline
  rebase -i: rearrange fixup/squash lines using the rebase--helper

 Documentation/git-rebase.txt  |  16 +-
 builtin/rebase--helper.c      |  29 ++-
 git-rebase--interactive.sh    | 373 ++++-------------------------
 sequencer.c                   | 530 ++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                   |   8 +
 t/t3404-rebase-interactive.sh |  22 +-
 t/t3415-rebase-autosquash.sh  |  28 ++-
 7 files changed, 646 insertions(+), 360 deletions(-)


base-commit: 027a3b943b444a3e3a76f9a89803fc10245b858f
Based-On: rebase--helper at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v4
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v4

Interdiff vs v3:

 diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
 index d39fe4f5fb7..84c6e62518f 100644
 --- a/git-rebase--interactive.sh
 +++ b/git-rebase--interactive.sh
 @@ -713,29 +713,6 @@ do_rest () {
  	done
  }
  
 -transform_todo_ids () {
 -	while read -r command rest
 -	do
 -		case "$command" in
 -		"$comment_char"* | exec)
 -			# Be careful for oddball commands like 'exec'
 -			# that do not have a SHA-1 at the beginning of $rest.
 -			;;
 -		*)
 -			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
 -			if test "a$rest" = "a${rest#*[	 ]}"
 -			then
 -				rest=$sha1
 -			else
 -				rest="$sha1 ${rest#*[	 ]}"
 -			fi
 -			;;
 -		esac
 -		printf '%s\n' "$command${rest:+ }$rest"
 -	done <"$todo" >"$todo.new" &&
 -	mv -f "$todo.new" "$todo"
 -}
 -
  expand_todo_ids() {
  	git rebase--helper --expand-sha1s
  }
 diff --git a/sequencer.c b/sequencer.c
 index 84f8e366761..63a588f0916 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -2393,7 +2393,7 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
  int sequencer_make_script(int keep_empty, FILE *out,
  		int argc, const char **argv)
  {
 -	char *format = xstrdup("%s");
 +	char *format = NULL;
  	struct pretty_print_context pp = {0};
  	struct strbuf buf = STRBUF_INIT;
  	struct rev_info revs;
 @@ -2411,6 +2411,10 @@ int sequencer_make_script(int keep_empty, FILE *out,
  
  	revs.pretty_given = 1;
  	git_config_get_string("rebase.instructionFormat", &format);
 +	if (!format || !*format) {
 +		free(format);
 +		format = xstrdup("%s");
 +	}
  	get_commit_format(format, &revs);
  	free(format);
  	pp.fmt = revs.commit_format;
 @@ -2475,18 +2479,16 @@ int transform_todo_ids(int shorten_sha1s)
  		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
  			fwrite(p, eol - bol, 1, out);
  		else {
 -			int eoc = strcspn(p, " \t");
  			const char *sha1 = shorten_sha1s ?
  				short_commit_name(item->commit) :
  				oid_to_hex(&item->commit->object.oid);
 +			int len;
  
 -			if (!eoc) {
 -				p += strspn(p, " \t");
 -				eoc = strcspn(p, " \t");
 -			}
 +			p += strspn(p, " \t"); /* left-trim command */
 +			len = strcspn(p, " \t"); /* length of command */
  
  			fprintf(out, "%.*s %s %.*s\n",
 -				eoc, p, sha1, item->arg_len, item->arg);
 +				len, p, sha1, item->arg_len, item->arg);
  		}
  	}
  	fclose(out);
 @@ -2525,8 +2527,8 @@ int check_todo_list(void)
  	enum check_level check_level = get_missing_commit_check_level();
  	struct strbuf todo_file = STRBUF_INIT;
  	struct todo_list todo_list = TODO_LIST_INIT;
 -	struct commit_list *missing = NULL;
 -	int raise_error = 0, res = 0, fd, i;
 +	struct strbuf missing = STRBUF_INIT;
 +	int advise_to_edit_todo = 0, res = 0, fd, i;
  
  	strbuf_addstr(&todo_file, rebase_path_todo());
  	fd = open(todo_file.buf, O_RDONLY);
 @@ -2540,17 +2542,17 @@ int check_todo_list(void)
  		goto leave_check;
  	}
  	close(fd);
 -	raise_error = res =
 +	advise_to_edit_todo = res =
  		parse_insn_buffer(todo_list.buf.buf, &todo_list);
  
 -	if (check_level == CHECK_IGNORE)
 +	if (res || check_level == CHECK_IGNORE)
  		goto leave_check;
  
 -	/* Get the SHA-1 of the commits */
 +	/* Mark the commits in git-rebase-todo as seen */
  	for (i = 0; i < todo_list.nr; i++) {
  		struct commit *commit = todo_list.items[i].commit;
  		if (commit)
 -			commit->util = todo_list.items + i;
 +			commit->util = (void *)1;
  	}
  
  	todo_list_release(&todo_list);
 @@ -2569,35 +2571,32 @@ int check_todo_list(void)
  	strbuf_release(&todo_file);
  	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
  
 -	/* Find commits that are missing after editing */
 -	for (i = 0; i < todo_list.nr; i++) {
 -		struct commit *commit = todo_list.items[i].commit;
 +	/* Find commits in git-rebase-todo.backup yet unseen */
 +	for (i = todo_list.nr - 1; i >= 0; i--) {
 +		struct todo_item *item = todo_list.items + i;
 +		struct commit *commit = item->commit;
  		if (commit && !commit->util) {
 -			commit_list_insert(commit, &missing);
 -			commit->util = todo_list.items + i;
 +			strbuf_addf(&missing, " - %s %.*s\n",
 +				    short_commit_name(commit),
 +				    item->arg_len, item->arg);
 +			commit->util = (void *)1;
  		}
  	}
  
  	/* Warn about missing commits */
 -	if (!missing)
 +	if (!missing.len)
  		goto leave_check;
  
  	if (check_level == CHECK_ERROR)
 -		raise_error = res = 1;
 +		advise_to_edit_todo = res = 1;
  
  	fprintf(stderr,
  		_("Warning: some commits may have been dropped accidentally.\n"
  		"Dropped commits (newer to older):\n"));
  
  	/* Make the list user-friendly and display */
 -	while (missing) {
 -		struct commit *commit = pop_commit(&missing);
 -		struct todo_item *item = commit->util;
 -
 -		fprintf(stderr, " - %s %.*s\n", short_commit_name(commit),
 -			item->arg_len, item->arg);
 -	}
 -	free_commit_list(missing);
 +	fputs(missing.buf, stderr);
 +	strbuf_release(&missing);
  
  	fprintf(stderr, _("To avoid this message, use \"drop\" to "
  		"explicitly remove a commit.\n\n"
 @@ -2609,7 +2608,7 @@ int check_todo_list(void)
  	strbuf_release(&todo_file);
  	todo_list_release(&todo_list);
  
 -	if (raise_error)
 +	if (advise_to_edit_todo)
  		fprintf(stderr,
  			_("You can fix this with 'git rebase --edit-todo' "
  			  "and then run 'git rebase --continue'.\n"
 diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
 index b9e26008a79..2f88f50c057 100755
 --- a/t/t3415-rebase-autosquash.sh
 +++ b/t/t3415-rebase-autosquash.sh
 @@ -271,6 +271,18 @@ test_expect_success 'autosquash with custom inst format' '
  	test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
  '
  
 +test_expect_success 'autosquash with empty custom instructionFormat' '
 +	git reset --hard base &&
 +	test_commit empty-instructionFormat-test &&
 +	(
 +		set_cat_todo_editor &&
 +		test_must_fail git -c rebase.instructionFormat= \
 +			rebase --autosquash  --force -i HEAD^ >actual &&
 +		git log -1 --format="pick %h %s" >expect &&
 +		test_cmp expect actual
 +	)
 +'
 +
  set_backup_editor () {
  	write_script backup-editor.sh <<-\EOF
  	cp "$1" .git/backup-"$(basename "$1")"

-- 
2.12.2.windows.2.800.gede8f145e06



* [PATCH v4 01/10] t3415: verify that an empty instructionFormat is handled as before
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
@ 2017-04-28 21:31       ` Johannes Schindelin
  2017-04-28 21:31       ` [PATCH v4 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
                         ` (10 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

An upcoming patch will move the todo list generation into the
rebase--helper. An early version of that patch regressed on an empty
rebase.instructionFormat value (the shell version could not discern
between an empty one and a non-existing one, but the C version used the
empty one as if that was intended to skip the oneline from the `pick
<hash>` lines).

Let's verify that this still works as before.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 48346f1cc0c..6c37ebdff87 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -271,6 +271,18 @@ test_expect_success 'autosquash with custom inst format' '
 	test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
 '
 
+test_expect_success 'autosquash with empty custom instructionFormat' '
+	git reset --hard base &&
+	test_commit empty-instructionFormat-test &&
+	(
+		set_cat_todo_editor &&
+		test_must_fail git -c rebase.instructionFormat= \
+			rebase --autosquash  --force -i HEAD^ >actual &&
+		git log -1 --format="pick %h %s" >expect &&
+		test_cmp expect actual
+	)
+'
+
 set_backup_editor () {
 	write_script backup-editor.sh <<-\EOF
 	cp "$1" .git/backup-"$(basename "$1")"
-- 
2.12.2.windows.2.800.gede8f145e06




* [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
  2017-04-28 21:31       ` [PATCH v4 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
@ 2017-04-28 21:31       ` Johannes Schindelin
  2017-05-26  3:15         ` Liam Beguin
  2017-05-29  6:07         ` Junio C Hamano
  2017-04-28 21:31       ` [PATCH v4 03/10] rebase -i: remove useless indentation Johannes Schindelin
                         ` (9 subsequent siblings)
  11 siblings, 2 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  8 +++++++-
 git-rebase--interactive.sh | 44 +++++++++++++++++++++--------------------
 sequencer.c                | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  3 +++
 4 files changed, 82 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index ca1ebb2fa18..821058d452d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
 	struct replay_opts opts = REPLAY_OPTS_INIT;
+	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 				CONTINUE),
 		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
 				ABORT),
+		OPT_CMDMODE(0, "make-script", &command,
+			N_("make rebase script"), MAKE_SCRIPT),
 		OPT_END()
 	};
 
@@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_continue(&opts);
 	if (command == ABORT && argc == 1)
 		return !!sequencer_remove_state(&opts);
+	if (command == MAKE_SCRIPT && argc > 1)
+		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 2c9c0165b5a..609e150d38f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -785,6 +785,7 @@ collapse_todo_ids() {
 # each log message will be re-retrieved in order to normalize the
 # autosquash arrangement
 rearrange_squash () {
+	format=$(git config --get rebase.instructionFormat)
 	# extract fixup!/squash! lines and resolve any referenced sha1's
 	while read -r pick sha1 message
 	do
@@ -1210,26 +1211,27 @@ else
 	revisions=$onto...$orig_head
 	shortrevisions=$shorthead
 fi
-format=$(git config --get rebase.instructionFormat)
-# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
-git rev-list $merges_option --format="%m%H ${format:-%s}" \
-	--reverse --left-right --topo-order \
-	$revisions ${restrict_revision+^$restrict_revision} | \
-	sed -n "s/^>//p" |
-while read -r sha1 rest
-do
-
-	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
-	then
-		comment_out="$comment_char "
-	else
-		comment_out=
-	fi
+if test t != "$preserve_merges"
+then
+	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
+else
+	format=$(git config --get rebase.instructionFormat)
+	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
+	git rev-list $merges_option --format="%m%H ${format:-%s}" \
+		--reverse --left-right --topo-order \
+		$revisions ${restrict_revision+^$restrict_revision} | \
+		sed -n "s/^>//p" |
+	while read -r sha1 rest
+	do
+
+		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
+		then
+			comment_out="$comment_char "
+		else
+			comment_out=
+		fi
 
-	if test t != "$preserve_merges"
-	then
-		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
-	else
 		if test -z "$rebase_root"
 		then
 			preserve=t
@@ -1248,8 +1250,8 @@ do
 			touch "$rewritten"/$sha1
 			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
 		fi
-	fi
-done
+	done
+fi
 
 # Watch for commits that been dropped by --cherry-pick
 if test t = "$preserve_merges"
diff --git a/sequencer.c b/sequencer.c
index 130cc868e51..88819a1a2a9 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
 
 	strbuf_release(&sob);
 }
+
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv)
+{
+	char *format = NULL;
+	struct pretty_print_context pp = {0};
+	struct strbuf buf = STRBUF_INIT;
+	struct rev_info revs;
+	struct commit *commit;
+
+	init_revisions(&revs, NULL);
+	revs.verbose_header = 1;
+	revs.max_parents = 1;
+	revs.cherry_pick = 1;
+	revs.limited = 1;
+	revs.reverse = 1;
+	revs.right_only = 1;
+	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
+	revs.topo_order = 1;
+
+	revs.pretty_given = 1;
+	git_config_get_string("rebase.instructionFormat", &format);
+	if (!format || !*format) {
+		free(format);
+		format = xstrdup("%s");
+	}
+	get_commit_format(format, &revs);
+	free(format);
+	pp.fmt = revs.commit_format;
+	pp.output_encoding = get_log_output_encoding();
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		return error(_("make_script: unhandled options"));
+
+	if (prepare_revision_walk(&revs) < 0)
+		return error(_("make_script: error preparing revisions"));
+
+	while ((commit = get_revision(&revs))) {
+		strbuf_reset(&buf);
+		if (!keep_empty && is_original_commit_empty(commit))
+			strbuf_addf(&buf, "%c ", comment_line_char);
+		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
+		pretty_print_commit(&pp, commit, &buf);
+		strbuf_addch(&buf, '\n');
+		fputs(buf.buf, out);
+	}
+	strbuf_release(&buf);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index f885b68395f..83f2943b7a9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
 int sequencer_rollback(struct replay_opts *opts);
 int sequencer_remove_state(struct replay_opts *opts);
 
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.800.gede8f145e06




* [PATCH v4 03/10] rebase -i: remove useless indentation
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
  2017-04-28 21:31       ` [PATCH v4 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
  2017-04-28 21:31       ` [PATCH v4 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-04-28 21:31       ` Johannes Schindelin
  2017-05-26  3:15         ` Liam Beguin
  2017-04-28 21:32       ` [PATCH v4 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
                         ` (8 subsequent siblings)
  11 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 609e150d38f..c40b1fd1d2e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -155,13 +155,13 @@ reschedule_last_action () {
 append_todo_help () {
 	gettext "
 Commands:
- p, pick = use commit
- r, reword = use commit, but edit the commit message
- e, edit = use commit, but stop for amending
- s, squash = use commit, but meld into previous commit
- f, fixup = like \"squash\", but discard this commit's log message
- x, exec = run command (the rest of the line) using shell
- d, drop = remove commit
+p, pick = use commit
+r, reword = use commit, but edit the commit message
+e, edit = use commit, but stop for amending
+s, squash = use commit, but meld into previous commit
+f, fixup = like \"squash\", but discard this commit's log message
+x, exec = run command (the rest of the line) using shell
+d, drop = remove commit
 
 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
-- 
2.12.2.windows.2.800.gede8f145e06




* [PATCH v4 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (2 preceding siblings ...)
  2017-04-28 21:31       ` [PATCH v4 03/10] rebase -i: remove useless indentation Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-04-28 21:32       ` [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; in this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index c40b1fd1d2e..214af0372ba 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -760,7 +760,12 @@ transform_todo_ids () {
 			;;
 		*)
 			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			rest="$sha1 ${rest#*[	 ]}"
+			if test "a$rest" = "a${rest#*[	 ]}"
+			then
+				rest=$sha1
+			else
+				rest="$sha1 ${rest#*[	 ]}"
+			fi
 			;;
 		esac
 		printf '%s\n' "$command${rest:+ }$rest"
-- 
2.12.2.windows.2.800.gede8f145e06




* [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (3 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-05-26  3:15         ` Liam Beguin
  2017-04-28 21:32       ` [PATCH v4 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
                         ` (6 subsequent siblings)
  11 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   | 10 +++++++-
 git-rebase--interactive.sh | 27 ++--------------------
 sequencer.c                | 57 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 821058d452d..9444c8d6c60 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 				ABORT),
 		OPT_CMDMODE(0, "make-script", &command,
 			N_("make rebase script"), MAKE_SCRIPT),
+		OPT_CMDMODE(0, "shorten-sha1s", &command,
+			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+		OPT_CMDMODE(0, "expand-sha1s", &command,
+			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_END()
 	};
 
@@ -42,5 +46,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_remove_state(&opts);
 	if (command == MAKE_SCRIPT && argc > 1)
 		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
+	if (command == SHORTEN_SHA1S && argc == 1)
+		return !!transform_todo_ids(1);
+	if (command == EXPAND_SHA1S && argc == 1)
+		return !!transform_todo_ids(0);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 214af0372ba..82a1941c42c 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -750,35 +750,12 @@ skip_unnecessary_picks () {
 		die "$(gettext "Could not skip unnecessary pick commands")"
 }
 
-transform_todo_ids () {
-	while read -r command rest
-	do
-		case "$command" in
-		"$comment_char"* | exec)
-			# Be careful for oddball commands like 'exec'
-			# that do not have a SHA-1 at the beginning of $rest.
-			;;
-		*)
-			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			if test "a$rest" = "a${rest#*[	 ]}"
-			then
-				rest=$sha1
-			else
-				rest="$sha1 ${rest#*[	 ]}"
-			fi
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest"
-	done <"$todo" >"$todo.new" &&
-	mv -f "$todo.new" "$todo"
-}
-
 expand_todo_ids() {
-	transform_todo_ids
+	git rebase--helper --expand-sha1s
 }
 
 collapse_todo_ids() {
-	transform_todo_ids --short
+	git rebase--helper --shorten-sha1s
 }
 
 # Rearrange the todo list that has both "pick sha1 msg" and
diff --git a/sequencer.c b/sequencer.c
index 88819a1a2a9..201d45b1677 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2437,3 +2437,60 @@ int sequencer_make_script(int keep_empty, FILE *out,
 	strbuf_release(&buf);
 	return 0;
 }
+
+
+int transform_todo_ids(int shorten_sha1s)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	int fd, res, i;
+	FILE *out;
+
+	strbuf_reset(&todo_list.buf);
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+
+	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
+	if (res) {
+		todo_list_release(&todo_list);
+		return error(_("unusable instruction sheet: '%s'"), todo_file);
+	}
+
+	out = fopen(todo_file, "w");
+	if (!out) {
+		todo_list_release(&todo_list);
+		return error(_("unable to open '%s' for writing"), todo_file);
+	}
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+		int bol = item->offset_in_buf;
+		const char *p = todo_list.buf.buf + bol;
+		int eol = i + 1 < todo_list.nr ?
+			todo_list.items[i + 1].offset_in_buf :
+			todo_list.buf.len;
+
+		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
+			fwrite(p, eol - bol, 1, out);
+		else {
+			const char *sha1 = shorten_sha1s ?
+				short_commit_name(item->commit) :
+				oid_to_hex(&item->commit->object.oid);
+			int len;
+
+			p += strspn(p, " \t"); /* left-trim command */
+			len = strcspn(p, " \t"); /* length of command */
+
+			fprintf(out, "%.*s %s %.*s\n",
+				len, p, sha1, item->arg_len, item->arg);
+		}
+	}
+	fclose(out);
+	todo_list_release(&todo_list);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 83f2943b7a9..47a81034e76 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
+int transform_todo_ids(int shorten_sha1s);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.12.2.windows.2.800.gede8f145e06
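
The line reassembly in transform_todo_ids() can be tried in isolation
with a sketch like the one below. format_todo_line() and its parameters
are assumptions made for illustration; in the patch, the ID and the
argument come from a parsed struct todo_item and the output goes to a
file rather than a buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: rebuild one todo line as "<command> <id> <arg>",
 * using the same format string as the fprintf() call in the patch.
 */
int format_todo_line(char *out, size_t out_size, const char *line,
		     const char *id, const char *arg, int arg_len)
{
	int len;

	line += strspn(line, " \t");		/* left-trim command */
	len = (int)strcspn(line, " \t");	/* length of command */

	return snprintf(out, out_size, "%.*s %s %.*s\n",
			len, line, id, arg_len, arg);
}
```

With this format string, an empty argument (arg_len of 0) still leaves
the separating space in place, so the line ends in a space before the
newline.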



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v4 06/10] t3404: relax rebase.missingCommitsCheck tests
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (4 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-04-28 21:32       ` [PATCH v4 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

These tests were overly strict about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3404-rebase-interactive.sh | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 33d392ba112..61113be08a4 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1242,20 +1242,13 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' '
 	test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the command isn't recognized in the following line:
- - badcmd $(git rev-list --oneline -1 master~1)
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad command' '
 	rebase_setup_and_clean bad-cmd &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
@@ -1277,20 +1270,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - edit XXXXXXX False commit
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad SHA-1' '
 	rebase_setup_and_clean bad-sha &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "edit XXXXXXX False commit" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
-- 
2.12.2.windows.2.800.gede8f145e06



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v4 07/10] rebase -i: check for missing commits in the rebase--helper
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (5 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-04-28 21:32       ` [PATCH v4 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   7 +-
 git-rebase--interactive.sh | 164 ++-------------------------------------------
 sequencer.c                | 122 +++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 134 insertions(+), 160 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 9444c8d6c60..e706eac710d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+		CHECK_TODO_LIST
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -28,6 +29,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
 		OPT_CMDMODE(0, "expand-sha1s", &command,
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+		OPT_CMDMODE(0, "check-todo-list", &command,
+			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_END()
 	};
 
@@ -50,5 +53,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(1);
 	if (command == EXPAND_SHA1S && argc == 1)
 		return !!transform_todo_ids(0);
+	if (command == CHECK_TODO_LIST && argc == 1)
+		return !!check_todo_list();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 82a1941c42c..08168a0d46b 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -867,96 +867,6 @@ add_exec_commands () {
 	mv "$1.new" "$1"
 }
 
-# Check if the SHA-1 passed as an argument is a
-# correct one, if not then print $2 in "$todo".badsha
-# $1: the SHA-1 to test
-# $2: the line number of the input
-# $3: the input filename
-check_commit_sha () {
-	badsha=0
-	if test -z "$1"
-	then
-		badsha=1
-	else
-		sha1_verif="$(git rev-parse --verify --quiet $1^{commit})"
-		if test -z "$sha1_verif"
-		then
-			badsha=1
-		fi
-	fi
-
-	if test $badsha -ne 0
-	then
-		line="$(sed -n -e "${2}p" "$3")"
-		warn "$(eval_gettext "\
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - \$line")"
-		warn
-	fi
-
-	return $badsha
-}
-
-# prints the bad commits and bad commands
-# from the todolist in stdin
-check_bad_cmd_and_sha () {
-	retval=0
-	lineno=0
-	while read -r command rest
-	do
-		lineno=$(( $lineno + 1 ))
-		case $command in
-		"$comment_char"*|''|noop|x|exec)
-			# Doesn't expect a SHA-1
-			;;
-		"$cr")
-			# Work around CR left by "read" (e.g. with Git for
-			# Windows' Bash).
-			;;
-		pick|p|drop|d|reword|r|edit|e|squash|s|fixup|f)
-			if ! check_commit_sha "${rest%%[ 	]*}" "$lineno" "$1"
-			then
-				retval=1
-			fi
-			;;
-		*)
-			line="$(sed -n -e "${lineno}p" "$1")"
-			warn "$(eval_gettext "\
-Warning: the command isn't recognized in the following line:
- - \$line")"
-			warn
-			retval=1
-			;;
-		esac
-	done <"$1"
-	return $retval
-}
-
-# Print the list of the SHA-1 of the commits
-# from stdin to stdout
-todo_list_to_sha_list () {
-	git stripspace --strip-comments |
-	while read -r command sha1 rest
-	do
-		case $command in
-		"$comment_char"*|''|noop|x|"exec")
-			;;
-		*)
-			long_sha=$(git rev-list --no-walk "$sha1" 2>/dev/null)
-			printf "%s\n" "$long_sha"
-			;;
-		esac
-	done
-}
-
-# Use warn for each line in stdin
-warn_lines () {
-	while read -r line
-	do
-		warn " - $line"
-	done
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
 	GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -971,74 +881,6 @@ get_missing_commit_check_level () {
 	printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# Check if the user dropped some commits by mistake
-# Behaviour determined by rebase.missingCommitsCheck.
-# Check if there is an unrecognized command or a
-# bad SHA-1 in a command.
-check_todo_list () {
-	raise_error=f
-
-	check_level=$(get_missing_commit_check_level)
-
-	case "$check_level" in
-	warn|error)
-		# Get the SHA-1 of the commits
-		todo_list_to_sha_list <"$todo".backup >"$todo".oldsha1
-		todo_list_to_sha_list <"$todo" >"$todo".newsha1
-
-		# Sort the SHA-1 and compare them
-		sort -u "$todo".oldsha1 >"$todo".oldsha1+
-		mv "$todo".oldsha1+ "$todo".oldsha1
-		sort -u "$todo".newsha1 >"$todo".newsha1+
-		mv "$todo".newsha1+ "$todo".newsha1
-		comm -2 -3 "$todo".oldsha1 "$todo".newsha1 >"$todo".miss
-
-		# Warn about missing commits
-		if test -s "$todo".miss
-		then
-			test "$check_level" = error && raise_error=t
-
-			warn "$(gettext "\
-Warning: some commits may have been dropped accidentally.
-Dropped commits (newer to older):")"
-
-			# Make the list user-friendly and display
-			opt="--no-walk=sorted --format=oneline --abbrev-commit --stdin"
-			git rev-list $opt <"$todo".miss | warn_lines
-
-			warn "$(gettext "\
-To avoid this message, use \"drop\" to explicitly remove a commit.
-
-Use 'git config rebase.missingCommitsCheck' to change the level of warnings.
-The possible behaviours are: ignore, warn, error.")"
-			warn
-		fi
-		;;
-	ignore)
-		;;
-	*)
-		warn "$(eval_gettext "Unrecognized setting \$check_level for option rebase.missingCommitsCheck. Ignoring.")"
-		;;
-	esac
-
-	if ! check_bad_cmd_and_sha "$todo"
-	then
-		raise_error=t
-	fi
-
-	if test $raise_error = t
-	then
-		# Checkout before the first commit of the
-		# rebase: this way git rebase --continue
-		# will work correctly as it expects HEAD to be
-		# placed before the commit of the next action
-		checkout_onto
-
-		warn "$(gettext "You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.")"
-		die "$(gettext "Or you can abort the rebase with 'git rebase --abort'.")"
-	fi
-}
-
 # The whole contents of this file is run by dot-sourcing it from
 # inside a shell function.  It used to be that "return"s we see
 # below were not inside any function, and expected to return
@@ -1299,7 +1141,11 @@ git_sequence_editor "$todo" ||
 has_action "$todo" ||
 	return 2
 
-check_todo_list
+git rebase--helper --check-todo-list || {
+	ret=$?
+	checkout_onto
+	exit $ret
+}
 
 expand_todo_ids
 
diff --git a/sequencer.c b/sequencer.c
index 201d45b1677..4535ba9d12f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2494,3 +2494,125 @@ int transform_todo_ids(int shorten_sha1s)
 	todo_list_release(&todo_list);
 	return 0;
 }
+
+enum check_level {
+	CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
+};
+
+static enum check_level get_missing_commit_check_level(void)
+{
+	const char *value;
+
+	if (git_config_get_value("rebase.missingcommitscheck", &value) ||
+			!strcasecmp("ignore", value))
+		return CHECK_IGNORE;
+	if (!strcasecmp("warn", value))
+		return CHECK_WARN;
+	if (!strcasecmp("error", value))
+		return CHECK_ERROR;
+	warning(_("unrecognized setting %s for option "
+		  "rebase.missingCommitsCheck. Ignoring."), value);
+	return CHECK_IGNORE;
+}
+
+/*
+ * Check if the user dropped some commits by mistake
+ * Behaviour determined by rebase.missingCommitsCheck.
+ * Check if there is an unrecognized command or a
+ * bad SHA-1 in a command.
+ */
+int check_todo_list(void)
+{
+	enum check_level check_level = get_missing_commit_check_level();
+	struct strbuf todo_file = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct strbuf missing = STRBUF_INIT;
+	int advise_to_edit_todo = 0, res = 0, fd, i;
+
+	strbuf_addstr(&todo_file, rebase_path_todo());
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	advise_to_edit_todo = res =
+		parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	if (res || check_level == CHECK_IGNORE)
+		goto leave_check;
+
+	/* Mark the commits in git-rebase-todo as seen */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit)
+			commit->util = (void *)1;
+	}
+
+	todo_list_release(&todo_list);
+	strbuf_addstr(&todo_file, ".backup");
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	strbuf_release(&todo_file);
+	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	/* Find commits in git-rebase-todo.backup yet unseen */
+	for (i = todo_list.nr - 1; i >= 0; i--) {
+		struct todo_item *item = todo_list.items + i;
+		struct commit *commit = item->commit;
+		if (commit && !commit->util) {
+			strbuf_addf(&missing, " - %s %.*s\n",
+				    short_commit_name(commit),
+				    item->arg_len, item->arg);
+			commit->util = (void *)1;
+		}
+	}
+
+	/* Warn about missing commits */
+	if (!missing.len)
+		goto leave_check;
+
+	if (check_level == CHECK_ERROR)
+		advise_to_edit_todo = res = 1;
+
+	fprintf(stderr,
+		_("Warning: some commits may have been dropped accidentally.\n"
+		"Dropped commits (newer to older):\n"));
+
+	/* Make the list user-friendly and display */
+	fputs(missing.buf, stderr);
+	strbuf_release(&missing);
+
+	fprintf(stderr, _("To avoid this message, use \"drop\" to "
+		"explicitly remove a commit.\n\n"
+		"Use 'git config rebase.missingCommitsCheck' to change "
+		"the level of warnings.\n"
+		"The possible behaviours are: ignore, warn, error.\n\n"));
+
+leave_check:
+	strbuf_release(&todo_file);
+	todo_list_release(&todo_list);
+
+	if (advise_to_edit_todo)
+		fprintf(stderr,
+			_("You can fix this with 'git rebase --edit-todo' "
+			  "and then run 'git rebase --continue'.\n"
+			  "Or you can abort the rebase with 'git rebase"
+			  " --abort'.\n"));
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 47a81034e76..4978a61b83b 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -49,6 +49,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
 int transform_todo_ids(int shorten_sha1s);
+int check_todo_list(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.800.gede8f145e06
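
The detection of dropped commits in check_todo_list() amounts to "mark
what the edited list contains, then walk the backup list from the end
and report whatever is unmarked". Below is a minimal sketch of that
idea, assuming plain strings in place of commits; find_dropped() is a
hypothetical name, and the nested loops make the sketch quadratic,
whereas the patch marks struct commit objects via commit->util.

```c
#include <string.h>

/*
 * Illustrative only: report entries of the backup list that no longer
 * appear in the edited list, newest first. The caller provides a
 * "dropped" array with room for backup_nr entries.
 */
int find_dropped(const char **backup, int backup_nr,
		 const char **edited, int edited_nr,
		 const char **dropped)
{
	int i, j, k, nr = 0;

	for (i = backup_nr - 1; i >= 0; i--) {
		int seen = 0;

		for (j = 0; j < edited_nr && !seen; j++)
			seen = !strcmp(backup[i], edited[j]);
		/* also skip IDs that were already reported */
		for (k = 0; k < nr && !seen; k++)
			seen = !strcmp(backup[i], dropped[k]);
		if (!seen)
			dropped[nr++] = backup[i];
	}
	return nr;
}
```

Walking the backup list backwards is what produces the "newer to older"
order announced in the warning.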



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v4 08/10] rebase -i: skip unnecessary picks using the rebase--helper
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (6 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-04-28 21:32       ` [PATCH v4 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   6 ++-
 git-rebase--interactive.sh |  41 ++---------------
 sequencer.c                | 107 +++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 116 insertions(+), 39 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index e706eac710d..de3ccd9bfbc 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -31,6 +31,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_CMDMODE(0, "check-todo-list", &command,
 			N_("check the todo list"), CHECK_TODO_LIST),
+		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
+			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
 		OPT_END()
 	};
 
@@ -55,5 +57,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(0);
 	if (command == CHECK_TODO_LIST && argc == 1)
 		return !!check_todo_list();
+	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
+		return !!skip_unnecessary_picks();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 08168a0d46b..92e3ca1de3b 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -713,43 +713,6 @@ do_rest () {
 	done
 }
 
-# skip picking commits whose parents are unchanged
-skip_unnecessary_picks () {
-	fd=3
-	while read -r command rest
-	do
-		# fd=3 means we skip the command
-		case "$fd,$command" in
-		3,pick|3,p)
-			# pick a commit whose parent is current $onto -> skip
-			sha1=${rest%% *}
-			case "$(git rev-parse --verify --quiet "$sha1"^)" in
-			"$onto"*)
-				onto=$sha1
-				;;
-			*)
-				fd=1
-				;;
-			esac
-			;;
-		3,"$comment_char"*|3,)
-			# copy comments
-			;;
-		*)
-			fd=1
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest" >&$fd
-	done <"$todo" >"$todo.new" 3>>"$done" &&
-	mv -f "$todo".new "$todo" &&
-	case "$(peek_next_command)" in
-	squash|s|fixup|f)
-		record_in_rewritten "$onto"
-		;;
-	esac ||
-		die "$(gettext "Could not skip unnecessary pick commands")"
-}
-
 expand_todo_ids() {
 	git rebase--helper --expand-sha1s
 }
@@ -1149,7 +1112,9 @@ git rebase--helper --check-todo-list || {
 
 expand_todo_ids
 
-test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
+test -d "$rewritten" || test -n "$force_rebase" ||
+onto="$(git rebase--helper --skip-unnecessary-picks)" ||
+die "Could not skip unnecessary pick commands"
 
 checkout_onto
 if test -z "$rebase_root" && test ! -d "$rewritten"
diff --git a/sequencer.c b/sequencer.c
index 4535ba9d12f..72e3ad8d145 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2616,3 +2616,110 @@ int check_todo_list(void)
 
 	return res;
 }
+
+/* skip picking commits whose parents are unchanged */
+int skip_unnecessary_picks(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct strbuf buf = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct object_id onto_oid, *oid = &onto_oid, *parent_oid;
+	int fd, i;
+
+	if (!read_oneliner(&buf, rebase_path_onto(), 0))
+		return error(_("could not read 'onto'"));
+	if (get_sha1(buf.buf, onto_oid.hash)) {
+		strbuf_release(&buf);
+		return error(_("need a HEAD to fixup"));
+	}
+	strbuf_release(&buf);
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0) {
+		return error_errno(_("could not open '%s'"), todo_file);
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+
+		if (item->command >= TODO_NOOP)
+			continue;
+		if (item->command != TODO_PICK)
+			break;
+		if (parse_commit(item->commit)) {
+			todo_list_release(&todo_list);
+			return error(_("could not parse commit '%s'"),
+				oid_to_hex(&item->commit->object.oid));
+		}
+		if (!item->commit->parents)
+			break; /* root commit */
+		if (item->commit->parents->next)
+			break; /* merge commit */
+		parent_oid = &item->commit->parents->item->object.oid;
+		if (hashcmp(parent_oid->hash, oid->hash))
+			break;
+		oid = &item->commit->object.oid;
+	}
+	if (i > 0) {
+		int offset = i < todo_list.nr ?
+			todo_list.items[i].offset_in_buf : todo_list.buf.len;
+		const char *done_path = rebase_path_done();
+
+		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    done_path);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
+			error_errno(_("could not write to '%s'"), done_path);
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		fd = open(rebase_path_todo(), O_WRONLY, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf + offset,
+				todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not write to '%s'"),
+				    rebase_path_todo());
+			close(fd);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not truncate '%s'"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		todo_list.current = i;
+		if (is_fixup(peek_command(&todo_list, 0)))
+			record_in_rewritten(oid, peek_command(&todo_list, 0));
+	}
+
+	todo_list_release(&todo_list);
+	printf("%s\n", oid_to_hex(oid));
+
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 4978a61b83b..28e1fc1e9bb 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -50,6 +50,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
+int skip_unnecessary_picks(void);
 
 extern const char sign_off_header[];
 
-- 
2.12.2.windows.2.800.gede8f145e06
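
The core loop of skip_unnecessary_picks() can be sketched as follows.
The struct and the function name are assumptions made for illustration,
with strings standing in for object IDs; the sketch also leaves out the
handling of noop and comment lines, which the patch skips over instead
of stopping at them.

```c
#include <string.h>

/* Hypothetical stand-in for a parsed todo item; not Git's struct. */
struct pick {
	int is_pick;		/* 1 for "pick", 0 for any other command */
	const char *id;		/* the commit to pick */
	const char *parent;	/* its sole parent, or NULL for root/merge */
};

/*
 * Count the leading picks that can be skipped; *onto is advanced
 * to the last skipped commit.
 */
int count_skippable(const struct pick *items, int nr, const char **onto)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (!items[i].is_pick || !items[i].parent)
			break;
		if (strcmp(items[i].parent, *onto))
			break;	/* history diverges here */
		*onto = items[i].id;
	}
	return i;
}
```

A pick is skippable only as long as its parent is the current onto, so
the loop stops at the first item that would actually change history.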



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v4 09/10] t3415: test fixup with wrapped oneline
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (7 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2017-04-28 21:32       ` Johannes Schindelin
  2017-04-28 21:33       ` [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 6c37ebdff87..926bb3da788 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -316,4 +316,18 @@ test_expect_success 'extra spaces after fixup!' '
 	test $base = $parent
 '
 
+test_expect_success 'wrapped original subject' '
+	if test -d .git/rebase-merge; then git rebase --abort; fi &&
+	base=$(git rev-parse HEAD) &&
+	echo "wrapped subject" >wrapped &&
+	git add wrapped &&
+	test_tick &&
+	git commit --allow-empty -m "$(printf "To\nfixup")" &&
+	test_tick &&
+	git commit --allow-empty -m "fixup! To fixup" &&
+	git rebase -i --autosquash --keep-empty HEAD~2 &&
+	parent=$(git rev-parse HEAD^) &&
+	test $base = $parent
+'
+
 test_done
-- 
2.12.2.windows.2.800.gede8f145e06



^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (8 preceding siblings ...)
  2017-04-28 21:32       ` [PATCH v4 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
@ 2017-04-28 21:33       ` Johannes Schindelin
  2017-05-26  3:16         ` Liam Beguin
  2017-05-26  3:15       ` [PATCH v4 00/10] The final building block for a faster rebase -i Liam Beguin
  2017-05-29  8:30       ` Junio C Hamano
  11 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-04-28 21:33 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood

This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case. Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.
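
To make the matching rules concrete, here is a minimal sketch in shell.
It is not the code from this patch: `strip_markers` and `find_target`
are hypothetical names, the lookup by commit hash is left out, and a
plain loop stands in for the hash map. It only illustrates the order of
the checks: an exact subject match wins, and a prefix match is the
fall-back.

```shell
#!/bin/sh
# Illustrative only: strip_markers and find_target are hypothetical
# helpers, not functions from git-rebase--interactive.sh or sequencer.c.

# Strip any number of leading "fixup! " / "squash! " markers from $1.
strip_markers () {
	rest=$1
	while :
	do
		case "$rest" in
		"fixup! "*|"squash! "*) rest=${rest#*! } ;;
		*) break ;;
		esac
	done
	printf '%s\n' "$rest"
}

# Read earlier commit subjects (one per line) from stdin and print the
# one that $1 refers to. An exact match wins; otherwise the first
# subject that merely starts with the text is used as a fall-back.
find_target () {
	needle=$(strip_markers "$1")
	fallback=
	while IFS= read -r subject
	do
		if test "$subject" = "$needle"
		then
			printf '%s\n' "$subject"
			return 0
		fi
		case "$subject" in
		"$needle"*) test -n "$fallback" || fallback=$subject ;;
		esac
	done
	# Fails (non-zero exit) when nothing matched at all.
	test -n "$fallback" && printf '%s\n' "$fallback"
}
```

With the subjects `hello world` and `hell` on standard input,
`find_target 'fixup! hell'` prints `hell`; without the exact match, it
falls back to `hello world`.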

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

While at it, clarify how the fixup/squash feature works in `git rebase
-i`'s man page.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-rebase.txt |  16 ++--
 builtin/rebase--helper.c     |   6 +-
 git-rebase--interactive.sh   |  90 +-------------------
 sequencer.c                  | 195 +++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 6 files changed, 212 insertions(+), 98 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 53f4e144444..c5464aa5365 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -430,13 +430,15 @@ without an explicit `--interactive`.
 --autosquash::
 --no-autosquash::
 	When the commit log message begins with "squash! ..." (or
-	"fixup! ..."), and there is a commit whose title begins with
-	the same ..., automatically modify the todo list of rebase -i
-	so that the commit marked for squashing comes right after the
-	commit to be modified, and change the action of the moved
-	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
-	"fixup! " or "squash! " after the first, in case you referred to an
-	earlier fixup/squash with `git commit --fixup/--squash`.
+	"fixup! ..."), and there is already a commit in the todo list that
+	matches the same `...`, automatically modify the todo list of rebase
+	-i so that the commit marked for squashing comes right after the
+	commit to be modified, and change the action of the moved commit
+	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
+	the commit subject matches, or if the `...` refers to the commit's
+	hash. As a fall-back, partial matches of the commit subject work,
+	too.  The recommended way to create fixup/squash commits is by using
+	the `--fixup`/`--squash` options of linkgit:git-commit[1].
 +
 This option is only valid when the `--interactive` option is used.
 +
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index de3ccd9bfbc..e6591f01112 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
 			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+		OPT_CMDMODE(0, "rearrange-squash", &command,
+			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
 		OPT_END()
 	};
 
@@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!check_todo_list();
 	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
 		return !!skip_unnecessary_picks();
+	if (command == REARRANGE_SQUASH && argc == 1)
+		return !!rearrange_squash();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 92e3ca1de3b..84c6e62518f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -721,94 +721,6 @@ collapse_todo_ids() {
 	git rebase--helper --shorten-sha1s
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-	format=$(git config --get rebase.instructionFormat)
-	# extract fixup!/squash! lines and resolve any referenced sha1's
-	while read -r pick sha1 message
-	do
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		case "$message" in
-		"squash! "*|"fixup! "*)
-			action="${message%%!*}"
-			rest=$message
-			prefix=
-			# skip all squash! or fixup! (but save for later)
-			while :
-			do
-				case "$rest" in
-				"squash! "*|"fixup! "*)
-					prefix="$prefix${rest%%!*},"
-					rest="${rest#*! }"
-					;;
-				*)
-					break
-					;;
-				esac
-			done
-			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
-			# if it's a single word, try to resolve to a full sha1 and
-			# emit a second copy. This allows us to match on both message
-			# and on sha1 prefix
-			if test "${rest#* }" = "$rest"; then
-				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
-				if test -n "$fullsha"; then
-					# prefix the action to uniquely identify this line as
-					# intended for full sha1 match
-					echo "$sha1 +$action $prefix $fullsha"
-				fi
-			fi
-		esac
-	done >"$1.sq" <"$1"
-	test -s "$1.sq" || return
-
-	used=
-	while read -r pick sha1 message
-	do
-		case " $used" in
-		*" $sha1 "*) continue ;;
-		esac
-		printf '%s\n' "$pick $sha1 $message"
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		used="$used$sha1 "
-		while read -r squash action msg_prefix msg_content
-		do
-			case " $used" in
-			*" $squash "*) continue ;;
-			esac
-			emit=0
-			case "$action" in
-			+*)
-				action="${action#+}"
-				# full sha1 prefix test
-				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
-			*)
-				# message prefix test
-				case "$message" in "$msg_content"*) emit=1;; esac ;;
-			esac
-			if test $emit = 1; then
-				if test -n "${format}"
-				then
-					msg_content=$(git log -n 1 --format="${format}" ${squash})
-				else
-					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
-				fi
-				printf '%s\n' "$action $squash $msg_content"
-				used="$used$squash "
-			fi
-		done <"$1.sq"
-	done >"$1.rearranged" <"$1"
-	cat "$1.rearranged" >"$1"
-	rm -f "$1.sq" "$1.rearranged"
-}
-
 # Add commands after a pick or after a squash/fixup serie
 # in the todo list.
 add_exec_commands () {
@@ -1068,7 +980,7 @@ then
 fi
 
 test -s "$todo" || echo noop >> "$todo"
-test -n "$autosquash" && rearrange_squash "$todo"
+test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
 test -n "$cmd" && add_exec_commands "$todo"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
diff --git a/sequencer.c b/sequencer.c
index 72e3ad8d145..63a588f0916 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -19,6 +19,7 @@
 #include "trailer.h"
 #include "log-tree.h"
 #include "wt-status.h"
+#include "hashmap.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2723,3 +2724,197 @@ int skip_unnecessary_picks(void)
 
 	return 0;
 }
+
+struct subject2item_entry {
+	struct hashmap_entry entry;
+	int i;
+	char subject[FLEX_ARRAY];
+};
+
+static int subject2item_cmp(const struct subject2item_entry *a,
+	const struct subject2item_entry *b, const void *key)
+{
+	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
+}
+
+/*
+ * Rearrange the todo list that has both "pick sha1 msg" and "pick sha1
+ * fixup!/squash! msg" in it so that the latter is put immediately after the
+ * former, and change "pick" to "fixup"/"squash".
+ *
+ * Note that if the config has specified a custom instruction format, each log
+ * message will have to be retrieved from the commit (as the oneline in the
+ * script cannot be trusted) in order to normalize the autosquash arrangement.
+ */
+int rearrange_squash(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct hashmap subject2item;
+	int res = 0, rearranged = 0, *next, *tail, fd, i;
+	char **subjects;
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	/*
+	 * The hashmap maps onelines to the respective todo list index.
+	 *
+	 * If any items need to be rearranged, the next[i] value will indicate
+	 * which item was moved directly after the i'th.
+	 *
+	 * In that case, tail[i] will indicate the index of the latest item to
+	 * be moved to appear after the i'th.
+	 */
+	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
+		     todo_list.nr);
+	ALLOC_ARRAY(next, todo_list.nr);
+	ALLOC_ARRAY(tail, todo_list.nr);
+	ALLOC_ARRAY(subjects, todo_list.nr);
+	for (i = 0; i < todo_list.nr; i++) {
+		struct strbuf buf = STRBUF_INIT;
+		struct todo_item *item = todo_list.items + i;
+		const char *commit_buffer, *subject, *p;
+		size_t subject_len;
+		int i2 = -1;
+		struct subject2item_entry *entry;
+
+		next[i] = tail[i] = -1;
+		if (item->command >= TODO_EXEC) {
+			subjects[i] = NULL;
+			continue;
+		}
+
+		if (is_fixup(item->command)) {
+			todo_list_release(&todo_list);
+			return error(_("the script was already rearranged."));
+		}
+
+		item->commit->util = item;
+
+		parse_commit(item->commit);
+		commit_buffer = get_commit_buffer(item->commit, NULL);
+		find_commit_subject(commit_buffer, &subject);
+		format_subject(&buf, subject, " ");
+		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
+		unuse_commit_buffer(item->commit, commit_buffer);
+		if ((skip_prefix(subject, "fixup! ", &p) ||
+		     skip_prefix(subject, "squash! ", &p))) {
+			struct commit *commit2;
+
+			for (;;) {
+				while (isspace(*p))
+					p++;
+				if (!skip_prefix(p, "fixup! ", &p) &&
+				    !skip_prefix(p, "squash! ", &p))
+					break;
+			}
+
+			if ((entry = hashmap_get_from_hash(&subject2item,
+							   strhash(p), p)))
+				/* found by title */
+				i2 = entry->i;
+			else if (!strchr(p, ' ') &&
+				 (commit2 =
+				  lookup_commit_reference_by_name(p)) &&
+				 commit2->util)
+				/* found by commit name */
+				i2 = (struct todo_item *)commit2->util
+					- todo_list.items;
+			else {
+				/* copy can be a prefix of the commit subject */
+				for (i2 = 0; i2 < i; i2++)
+					if (subjects[i2] &&
+					    starts_with(subjects[i2], p))
+						break;
+				if (i2 == i)
+					i2 = -1;
+			}
+		}
+		if (i2 >= 0) {
+			rearranged = 1;
+			todo_list.items[i].command =
+				starts_with(subject, "fixup!") ?
+				TODO_FIXUP : TODO_SQUASH;
+			if (next[i2] < 0)
+				next[i2] = i;
+			else
+				next[tail[i2]] = i;
+			tail[i2] = i;
+		} else if (!hashmap_get_from_hash(&subject2item,
+						strhash(subject), subject)) {
+			FLEX_ALLOC_MEM(entry, subject, subject, subject_len);
+			entry->i = i;
+			hashmap_entry_init(entry, strhash(entry->subject));
+			hashmap_put(&subject2item, entry);
+		}
+	}
+
+	if (rearranged) {
+		struct strbuf buf = STRBUF_INIT;
+
+		for (i = 0; i < todo_list.nr; i++) {
+			enum todo_command command = todo_list.items[i].command;
+			int cur = i;
+
+			/*
+			 * Initially, all commands are 'pick's. If it is a
+			 * fixup or a squash now, we have rearranged it.
+			 */
+			if (is_fixup(command))
+				continue;
+
+			while (cur >= 0) {
+				int offset = todo_list.items[cur].offset_in_buf;
+				int end_offset = cur + 1 < todo_list.nr ?
+					todo_list.items[cur + 1].offset_in_buf :
+					todo_list.buf.len;
+				char *bol = todo_list.buf.buf + offset;
+				char *eol = todo_list.buf.buf + end_offset;
+
+				/* replace 'pick', by 'fixup' or 'squash' */
+				command = todo_list.items[cur].command;
+				if (is_fixup(command)) {
+					strbuf_addstr(&buf,
+						todo_command_info[command].str);
+					bol += strcspn(bol, " \t");
+				}
+
+				strbuf_add(&buf, bol, eol - bol);
+
+				cur = next[cur];
+			}
+		}
+
+		fd = open(todo_file, O_WRONLY);
+		if (fd < 0)
+			res = error_errno(_("could not open '%s'"), todo_file);
+		else if (write(fd, buf.buf, buf.len) < 0)
+			res = error_errno(_("could not write '%s'."), todo_file);
+		else if (ftruncate(fd, buf.len) < 0)
+			res = error_errno(_("could not finish '%s'"),
+					   todo_file);
+		close(fd);
+		strbuf_release(&buf);
+	}
+
+	free(next);
+	free(tail);
+	for (i = 0; i < todo_list.nr; i++)
+		free(subjects[i]);
+	free(subjects);
+	hashmap_free(&subject2item, 1);
+	todo_list_release(&todo_list);
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 28e1fc1e9bb..1c94bec7622 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -51,6 +51,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 int transform_todo_ids(int shorten_sha1s);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
+int rearrange_squash(void);
 
 extern const char sign_off_header[];
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 926bb3da788..2f88f50c057 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -290,7 +290,7 @@ set_backup_editor () {
 	test_set_editor "$PWD/backup-editor.sh"
 }
 
-test_expect_failure 'autosquash with multiple empty patches' '
+test_expect_success 'autosquash with multiple empty patches' '
 	test_tick &&
 	git commit --allow-empty -m "empty" &&
 	test_tick &&
-- 
2.12.2.windows.2.800.gede8f145e06

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28 10:08       ` Phillip Wood
  2017-04-28 19:22         ` Johannes Schindelin
@ 2017-05-01  0:49         ` Junio C Hamano
  2017-05-01 11:06           ` Johannes Schindelin
  1 sibling, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-05-01  0:49 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Johannes Schindelin, git, Philip Oakley, Jeff King

Phillip Wood <phillip.wood@talktalk.net> writes:

> This changes the behaviour of
> git -c rebase.instructionFormat= rebase -i
> The shell version treats the rebase.instructionFormat being unset or set
> to the empty string as equivalent. This version generates a todo list
> with lines like 'pick <abbrev sha1>' rather than 'pick <abbrev sha1>
> <subject>'
>
> I only picked this up because I have a script that does 'git -c
> rebase.instructionFormat= rebase -i' with a custom sequence editor.

Sorry to hear that.  As there is no way to unset a configuration
variable from the command line, "git -c var=" like you did above is
the best we can do, and that why treating unset and empty variable
the same way is often necessary.  It seems that Dscho gave an ack to
your message, so hopefully the final version would not have such a
regression.
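
The "unset or empty means default" rule can be sketched in shell with
the `${var:-default}` expansion. `effective_format` is a hypothetical
helper used only for illustration, and the `%s` default is an assumption
based on the 'pick <abbrev sha1> <subject>' lines mentioned above:

```shell
#!/bin/sh
# Hypothetical helper, not taken from Git: choose the todo-line format,
# treating an empty setting exactly like an unset one.
effective_format () {
	# ${1:-default} substitutes the default when $1 is unset OR empty;
	# ${1-default} would substitute it only when $1 is unset.
	printf '%s\n' "${1:-%s}"
}

effective_format            # nothing configured
effective_format ""         # as with "git -c rebase.instructionFormat="
effective_format "%an: %s"  # explicit custom format
```

The first two calls print `%s` and the third prints `%an: %s`, so a
caller that passes an empty value on the command line sees the same
result as one that never set the variable.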

Thanks for an early warning.


^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28 15:13             ` Johannes Schindelin
@ 2017-05-01  3:11               ` Junio C Hamano
  2017-05-01 11:47                 ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-05-01  3:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> In that case, I would strongly advise to consider redesigning the API.

The API we currently have and is used by "log", "rev-list" and
friends is to have setup_revisions() parse the av[], i.e. the
textual API, and it is sufficient to hide from the caller the
implementation detail of what bit rev_info structure has and which
bits are flipped when reacting to say "--format=..."  option [*1*].

As the implementation detail of which bits are flipped when reacting
to each option is _not_ the API, we are essentially left with two
choices: write this series to the current textual API, or invent an
alternate API [*2*] and write this series to that new API.

Besides, the original was already using the textual interface to
set-up the revision traversal machinery (after all, it was a shell
script that invoked rev-list), and the series attempts a faithful
rewrite of it in C; writing to the current textual API is a
future-proof way to do so, and something you can do without waiting
for a new API to materialize (that is, assuming that we need an
alternate API, favoured over the current textual API).


[Footnote]

*1* You'll notice that there already are (and were in 2010) users
    that cheated and peeked into the implementation detail, judging
    by the unnecessary places that the patch that added the
    pretty_given bit had to touch; some of those places probably
    would not have needed touching if they were writing to the API
    and had their av[] parsed.

*2* Quite honestly, I do not get how much you would gain dumping the
    current API.  It uses the same codepath "git rev-list" and
    friends use to parse the requests by the end-users and scripts,
    guaranteeing that it will stay stable, unlike the underlying
    implementation that may and will change.  And the set-up of the
    machinery is not even the expensive part anyway.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-04-28 19:22         ` Johannes Schindelin
@ 2017-05-01 10:06           ` Phillip Wood
  2017-05-01 11:58             ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Phillip Wood @ 2017-05-01 10:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Philip Oakley, Jeff King

On 28/04/17 20:22, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Fri, 28 Apr 2017, Phillip Wood wrote:
> 
>> On 26/04/17 12:59, Johannes Schindelin wrote:
>>
>>> The first step of an interactive rebase is to generate the so-called
>>> "todo script", to be stored in the state directory as
>>> "git-rebase-todo" and to be edited by the user.
>>>
>>> Originally, we adjusted the output of `git log <options>` using a
>>> simple sed script. Over the course of the years, the code became more
>>> complicated. We now use shell scripting to edit the output of `git
> >>> log` conditionally, depending on whether to keep "empty" commits (i.e.
>>> commits that do not change any files).
>>>
>>> On platforms where shell scripting is not native, this can be a
>>> serious drag. And it opens the door for incompatibilities between
>>> platforms when it comes to shell scripting or to Unix-y commands.
>>>
>>> Let's just re-implement the todo script generation in plain C, using
>>> the revision machinery directly.
>>>
>>> This is substantially faster, improving the speed relative to the
>>> shell script version of the interactive rebase from 2x to 3x on
>>> Windows.
>>
>> This changes the behaviour of git -c rebase.instructionFormat= rebase -i
>> The shell version treats the rebase.instructionFormat being unset or set
>> to the empty string as equivalent. This version generates a todo list
>> with lines like 'pick <abbrev sha1>' rather than 'pick <abbrev sha1>
>> <subject>'
>>
>> I only picked this up because I have a script that does 'git -c
>> rebase.instructionFormat= rebase -i' with a custom sequence editor. I
>> can easily add '%s' in the appropriate place but I thought I'd point it
>> out in case other people are affected by the change.
> 
> While I would argue that the C version is more correct, it would be
> backwards-incompatible.

I was going to make a point about resetting config variables to their
default value on the command line but Junio beat me to it.

> So I changed it.

That's great, thanks

> BTW in the future you could help me a *lot* by providing a patch that adds
> a test case to our test suite that not only demonstrates what exactly goes
> wrong, but also will help prevent future regressions.

I'll bear that in mind, it does assume that reporters have a good
understanding of the test suite layout and helper functions though. Is
there a particular reason you put the test case in the autosquash tests?
I wouldn't have thought of doing that.

Thanks again

Phillip


^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-05-01  0:49         ` Junio C Hamano
@ 2017-05-01 11:06           ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-01 11:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, git, Philip Oakley, Jeff King

Hi Junio,

On Sun, 30 Apr 2017, Junio C Hamano wrote:

> Phillip Wood <phillip.wood@talktalk.net> writes:
> 
> > This changes the behaviour of
> > git -c rebase.instructionFormat= rebase -i
> > The shell version treats the rebase.instructionFormat being unset or set
> > to the empty string as equivalent. This version generates a todo list
> > with lines like 'pick <abbrev sha1>' rather than 'pick <abbrev sha1>
> > <subject>'
> >
> > I only picked this up because I have a script that does 'git -c
> > rebase.instructionFormat= rebase -i' with a custom sequence editor.
> 
> Sorry to hear that.  As there is no way to unset a configuration
> variable from the command line, "git -c var=" like you did above is
> the best we can do, and that is why treating unset and empty variables
> the same way is often necessary.  It seems that Dscho gave an ack to
> your message, so hopefully the final version would not have such a
> regression.

As I mentioned in the cover letter of v4, unless you take v3 for the final
version, the final version won't have such a regression:

> - the todo list is again generated in the same way as
>   before when rebase.instructionFormat is empty: it was interpreted as
>   if it had not been set

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-05-01  3:11               ` Junio C Hamano
@ 2017-05-01 11:47                 ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-01 11:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King

Hi Junio,

On Sun, 30 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > In that case, I would strongly advise to consider redesigning the API.
> 
> The API we currently have and is used by "log", "rev-list" and friends
> is to have setup_revisions() parse the av[], i.e. the textual API, and
> it is sufficient to hide from the caller the implementation detail of
> what bit rev_info structure has and which bits are flipped when reacting
> to say "--format=..."  option [*1*].

Yes, this (parsing options passed in as strings, with the very real
possibility of catching coding errors only at runtime) is the current way
the API is used.

Sometimes.

And sometimes not.

For example, in builtin/bisect.c's show_diff_tree() function, we *do* call
setup_revisions(), but with argc = 0. We set all the options beforehand to
avoid parsing, to avoid runtime-instead-of-compile-time errors.

The same holds true for builtin/merge.c's squash_message() function.

> As the implementation detail of which bits are flipped when reacting to
> each option is _not_ the API, we are essentially left with two choices:
> write this series to the current textual API, or invent an alternate API
> [*2*] and write this series to that new API.

You make it sound as if my goal was to imitate slavishly what that option
did that was passed to the Git command in the git-rebase--interactive.sh
script.

But that is not what my goal is.

My goal is to imitate the *intent* of the shell script. Faithfully, of
course. The fact that the shell script had no better way to access the
internal API than to call the Git command is just a red herring.

What I really want to do *is* to access the revision machinery, as bare
metal as possible, because sequencer *is bare metal, too*.

The current code fulfills that goal rather excellently.

Your suggested change to call the parser and pass plain options as plain
text really flies in the face of this goal.

Your suggested alternative is actually not necessary here, as the code
does exactly what it is supposed to do: it calls, from the internal
libgit.a, another part of libgit.a, and therefore it is totally legitimate
to use implementation details.

If my code were bleeding implementation details to the user interface, I
would agree with you that there is an issue.

But that code does not do that. To the contrary, it hides those
implementation details behind a rather simple user interface.

In the long run, I think you are correct in your fear that bits may be set
incorrectly.

The solution for that is, of course, not to rewrite the API. The solution
is to make the API less fragile.

To be explicit about the fragility in question: the API should not require
the pretty_given bit at all, but it should use an initialized
pretty_print_context within the rev_info struct as indicator that the
pretty print machinery should be used to print out commit messages.

There is something even more fragile about the current concept of parsing
--pretty: the fact that get_commit_format() sets a *file-local* variable
`user_format`, and that that variable is then used for formatting when
pretty_given = 1, is just asking for trouble.

This fragile aspect of the API simply dooms the revision API to suffer
side effects until fixed.

After writing this, I really, really, really fail even more to see why you
make such a big deal out of the pretty_given bit. It is insignificant. If
I were you, I would worry much, much, much, MUCH more about the fact that
`user_format` in pretty.c is changed implicitly by sequencer_make_script()
(not the fault of my patch, of course, but of the way get_commit_format()
operates).

Obviously this latter issue (the `user_format` side effect) is what is a
real problem later on, when we try to make rebase a true builtin, as
sequencer_make_script() will be called as part of a larger operation that
will subsequently run the rebase, and may very well use the revision
machinery to print other commit messages again, *possibly using that
`user_format` by mistake*.

Now, if your suggestion to undo the compile-time safety in favor of a
runtime error, say, in case of a speling eror (which I really would like
to avoid, as I find it a highly sloppy development style to turn compile
time errors into runtime errors for no good reason) would help avoid this
problem with the `user_format`, I would grudgingly bite my tongue,
implement what you suggested and move forward.

But it does not. All it does is make the code less safe by pushing a possible
problem from the time of compilation to the time of running the code.
Meaning that problems would be found by users, when developers could have
caught them without this change.

I really wish we were on the same page that this is a really bad idea.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v3 1/9] rebase -i: generate the script via rebase--helper
  2017-05-01 10:06           ` Phillip Wood
@ 2017-05-01 11:58             ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-01 11:58 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Junio C Hamano, Philip Oakley, Jeff King

Hi Phillip,

On Mon, 1 May 2017, Phillip Wood wrote:

> On 28/04/17 20:22, Johannes Schindelin wrote:
> 
> > BTW in the future you could help me a *lot* by providing a patch that
> > adds a test case to our test suite that not only demonstrates what
> > exactly goes wrong, but also will help prevent future regressions.
> 
> I'll bear that in mind, it does assume that reporters have a good
> understanding of the test suite layout and helper functions though.

Even a shell script recreating the issue would be helpful, as it is easier
to turn such a reproducer into a test case than to write the test case
from scratch.

> Is there a particular reason you put the test case in the autosquash
> tests?  I wouldn't have thought of doing that.

Yes. I looked for existing test cases setting rebase.instructionFormat.
That's where I put the new one.

(I would also have avoided t3404, as it takes 5 minutes to run on Windows
due to its heavily-scripted nature: the shell script interpreter we use in
Git for Windows jumps through all kinds of hoops to emulate POSIX
functionality, and that costs time)

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (9 preceding siblings ...)
  2017-04-28 21:33       ` [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-05-26  3:15       ` Liam Beguin
  2017-05-27 16:23         ` René Scharfe
  2017-05-29 10:56         ` Johannes Schindelin
  2017-05-29  8:30       ` Junio C Hamano
  11 siblings, 2 replies; 100+ messages in thread
From: Liam Beguin @ 2017-05-26  3:15 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> This patch series reimplements the expensive pre- and post-processing of
> the todo script in C.
>
> And it concludes the work I did to accelerate rebase -i.
>
> Changes since v3:
>
> - removed the no-longer-used transform_todo_ids shell function
>
> - simplified transform_todo_ids()'s command parsing
>
> - fixed two commits in check_todo_list(), renamed the unclear
>   `raise_error` variable to `advise_to_edit_todo`, build the message
>   about missing commits directly (without the detour to building a
>   commit_list) and instead of assigning an unused pointer to commit->util
>   the code now uses (void *)1.
>
> - return early from check_todo_list() when parsing failed, even if the
>   check level is something else than CHECK_IGNORE
>
> - the todo list is again generated in the same way as
>   before when rebase.instructionFormat is empty: it was interpreted as
>   if it had not been set
>
> - added a test for empty rebase.instructionFormat settings
>
>
> Johannes Schindelin (10):
>   t3415: verify that an empty instructionFormat is handled as before
>   rebase -i: generate the script via rebase--helper
>   rebase -i: remove useless indentation
>   rebase -i: do not invent onelines when expanding/collapsing SHA-1s
>   rebase -i: also expand/collapse the SHA-1s via the rebase--helper
>   t3404: relax rebase.missingCommitsCheck tests
>   rebase -i: check for missing commits in the rebase--helper
>   rebase -i: skip unnecessary picks using the rebase--helper
>   t3415: test fixup with wrapped oneline
>   rebase -i: rearrange fixup/squash lines using the rebase--helper
>
>  Documentation/git-rebase.txt  |  16 +-
>  builtin/rebase--helper.c      |  29 ++-
>  git-rebase--interactive.sh    | 373 ++++-------------------------
>  sequencer.c                   | 530 ++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                   |   8 +
>  t/t3404-rebase-interactive.sh |  22 +-
>  t/t3415-rebase-autosquash.sh  |  28 ++-
>  7 files changed, 646 insertions(+), 360 deletions(-)
>
>
> base-commit: 027a3b943b444a3e3a76f9a89803fc10245b858f
> Based-On: rebase--helper at https://github.com/dscho/git
> Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
> Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v4
> Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v4
>

This is my first review so it's probably not the best you'll get, but
here it goes!

I rebased the series on top of v2.13.0 and ran the whole `make test` on
both revisions.
The changes do not seem to have introduced any evident breakage, as the
output of `make test` did not change.

I tried to time the execution of an interactive rebase (on Linux) but
I did not notice a significant change in speed.
Do we have a way to measure performance / speed changes between versions?

Liam



* [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-04-28 21:31       ` [PATCH v4 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-05-26  3:15         ` Liam Beguin
  2017-05-29 10:59           ` Johannes Schindelin
  2017-05-30 18:19           ` liam Beguin
  2017-05-29  6:07         ` Junio C Hamano
  1 sibling, 2 replies; 100+ messages in thread
From: Liam Beguin @ 2017-05-26  3:15 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> The first step of an interactive rebase is to generate the so-called "todo
> script", to be stored in the state directory as "git-rebase-todo" and to
> be edited by the user.
> 
> Originally, we adjusted the output of `git log <options>` using a simple
> sed script. Over the course of the years, the code became more
> complicated. We now use shell scripting to edit the output of `git log`
> conditionally, depending whether to keep "empty" commits (i.e. commits
> that do not change any files).
> 
> On platforms where shell scripting is not native, this can be a serious
> drag. And it opens the door for incompatibilities between platforms when
> it comes to shell scripting or to Unix-y commands.
> 
> Let's just re-implement the todo script generation in plain C, using the
> revision machinery directly.
> 
> This is substantially faster, improving the speed relative to the
> shell script version of the interactive rebase from 2x to 3x on Windows.
> 
> Note that the rearrange_squash() function in git-rebase--interactive
> relied on the fact that we set the "format" variable to the config setting
> rebase.instructionFormat. Relying on a side effect like this is no good,
> hence we explicitly perform that assignment (possibly again) in
> rearrange_squash().
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/rebase--helper.c   |  8 +++++++-
>  git-rebase--interactive.sh | 44 +++++++++++++++++++++--------------------
>  sequencer.c                | 49 ++++++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                |  3 +++
>  4 files changed, 82 insertions(+), 22 deletions(-)
> 
> diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
> index ca1ebb2fa18..821058d452d 100644
> --- a/builtin/rebase--helper.c
> +++ b/builtin/rebase--helper.c
> @@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
>  int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  {
>  	struct replay_opts opts = REPLAY_OPTS_INIT;
> +	int keep_empty = 0;
>  	enum {
> -		CONTINUE = 1, ABORT
> +		CONTINUE = 1, ABORT, MAKE_SCRIPT
>  	} command = 0;
>  	struct option options[] = {
>  		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
> +		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
>  		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
>  				CONTINUE),
>  		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
>  				ABORT),
> +		OPT_CMDMODE(0, "make-script", &command,
> +			N_("make rebase script"), MAKE_SCRIPT),
>  		OPT_END()
>  	};
>  
> @@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  		return !!sequencer_continue(&opts);
>  	if (command == ABORT && argc == 1)
>  		return !!sequencer_remove_state(&opts);
> +	if (command == MAKE_SCRIPT && argc > 1)
> +		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
>  	usage_with_options(builtin_rebase_helper_usage, options);
>  }
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 2c9c0165b5a..609e150d38f 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -785,6 +785,7 @@ collapse_todo_ids() {
>  # each log message will be re-retrieved in order to normalize the
>  # autosquash arrangement
>  rearrange_squash () {
> +	format=$(git config --get rebase.instructionFormat)
>  	# extract fixup!/squash! lines and resolve any referenced sha1's
>  	while read -r pick sha1 message
>  	do
> @@ -1210,26 +1211,27 @@ else
>  	revisions=$onto...$orig_head
>  	shortrevisions=$shorthead
>  fi
> -format=$(git config --get rebase.instructionFormat)
> -# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
> -git rev-list $merges_option --format="%m%H ${format:-%s}" \
> -	--reverse --left-right --topo-order \
> -	$revisions ${restrict_revision+^$restrict_revision} | \
> -	sed -n "s/^>//p" |
> -while read -r sha1 rest
> -do
> -
> -	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
> -	then
> -		comment_out="$comment_char "
> -	else
> -		comment_out=
> -	fi
> +if test t != "$preserve_merges"
> +then
> +	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
> +		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
> +else
> +	format=$(git config --get rebase.instructionFormat)
> +	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
> +	git rev-list $merges_option --format="%m%H ${format:-%s}" \
> +		--reverse --left-right --topo-order \
> +		$revisions ${restrict_revision+^$restrict_revision} | \
> +		sed -n "s/^>//p" |
> +	while read -r sha1 rest
> +	do
> +
> +		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
> +		then
> +			comment_out="$comment_char "
> +		else
> +			comment_out=
> +		fi
>  
> -	if test t != "$preserve_merges"
> -	then
> -		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
> -	else
>  		if test -z "$rebase_root"
>  		then
>  			preserve=t
> @@ -1248,8 +1250,8 @@ do
>  			touch "$rewritten"/$sha1
>  			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
>  		fi
> -	fi
> -done
> +	done
> +fi
>  
>  # Watch for commits that been dropped by --cherry-pick
>  if test t = "$preserve_merges"
> diff --git a/sequencer.c b/sequencer.c
> index 130cc868e51..88819a1a2a9 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>  
>  	strbuf_release(&sob);
>  }
> +
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = NULL;
> +	struct pretty_print_context pp = {0};
> +	struct strbuf buf = STRBUF_INIT;
> +	struct rev_info revs;
> +	struct commit *commit;
> +
> +	init_revisions(&revs, NULL);
> +	revs.verbose_header = 1;
> +	revs.max_parents = 1;
> +	revs.cherry_pick = 1;
> +	revs.limited = 1;
> +	revs.reverse = 1;
> +	revs.right_only = 1;
> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +	revs.topo_order = 1;
> +
> +	revs.pretty_given = 1;
> +	git_config_get_string("rebase.instructionFormat", &format);
> +	if (!format || !*format) {
> +		free(format);
> +		format = xstrdup("%s");
> +	}
> +	get_commit_format(format, &revs);
> +	free(format);
> +	pp.fmt = revs.commit_format;
> +	pp.output_encoding = get_log_output_encoding();
> +
> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> +		return error(_("make_script: unhandled options"));
> +
> +	if (prepare_revision_walk(&revs) < 0)
> +		return error(_("make_script: error preparing revisions"));
> +
> +	while ((commit = get_revision(&revs))) {
> +		strbuf_reset(&buf);
> +		if (!keep_empty && is_original_commit_empty(commit))
> +			strbuf_addf(&buf, "%c ", comment_line_char);

I've never had to use empty commits before, but while testing this, I
noticed that `git rebase -i --keep-empty` behaves differently when used
with the --root option than with a branch or something like 'HEAD~10'.
I also tested this on v2.13.0 and the behavior is the same there.

> +		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
> +		pretty_print_commit(&pp, commit, &buf);
> +		strbuf_addch(&buf, '\n');
> +		fputs(buf.buf, out);
> +	}
> +	strbuf_release(&buf);
> +	return 0;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index f885b68395f..83f2943b7a9 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
>  int sequencer_rollback(struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
> -- 
> 2.12.2.windows.2.800.gede8f145e06

Liam



* [PATCH v4 03/10] rebase -i: remove useless indentation
  2017-04-28 21:31       ` [PATCH v4 03/10] rebase -i: remove useless indentation Johannes Schindelin
@ 2017-05-26  3:15         ` Liam Beguin
  2017-05-26 17:50           ` Stefan Beller
  0 siblings, 1 reply; 100+ messages in thread
From: Liam Beguin @ 2017-05-26  3:15 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> The commands used to be indented, and it is nice to look at, but when we
> transform the SHA-1s, the indentation is removed. So let's do away with it.
> 
> For the moment, at least: when we will use the upcoming rebase--helper
> to transform the SHA-1s, we *will* keep the indentation and can
> reintroduce it. Yet, to be able to validate the rebase--helper against
> the output of the current shell script version, we need to remove the
> extra indentation.
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  git-rebase--interactive.sh | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 609e150d38f..c40b1fd1d2e 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -155,13 +155,13 @@ reschedule_last_action () {
>  append_todo_help () {
>  	gettext "
>  Commands:
> - p, pick = use commit
> - r, reword = use commit, but edit the commit message
> - e, edit = use commit, but stop for amending
> - s, squash = use commit, but meld into previous commit
> - f, fixup = like \"squash\", but discard this commit's log message
> - x, exec = run command (the rest of the line) using shell
> - d, drop = remove commit
> +p, pick = use commit
> +r, reword = use commit, but edit the commit message
> +e, edit = use commit, but stop for amending
> +s, squash = use commit, but meld into previous commit
> +f, fixup = like \"squash\", but discard this commit's log message
> +x, exec = run command (the rest of the line) using shell
> +d, drop = remove commit

Do we also need to update all the translations, since this text goes
through `gettext`?

>  
>  These lines can be re-ordered; they are executed from top to bottom.
>  " | git stripspace --comment-lines >>"$todo"
> -- 
> 2.12.2.windows.2.800.gede8f145e06

Liam



* [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-04-28 21:32       ` [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-05-26  3:15         ` Liam Beguin
  2017-05-29 11:20           ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Liam Beguin @ 2017-05-26  3:15 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> This is crucial to improve performance on Windows, as the speed is now
> mostly dominated by the SHA-1 transformation (because it spawns a new
> rev-parse process for *every* line, and spawning processes is pretty
> slow from Git for Windows' MSYS2 Bash).
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/rebase--helper.c   | 10 +++++++-
>  git-rebase--interactive.sh | 27 ++--------------------
>  sequencer.c                | 57 ++++++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                |  2 ++
>  4 files changed, 70 insertions(+), 26 deletions(-)
> 
> diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
> index 821058d452d..9444c8d6c60 100644
> --- a/builtin/rebase--helper.c
> +++ b/builtin/rebase--helper.c
> @@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  	struct replay_opts opts = REPLAY_OPTS_INIT;
>  	int keep_empty = 0;
>  	enum {
> -		CONTINUE = 1, ABORT, MAKE_SCRIPT
> +		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
>  	} command = 0;
>  	struct option options[] = {
>  		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
> @@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  				ABORT),
>  		OPT_CMDMODE(0, "make-script", &command,
>  			N_("make rebase script"), MAKE_SCRIPT),
> +		OPT_CMDMODE(0, "shorten-sha1s", &command,
> +			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
> +		OPT_CMDMODE(0, "expand-sha1s", &command,
> +			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),

Since work is being done to convert to `struct object_id`, would it
not be best to use a more generic name instead of 'sha1'?
Maybe something like {shorten,expand}-hashes.

>  		OPT_END()
>  	};
>  
> @@ -42,5 +46,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  		return !!sequencer_remove_state(&opts);
>  	if (command == MAKE_SCRIPT && argc > 1)
>  		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
> +	if (command == SHORTEN_SHA1S && argc == 1)
> +		return !!transform_todo_ids(1);
> +	if (command == EXPAND_SHA1S && argc == 1)
> +		return !!transform_todo_ids(0);
>  	usage_with_options(builtin_rebase_helper_usage, options);
>  }
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 214af0372ba..82a1941c42c 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -750,35 +750,12 @@ skip_unnecessary_picks () {
>  		die "$(gettext "Could not skip unnecessary pick commands")"
>  }
>  
> -transform_todo_ids () {
> -	while read -r command rest
> -	do
> -		case "$command" in
> -		"$comment_char"* | exec)
> -			# Be careful for oddball commands like 'exec'
> -			# that do not have a SHA-1 at the beginning of $rest.
> -			;;
> -		*)
> -			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
> -			if test "a$rest" = "a${rest#*[	 ]}"
> -			then
> -				rest=$sha1
> -			else
> -				rest="$sha1 ${rest#*[	 ]}"
> -			fi
> -			;;
> -		esac
> -		printf '%s\n' "$command${rest:+ }$rest"
> -	done <"$todo" >"$todo.new" &&
> -	mv -f "$todo.new" "$todo"
> -}
> -
>  expand_todo_ids() {
> -	transform_todo_ids
> +	git rebase--helper --expand-sha1s
>  }
>  
>  collapse_todo_ids() {
> -	transform_todo_ids --short
> +	git rebase--helper --shorten-sha1s
>  }
>  
>  # Rearrange the todo list that has both "pick sha1 msg" and
> diff --git a/sequencer.c b/sequencer.c
> index 88819a1a2a9..201d45b1677 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2437,3 +2437,60 @@ int sequencer_make_script(int keep_empty, FILE *out,
>  	strbuf_release(&buf);
>  	return 0;
>  }
> +
> +
> +int transform_todo_ids(int shorten_sha1s)
> +{
> +	const char *todo_file = rebase_path_todo();
> +	struct todo_list todo_list = TODO_LIST_INIT;
> +	int fd, res, i;
> +	FILE *out;
> +
> +	strbuf_reset(&todo_list.buf);
> +	fd = open(todo_file, O_RDONLY);
> +	if (fd < 0)
> +		return error_errno(_("could not open '%s'"), todo_file);
> +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> +		close(fd);
> +		return error(_("could not read '%s'."), todo_file);
> +	}
> +	close(fd);
> +
> +	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
> +	if (res) {
> +		todo_list_release(&todo_list);
> +		return error(_("unusable instruction sheet: '%s'"), todo_file);

As you pointed out last time, the name of the "todo script" can be a
source of confusion. The migration to C could be a good opportunity to
clarify this.
I don't know which is the preferred name but we could go with
"todo list" as it is the most common across the code base.

$ git grep  'todo[ -]list' | wc -l
20
$ git grep  'rebase[ -]script' | wc -l
0
$ git grep  'instruction[ -]list' | wc -l
1
$ git grep  'instruction[ -]sheet' | wc -l
20
$ git grep  'instruction[ -]sheet' | grep -v ^po | wc -l
8

> +	}
> +
> +	out = fopen(todo_file, "w");
> +	if (!out) {
> +		todo_list_release(&todo_list);
> +		return error(_("unable to open '%s' for writing"), todo_file);
> +	}
> +	for (i = 0; i < todo_list.nr; i++) {
> +		struct todo_item *item = todo_list.items + i;
> +		int bol = item->offset_in_buf;
> +		const char *p = todo_list.buf.buf + bol;
> +		int eol = i + 1 < todo_list.nr ?
> +			todo_list.items[i + 1].offset_in_buf :
> +			todo_list.buf.len;
> +
> +		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
> +			fwrite(p, eol - bol, 1, out);
> +		else {
> +			const char *sha1 = shorten_sha1s ?
> +				short_commit_name(item->commit) :
> +				oid_to_hex(&item->commit->object.oid);

We could also use 'hash' or 'ids' here instead of 'sha1'.

> +			int len;
> +
> +			p += strspn(p, " \t"); /* left-trim command */
> +			len = strcspn(p, " \t"); /* length of command */
> +
> +			fprintf(out, "%.*s %s %.*s\n",
> +				len, p, sha1, item->arg_len, item->arg);
> +		}
> +	}
> +	fclose(out);
> +	todo_list_release(&todo_list);
> +	return 0;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index 83f2943b7a9..47a81034e76 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
>  int sequencer_make_script(int keep_empty, FILE *out,
>  		int argc, const char **argv);
>  
> +int transform_todo_ids(int shorten_sha1s);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
> -- 
> 2.12.2.windows.2.800.gede8f145e06

Liam


* [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-04-28 21:33       ` [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-05-26  3:16         ` Liam Beguin
  2017-05-29 11:26           ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Liam Beguin @ 2017-05-26  3:16 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> This operation has quadratic complexity, which is especially painful
> on Windows, where shell scripts are *already* slow (mainly due to the
> overhead of the POSIX emulation layer).
>
> Let's reimplement this with linear complexity (using a hash map to
> match the commits' subject lines) for the common case; Sadly, the
> fixup/squash feature's design neglected performance considerations,
> allowing arbitrary prefixes (read: `fixup! hell` will match the
> commit subject `hello world`), which means that we are stuck with
> quadratic performance in the worst case.
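
To make the prefix fall-back concrete, here is a minimal sketch of how
I read it. The helper names (strip_autosquash_prefixes,
find_target_by_prefix) are made up for illustration and are not part of
the patch; unlike the patch, the sketch does not skip extra whitespace
between markers and it leaves out the hash map and the lookup by commit
name. It only shows the linear scan that makes the worst case quadratic.

```c
#include <string.h>

/*
 * Hypothetical helper (not from the patch): skip any number of leading
 * "fixup! " / "squash! " markers and return the remaining text.
 */
static const char *strip_autosquash_prefixes(const char *subject)
{
	for (;;) {
		if (!strncmp(subject, "fixup! ", 7))
			subject += 7;
		else if (!strncmp(subject, "squash! ", 8))
			subject += 8;
		else
			return subject;
	}
}

/*
 * Hypothetical helper (not from the patch): return the index of the
 * first earlier subject that starts with the stripped text of
 * subjects[i], or -1 if subjects[i] is not a fixup!/squash! subject or
 * nothing matches. Each call scans all earlier entries.
 */
static int find_target_by_prefix(const char **subjects, int i)
{
	const char *p = strip_autosquash_prefixes(subjects[i]);
	int i2;

	if (p == subjects[i])
		return -1; /* no fixup!/squash! marker at all */
	for (i2 = 0; i2 < i; i2++)
		if (subjects[i2] && !strncmp(subjects[i2], p, strlen(p)))
			return i2;
	return -1;
}
```

With the subjects "hello world" and "fixup! hell" (in that order), the
second one resolves to index 0, which is the `fixup! hell` case from
the commit message.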
>
> The reimplemented logic also happens to fix a bug where commented-out
> lines (representing empty patches) were dropped by the previous code.
>
> While at it, clarify how the fixup/squash feature works in `git rebase
> -i`'s man page.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  Documentation/git-rebase.txt |  16 ++--
>  builtin/rebase--helper.c     |   6 +-
>  git-rebase--interactive.sh   |  90 +-------------------
>  sequencer.c                  | 195 +++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                  |   1 +
>  t/t3415-rebase-autosquash.sh |   2 +-
>  6 files changed, 212 insertions(+), 98 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 53f4e144444..c5464aa5365 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -430,13 +430,15 @@ without an explicit `--interactive`.
>  --autosquash::
>  --no-autosquash::
>  	When the commit log message begins with "squash! ..." (or
> -	"fixup! ..."), and there is a commit whose title begins with
> -	the same ..., automatically modify the todo list of rebase -i
> -	so that the commit marked for squashing comes right after the
> -	commit to be modified, and change the action of the moved
> -	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
> -	"fixup! " or "squash! " after the first, in case you referred to an
> -	earlier fixup/squash with `git commit --fixup/--squash`.
> +	"fixup! ..."), and there is already a commit in the todo list that
> +	matches the same `...`, automatically modify the todo list of rebase
> +	-i so that the commit marked for squashing comes right after the
> +	commit to be modified, and change the action of the moved commit
> +	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
> +	the commit subject matches, or if the `...` refers to the commit's
> +	hash. As a fall-back, partial matches of the commit subject work,
> +	too.  The recommended way to create fixup/squash commits is by using
> +	the `--fixup`/`--squash` options of linkgit:git-commit[1].
>  +
>  This option is only valid when the `--interactive` option is used.
>  +
> diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
> index de3ccd9bfbc..e6591f01112 100644
> --- a/builtin/rebase--helper.c
> +++ b/builtin/rebase--helper.c
> @@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  	int keep_empty = 0;
>  	enum {
>  		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
> -		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
> +		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
>  	} command = 0;
>  	struct option options[] = {
>  		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
> @@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  			N_("check the todo list"), CHECK_TODO_LIST),
>  		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
>  			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
> +		OPT_CMDMODE(0, "rearrange-squash", &command,
> +			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
>  		OPT_END()
>  	};
>  
> @@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  		return !!check_todo_list();
>  	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
>  		return !!skip_unnecessary_picks();
> +	if (command == REARRANGE_SQUASH && argc == 1)
> +		return !!rearrange_squash();
>  	usage_with_options(builtin_rebase_helper_usage, options);
>  }
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 92e3ca1de3b..84c6e62518f 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -721,94 +721,6 @@ collapse_todo_ids() {
>  	git rebase--helper --shorten-sha1s
>  }
>  
> -# Rearrange the todo list that has both "pick sha1 msg" and
> -# "pick sha1 fixup!/squash! msg" appears in it so that the latter
> -# comes immediately after the former, and change "pick" to
> -# "fixup"/"squash".
> -#
> -# Note that if the config has specified a custom instruction format
> -# each log message will be re-retrieved in order to normalize the
> -# autosquash arrangement
> -rearrange_squash () {
> -	format=$(git config --get rebase.instructionFormat)
> -	# extract fixup!/squash! lines and resolve any referenced sha1's
> -	while read -r pick sha1 message
> -	do
> -		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
> -		case "$message" in
> -		"squash! "*|"fixup! "*)
> -			action="${message%%!*}"
> -			rest=$message
> -			prefix=
> -			# skip all squash! or fixup! (but save for later)
> -			while :
> -			do
> -				case "$rest" in
> -				"squash! "*|"fixup! "*)
> -					prefix="$prefix${rest%%!*},"
> -					rest="${rest#*! }"
> -					;;
> -				*)
> -					break
> -					;;
> -				esac
> -			done
> -			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
> -			# if it's a single word, try to resolve to a full sha1 and
> -			# emit a second copy. This allows us to match on both message
> -			# and on sha1 prefix
> -			if test "${rest#* }" = "$rest"; then
> -				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
> -				if test -n "$fullsha"; then
> -					# prefix the action to uniquely identify this line as
> -					# intended for full sha1 match
> -					echo "$sha1 +$action $prefix $fullsha"
> -				fi
> -			fi
> -		esac
> -	done >"$1.sq" <"$1"
> -	test -s "$1.sq" || return
> -
> -	used=
> -	while read -r pick sha1 message
> -	do
> -		case " $used" in
> -		*" $sha1 "*) continue ;;
> -		esac
> -		printf '%s\n' "$pick $sha1 $message"
> -		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
> -		used="$used$sha1 "
> -		while read -r squash action msg_prefix msg_content
> -		do
> -			case " $used" in
> -			*" $squash "*) continue ;;
> -			esac
> -			emit=0
> -			case "$action" in
> -			+*)
> -				action="${action#+}"
> -				# full sha1 prefix test
> -				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
> -			*)
> -				# message prefix test
> -				case "$message" in "$msg_content"*) emit=1;; esac ;;
> -			esac
> -			if test $emit = 1; then
> -				if test -n "${format}"
> -				then
> -					msg_content=$(git log -n 1 --format="${format}" ${squash})
> -				else
> -					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
> -				fi
> -				printf '%s\n' "$action $squash $msg_content"
> -				used="$used$squash "
> -			fi
> -		done <"$1.sq"
> -	done >"$1.rearranged" <"$1"
> -	cat "$1.rearranged" >"$1"
> -	rm -f "$1.sq" "$1.rearranged"
> -}
> -
>  # Add commands after a pick or after a squash/fixup serie
>  # in the todo list.
>  add_exec_commands () {
> @@ -1068,7 +980,7 @@ then
>  fi
>  
>  test -s "$todo" || echo noop >> "$todo"
> -test -n "$autosquash" && rearrange_squash "$todo"
> +test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
>  test -n "$cmd" && add_exec_commands "$todo"
>  
>  todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
> diff --git a/sequencer.c b/sequencer.c
> index 72e3ad8d145..63a588f0916 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -19,6 +19,7 @@
>  #include "trailer.h"
>  #include "log-tree.h"
>  #include "wt-status.h"
> +#include "hashmap.h"
>  
>  #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
>  
> @@ -2723,3 +2724,197 @@ int skip_unnecessary_picks(void)
>  
>  	return 0;
>  }
> +
> +struct subject2item_entry {
> +	struct hashmap_entry entry;
> +	int i;
> +	char subject[FLEX_ARRAY];
> +};
> +
> +static int subject2item_cmp(const struct subject2item_entry *a,
> +	const struct subject2item_entry *b, const void *key)
> +{
> +	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
> +}
> +
> +/*
> + * Rearrange the todo list that has both "pick sha1 msg" and "pick sha1
> + * fixup!/squash! msg" in it so that the latter is put immediately after the
> + * former, and change "pick" to "fixup"/"squash".
> + *
> + * Note that if the config has specified a custom instruction format, each log
> + * message will have to be retrieved from the commit (as the oneline in the
> + * script cannot be trusted) in order to normalize the autosquash arrangement.
> + */
> +int rearrange_squash(void)
> +{
> +	const char *todo_file = rebase_path_todo();
> +	struct todo_list todo_list = TODO_LIST_INIT;
> +	struct hashmap subject2item;
> +	int res = 0, rearranged = 0, *next, *tail, fd, i;
> +	char **subjects;
> +
> +	fd = open(todo_file, O_RDONLY);
> +	if (fd < 0)
> +		return error_errno(_("could not open '%s'"), todo_file);
> +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> +		close(fd);
> +		return error(_("could not read '%s'."), todo_file);
> +	}
> +	close(fd);
> +	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
> +		todo_list_release(&todo_list);
> +		return -1;
> +	}
> +
> +	/*
> +	 * The hashmap maps onelines to the respective todo list index.
> +	 *
> +	 * If any items need to be rearranged, the next[i] value will indicate
> +	 * which item was moved directly after the i'th.
> +	 *
> +	 * In that case, last[i] will indicate the index of the latest item to
> +	 * be moved to appear after the i'th.
> +	 */
> +	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
> +		     todo_list.nr);
> +	ALLOC_ARRAY(next, todo_list.nr);
> +	ALLOC_ARRAY(tail, todo_list.nr);
> +	ALLOC_ARRAY(subjects, todo_list.nr);
> +	for (i = 0; i < todo_list.nr; i++) {
> +		struct strbuf buf = STRBUF_INIT;
> +		struct todo_item *item = todo_list.items + i;
> +		const char *commit_buffer, *subject, *p;
> +		size_t subject_len;
> +		int i2 = -1;
> +		struct subject2item_entry *entry;
> +
> +		next[i] = tail[i] = -1;
> +		if (item->command >= TODO_EXEC) {
> +			subjects[i] = NULL;
> +			continue;
> +		}
> +
> +		if (is_fixup(item->command)) {
> +			todo_list_release(&todo_list);
> +			return error(_("the script was already rearranged."));
> +		}
> +
> +		item->commit->util = item;
> +
> +		parse_commit(item->commit);
> +		commit_buffer = get_commit_buffer(item->commit, NULL);
> +		find_commit_subject(commit_buffer, &subject);
> +		format_subject(&buf, subject, " ");
> +		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
> +		unuse_commit_buffer(item->commit, commit_buffer);
> +		if ((skip_prefix(subject, "fixup! ", &p) ||
> +		     skip_prefix(subject, "squash! ", &p))) {
> +			struct commit *commit2;
> +
> +			for (;;) {
> +				while (isspace(*p))
> +					p++;
> +				if (!skip_prefix(p, "fixup! ", &p) &&
> +				    !skip_prefix(p, "squash! ", &p))
> +					break;
> +			}
> +
> +			if ((entry = hashmap_get_from_hash(&subject2item,
> +							   strhash(p), p)))
> +				/* found by title */
> +				i2 = entry->i;
> +			else if (!strchr(p, ' ') &&
> +				 (commit2 =
> +				  lookup_commit_reference_by_name(p)) &&
> +				 commit2->util)
> +				/* found by commit name */
> +				i2 = (struct todo_item *)commit2->util
> +					- todo_list.items;
> +			else {
> +				/* copy can be a prefix of the commit subject */
> +				for (i2 = 0; i2 < i; i2++)
> +					if (subjects[i2] &&
> +					    starts_with(subjects[i2], p))
> +						break;
> +				if (i2 == i)
> +					i2 = -1;
> +			}
> +		}
> +		if (i2 >= 0) {
> +			rearranged = 1;
> +			todo_list.items[i].command =
> +				starts_with(subject, "fixup!") ?
> +				TODO_FIXUP : TODO_SQUASH;
> +			if (next[i2] < 0)
> +				next[i2] = i;
> +			else
> +				next[tail[i2]] = i;
> +			tail[i2] = i;
> +		} else if (!hashmap_get_from_hash(&subject2item,
> +						strhash(subject), subject)) {
> +			FLEX_ALLOC_MEM(entry, subject, subject, subject_len);
> +			entry->i = i;
> +			hashmap_entry_init(entry, strhash(entry->subject));
> +			hashmap_put(&subject2item, entry);
> +		}
> +	}
> +
> +	if (rearranged) {
> +		struct strbuf buf = STRBUF_INIT;
> +
> +		for (i = 0; i < todo_list.nr; i++) {
> +			enum todo_command command = todo_list.items[i].command;
> +			int cur = i;
> +
> +			/*
> +			 * Initially, all commands are 'pick's. If it is a
> +			 * fixup or a squash now, we have rearranged it.
> +			 */
> +			if (is_fixup(command))
> +				continue;
> +
> +			while (cur >= 0) {
> +				int offset = todo_list.items[cur].offset_in_buf;
> +				int end_offset = cur + 1 < todo_list.nr ?
> +					todo_list.items[cur + 1].offset_in_buf :
> +					todo_list.buf.len;
> +				char *bol = todo_list.buf.buf + offset;
> +				char *eol = todo_list.buf.buf + end_offset;

I got a little confused with these offsets. I know it was part of a
previous series, but maybe we could add a description to the fields
of `struct todo_list` and `struct todo_item`.
Other than that, I don't have any particular comments.
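
A minimal sketch of how these offsets appear to work, assuming that each
item records where its line starts in the buffer and that a line ends
where the next item starts (or at the end of the buffer for the last
item). The names toy_item, toy_list and toy_line_range() are made up for
illustration; only offset_in_buf mirrors a field used in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct todo_item / struct todo_list. */
struct toy_item {
	size_t offset_in_buf;	/* where this item's line starts in buf */
};

struct toy_list {
	const char *buf;	/* the whole todo file, unmodified */
	size_t len;		/* number of bytes in buf */
	struct toy_item *items;
	size_t nr;
};

/*
 * Report the byte range [*bol, *eol) of item i, including its trailing
 * newline: the line ends where the next item begins, or at the end of
 * the buffer for the last item.
 */
void toy_line_range(const struct toy_list *list, size_t i,
		    const char **bol, const char **eol)
{
	size_t end = i + 1 < list->nr ?
		list->items[i + 1].offset_in_buf : list->len;

	*bol = list->buf + list->items[i].offset_in_buf;
	*eol = list->buf + end;
}
```

Seen this way, copying "eol - bol" bytes starting at "bol" copies exactly
one todo line, which is what the strbuf_add() call in the loop relies on.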

> +
> +				/* replace 'pick', by 'fixup' or 'squash' */
> +				command = todo_list.items[cur].command;
> +				if (is_fixup(command)) {
> +					strbuf_addstr(&buf,
> +						todo_command_info[command].str);
> +					bol += strcspn(bol, " \t");
> +				}
> +
> +				strbuf_add(&buf, bol, eol - bol);
> +
> +				cur = next[cur];
> +			}
> +		}
> +
> +		fd = open(todo_file, O_WRONLY);
> +		if (fd < 0)
> +			res = error_errno(_("could not open '%s'"), todo_file);
> +		else if (write(fd, buf.buf, buf.len) < 0)
> +			res = error_errno(_("could not write '%s'."), todo_file);
> +		else if (ftruncate(fd, buf.len) < 0)
> +			res = error_errno(_("could not finish '%s'"),
> +					   todo_file);
> +		close(fd);
> +		strbuf_release(&buf);
> +	}
> +
> +	free(next);
> +	free(tail);
> +	for (i = 0; i < todo_list.nr; i++)
> +		free(subjects[i]);
> +	free(subjects);
> +	hashmap_free(&subject2item, 1);
> +	todo_list_release(&todo_list);
> +
> +	return res;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index 28e1fc1e9bb..1c94bec7622 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -51,6 +51,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
>  int transform_todo_ids(int shorten_sha1s);
>  int check_todo_list(void);
>  int skip_unnecessary_picks(void);
> +int rearrange_squash(void);
>  
>  extern const char sign_off_header[];
>  
> diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
> index 926bb3da788..2f88f50c057 100755
> --- a/t/t3415-rebase-autosquash.sh
> +++ b/t/t3415-rebase-autosquash.sh
> @@ -290,7 +290,7 @@ set_backup_editor () {
>  	test_set_editor "$PWD/backup-editor.sh"
>  }
>  
> -test_expect_failure 'autosquash with multiple empty patches' '
> +test_expect_success 'autosquash with multiple empty patches' '
>  	test_tick &&
>  	git commit --allow-empty -m "empty" &&
>  	test_tick &&
> -- 
> 2.12.2.windows.2.800.gede8f145e06

Liam

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 03/10] rebase -i: remove useless indentation
  2017-05-26  3:15         ` Liam Beguin
@ 2017-05-26 17:50           ` Stefan Beller
  2017-05-27  3:15             ` liam Beguin
  0 siblings, 1 reply; 100+ messages in thread
From: Stefan Beller @ 2017-05-26 17:50 UTC (permalink / raw)
  To: Liam Beguin
  Cc: Johannes Schindelin, git@vger.kernel.org, Junio C Hamano,
	Jeff King, Philip Oakley, phillip.wood

On Thu, May 25, 2017 at 8:15 PM, Liam Beguin <liambeguin@gmail.com> wrote:
> Hi Johannes,
>
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> The commands used to be indented, and it is nice to look at, but when we
>> transform the SHA-1s, the indentation is removed. So let's do away with it.
>>
>> For the moment, at least: when we will use the upcoming rebase--helper
>> to transform the SHA-1s, we *will* keep the indentation and can
>> reintroduce it. Yet, to be able to validate the rebase--helper against
>> the output of the current shell script version, we need to remove the
>> extra indentation.
>>
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>>  git-rebase--interactive.sh | 14 +++++++-------
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
>> index 609e150d38f..c40b1fd1d2e 100644
>> --- a/git-rebase--interactive.sh
>> +++ b/git-rebase--interactive.sh
>> @@ -155,13 +155,13 @@ reschedule_last_action () {
>>  append_todo_help () {
>>       gettext "
>>  Commands:
>> - p, pick = use commit
>> - r, reword = use commit, but edit the commit message
>> - e, edit = use commit, but stop for amending
>> - s, squash = use commit, but meld into previous commit
>> - f, fixup = like \"squash\", but discard this commit's log message
>> - x, exec = run command (the rest of the line) using shell
>> - d, drop = remove commit
>> +p, pick = use commit
>> +r, reword = use commit, but edit the commit message
>> +e, edit = use commit, but stop for amending
>> +s, squash = use commit, but meld into previous commit
>> +f, fixup = like \"squash\", but discard this commit's log message
>> +x, exec = run command (the rest of the line) using shell
>> +d, drop = remove commit
>
> do we also need to update all the translations since this is a `gettext`
> function?

Translations are handled separately, later before a release is done.
Separation of skills. ;)

As programming is quite complicated and involved we'd rather ask
Johannes to only focus on the code in such a patch here and then later
the translators will focus on getting the translation right. As the translation
tools are sophisticated, they will likely give the previous translation such
that the translators see that there is only a white space change.

But as the commit message hints at a later patch that will reintroduce the
original indentation, maybe the translators won't even see that change?

For more information on how the translations work in the git workflow,
see 951ea7656e (Merge tag 'l10n-2.13.0-rnd2.1' of
git://github.com/git-l10n/git-po, 2017-05-09) or see
https://public-inbox.org/git/CANYiYbGfDXj4jJTcd3PpXqsDN-TwCC8Dm8B9Ov_3NaSzwsrCfg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 03/10] rebase -i: remove useless indentation
  2017-05-26 17:50           ` Stefan Beller
@ 2017-05-27  3:15             ` liam Beguin
  0 siblings, 0 replies; 100+ messages in thread
From: liam Beguin @ 2017-05-27  3:15 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, git@vger.kernel.org, Junio C Hamano,
	Jeff King, Philip Oakley, phillip.wood

Hi Stefan,

On 26/05/17 01:50 PM, Stefan Beller wrote:
> On Thu, May 25, 2017 at 8:15 PM, Liam Beguin <liambeguin@gmail.com> wrote:
>> Hi Johannes,
>>
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>> The commands used to be indented, and it is nice to look at, but when we
>>> transform the SHA-1s, the indentation is removed. So let's do away with it.
>>>
>>> For the moment, at least: when we will use the upcoming rebase--helper
>>> to transform the SHA-1s, we *will* keep the indentation and can
>>> reintroduce it. Yet, to be able to validate the rebase--helper against
>>> the output of the current shell script version, we need to remove the
>>> extra indentation.
>>>
>>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>>> ---
>>>  git-rebase--interactive.sh | 14 +++++++-------
>>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
>>> index 609e150d38f..c40b1fd1d2e 100644
>>> --- a/git-rebase--interactive.sh
>>> +++ b/git-rebase--interactive.sh
>>> @@ -155,13 +155,13 @@ reschedule_last_action () {
>>>  append_todo_help () {
>>>       gettext "
>>>  Commands:
>>> - p, pick = use commit
>>> - r, reword = use commit, but edit the commit message
>>> - e, edit = use commit, but stop for amending
>>> - s, squash = use commit, but meld into previous commit
>>> - f, fixup = like \"squash\", but discard this commit's log message
>>> - x, exec = run command (the rest of the line) using shell
>>> - d, drop = remove commit
>>> +p, pick = use commit
>>> +r, reword = use commit, but edit the commit message
>>> +e, edit = use commit, but stop for amending
>>> +s, squash = use commit, but meld into previous commit
>>> +f, fixup = like \"squash\", but discard this commit's log message
>>> +x, exec = run command (the rest of the line) using shell
>>> +d, drop = remove commit
>>
>> do we also need to update all the translations since this is a `gettext`
>> function?
> 
> Translations are handled separately, later before a release is done.
> Separation of skills. ;)
> 
> As programming is quite complicated and involved we'd rather ask
> Johannes to only focus on the code in such a patch here and then later
> the translators will focus on getting the translation right. As the translation
> tools are sophisticated, they will likely give the previous translation such
> that the translators see that there is only a white space change.

Thanks for the clarification, I was wondering how this part was handled.

> 
> But as the commit message hints at a later patch that will reintroduce the
> original indentation, maybe the translators won't even see that change?
> 
> For more information on how the translations work in the git workflow,
> see 951ea7656e (Merge tag 'l10n-2.13.0-rnd2.1' of
> git://github.com/git-l10n/git-po, 2017-05-09) or see
> https://public-inbox.org/git/CANYiYbGfDXj4jJTcd3PpXqsDN-TwCC8Dm8B9Ov_3NaSzwsrCfg@mail.gmail.com/
> 

Liam 

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-26  3:15       ` [PATCH v4 00/10] The final building block for a faster rebase -i Liam Beguin
@ 2017-05-27 16:23         ` René Scharfe
  2017-05-29 10:51           ` Johannes Schindelin
  2017-05-29 10:56         ` Johannes Schindelin
  1 sibling, 1 reply; 100+ messages in thread
From: René Scharfe @ 2017-05-27 16:23 UTC (permalink / raw)
  To: Liam Beguin, johannes.schindelin
  Cc: git, gitster, peff, philipoakley, phillip.wood

Am 26.05.2017 um 05:15 schrieb Liam Beguin:
> I tried to time the execution on an interactive rebase (on Linux) but
> I did not notice a significant change in speed.
> Do we have a way to measure performance / speed changes between versions?

Well, there's performance test script p3404-rebase-interactive.sh.  You
could run it e.g. like this:

	$ (cd t/perf && ./run origin/master HEAD ./p3404*.sh)

This would compare the performance of master with the current branch
you're on.  The results of p3404 are quite noisy for me on master,
though (saw 15% difference between runs without any code changes), so
take them with a bag of salt.

There's more info on performance tests in general in t/perf/README.

René

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-04-28 21:31       ` [PATCH v4 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
  2017-05-26  3:15         ` Liam Beguin
@ 2017-05-29  6:07         ` Junio C Hamano
  2017-05-29 10:54           ` Johannes Schindelin
  1 sibling, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-05-29  6:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/sequencer.c b/sequencer.c
> index 130cc868e51..88819a1a2a9 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>  
>  	strbuf_release(&sob);
>  }
> +
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = NULL;
> +	struct pretty_print_context pp = {0};
> +	struct strbuf buf = STRBUF_INIT;
> +	struct rev_info revs;
> +	struct commit *commit;
> +
> +	init_revisions(&revs, NULL);
> +	revs.verbose_header = 1;
> +	revs.max_parents = 1;
> +	revs.cherry_pick = 1;
> +	revs.limited = 1;
> +	revs.reverse = 1;
> +	revs.right_only = 1;
> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +	revs.topo_order = 1;

cf. <xmqq4lx5i83q.fsf@gitster.mtv.corp.google.com>

This is still futzing below the API implementation detail
unnecessarily.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
                         ` (10 preceding siblings ...)
  2017-05-26  3:15       ` [PATCH v4 00/10] The final building block for a faster rebase -i Liam Beguin
@ 2017-05-29  8:30       ` Junio C Hamano
  11 siblings, 0 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-05-29  8:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> This patch series reimplements the expensive pre- and post-processing of
> the todo script in C.
>
> And it concludes the work I did to accelerate rebase -i.
>

I took another look at the series (as "What's cooking" report was
listing this in the "Needs review" state).  Except for an issue
that I already pointed out in an earlier review, I didn't spot
anything glaringly wrong in this v4 round.

Thanks.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-27 16:23         ` René Scharfe
@ 2017-05-29 10:51           ` Johannes Schindelin
  2017-05-29 12:50             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 10:51 UTC (permalink / raw)
  To: René Scharfe
  Cc: Liam Beguin, git, gitster, peff, philipoakley, phillip.wood

Hi René,

On Sat, 27 May 2017, René Scharfe wrote:

> Am 26.05.2017 um 05:15 schrieb Liam Beguin:
> > I tried to time the execution on an interactive rebase (on Linux) but
> > I did not notice a significant change in speed.
> > Do we have a way to measure performance / speed changes between versions?
> 
> Well, there's performance test script p3404-rebase-interactive.sh.  You
> could run it e.g. like this:
> 
> 	$ (cd t/perf && ./run origin/master HEAD ./p3404*.sh)
> 
> This would compare the performance of master with the current branch
> you're on.  The results of p3404 are quite noisy for me on master,
> though (saw 15% difference between runs without any code changes), so
> take them with a bag of salt.

Indeed. Our performance tests are simply not very meaningful.

Part of it is the use of shell scripting (which defeats performance
testing pretty well), another part is that we have no performance testing
experts among us, and failed to attract any, so not only is the repeat
count ridiculously small, we also have no graphing worth speaking of (and
therefore it is impossible to even see trends, which is a rather important
visual way to verify sound performance testing).

Frankly, I have no illusion about this getting fixed, ever.

So yes, in the meantime we need to use those numbers with a considerable
amount of skepticism.
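
One crude way to apply that skepticism, sketched here as an assumption
rather than anything the perf framework does: only believe a speed-up
when the slowest run of the new version still beats the fastest run of
the old one. toy_min_max() and toy_clearly_faster() are hypothetical
names, and this range check is a heuristic, not a statistical test.

```c
#include <stddef.h>

/* Smallest and largest of n > 0 timings (in seconds). */
void toy_min_max(const double *t, size_t n, double *min, double *max)
{
	size_t i;

	*min = *max = t[0];
	for (i = 1; i < n; i++) {
		if (t[i] < *min)
			*min = t[i];
		if (t[i] > *max)
			*max = t[i];
	}
}

/*
 * Returns 1 if every run in b was faster than the fastest run in a,
 * i.e. the two ranges do not overlap; returns 0 if the difference
 * could just as well be noise.
 */
int toy_clearly_faster(const double *a, size_t na,
		       const double *b, size_t nb)
{
	double amin, amax, bmin, bmax;

	toy_min_max(a, na, &amin, &amax);
	toy_min_max(b, nb, &bmin, &bmax);
	return bmax < amin;
}
```

With runs that vary by 15% on unchanged code, two versions whose timings
overlap would be reported as "not clearly faster" by this check.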

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-29  6:07         ` Junio C Hamano
@ 2017-05-29 10:54           ` Johannes Schindelin
  2017-05-30  4:57             ` Junio C Hamano
  0 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 10:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Hi Junio,

On Mon, 29 May 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > diff --git a/sequencer.c b/sequencer.c
> > index 130cc868e51..88819a1a2a9 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
> >  
> >  	strbuf_release(&sob);
> >  }
> > +
> > +int sequencer_make_script(int keep_empty, FILE *out,
> > +		int argc, const char **argv)
> > +{
> > +	char *format = NULL;
> > +	struct pretty_print_context pp = {0};
> > +	struct strbuf buf = STRBUF_INIT;
> > +	struct rev_info revs;
> > +	struct commit *commit;
> > +
> > +	init_revisions(&revs, NULL);
> > +	revs.verbose_header = 1;
> > +	revs.max_parents = 1;
> > +	revs.cherry_pick = 1;
> > +	revs.limited = 1;
> > +	revs.reverse = 1;
> > +	revs.right_only = 1;
> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> > +	revs.topo_order = 1;
> 
> cf. <xmqq4lx5i83q.fsf@gitster.mtv.corp.google.com>
> 
> This is still futzing below the API implementation detail
> unnecessarily.

You still ask me to pass options in plain text that has to be parsed at
run-time, rather than compile-time-verifiable flags.

I am quite puzzled that you keep asking me to make my code sloppy.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-26  3:15       ` [PATCH v4 00/10] The final building block for a faster rebase -i Liam Beguin
  2017-05-27 16:23         ` René Scharfe
@ 2017-05-29 10:56         ` Johannes Schindelin
  1 sibling, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 10:56 UTC (permalink / raw)
  To: Liam Beguin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Liam,

On Thu, 25 May 2017, Liam Beguin wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> > This patch series reimplements the expensive pre- and post-processing of
> > the todo script in C.
> >
> > [...]

I see that you used git-send-email to send this. It did look a bit funny
not to have "Re: " prefixed subjects, so at first I thought you simply
re-sent my patch series. But I guess we have no convenient way to perform
patch review, therefore I don't fault you...

Thanks for reviewing!
Johannes

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-26  3:15         ` Liam Beguin
@ 2017-05-29 10:59           ` Johannes Schindelin
  2017-05-30 15:57             ` liam Beguin
  2017-05-30 18:19           ` liam Beguin
  1 sibling, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 10:59 UTC (permalink / raw)
  To: Liam Beguin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Liam,

On Thu, 25 May 2017, Liam Beguin wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>
> > diff --git a/sequencer.c b/sequencer.c
> > index 130cc868e51..88819a1a2a9 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
> >  
> >  	strbuf_release(&sob);
> >  }
> > +
> > +int sequencer_make_script(int keep_empty, FILE *out,
> > +		int argc, const char **argv)
> > +{
> > +	char *format = NULL;
> > +	struct pretty_print_context pp = {0};
> > +	struct strbuf buf = STRBUF_INIT;
> > +	struct rev_info revs;
> > +	struct commit *commit;
> > +
> > +	init_revisions(&revs, NULL);
> > +	revs.verbose_header = 1;
> > +	revs.max_parents = 1;
> > +	revs.cherry_pick = 1;
> > +	revs.limited = 1;
> > +	revs.reverse = 1;
> > +	revs.right_only = 1;
> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> > +	revs.topo_order = 1;
> > +
> > +	revs.pretty_given = 1;
> > +	git_config_get_string("rebase.instructionFormat", &format);
> > +	if (!format || !*format) {
> > +		free(format);
> > +		format = xstrdup("%s");
> > +	}
> > +	get_commit_format(format, &revs);
> > +	free(format);
> > +	pp.fmt = revs.commit_format;
> > +	pp.output_encoding = get_log_output_encoding();
> > +
> > +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> > +		return error(_("make_script: unhandled options"));
> > +
> > +	if (prepare_revision_walk(&revs) < 0)
> > +		return error(_("make_script: error preparing revisions"));
> > +
> > +	while ((commit = get_revision(&revs))) {
> > +		strbuf_reset(&buf);
> > +		if (!keep_empty && is_original_commit_empty(commit))
> > +			strbuf_addf(&buf, "%c ", comment_line_char);
> 
> I've never had to use empty commits before, but while testing this, I
> noticed that `git rebase -i --keep-empty` behaves differently if using
> the --root option instead of a branch or something like 'HEAD~10'.
> I also tested this on v2.13.0 and the behavior is the same.

FWIW the terminology "empty commit" is a pretty poor naming choice. This
is totally not your fault at all. I just wish we had a much more intuitive
name to describe a commit that does not introduce any changes to the tree.

And yes, doing this with --root is a bit of a hack. That's because --root
is a bit of a hack.

I am curious, though, as to the exact differences you experienced. I mean,
it could be buggy behavior that needs to be fixed (independently of this
patch series, of course).

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-05-26  3:15         ` Liam Beguin
@ 2017-05-29 11:20           ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 11:20 UTC (permalink / raw)
  To: Liam Beguin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Liam,

On Thu, 25 May 2017, Liam Beguin wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
> > index 821058d452d..9444c8d6c60 100644
> > --- a/builtin/rebase--helper.c
> > +++ b/builtin/rebase--helper.c
> > @@ -24,6 +24,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
> >  				ABORT),
> >  		OPT_CMDMODE(0, "make-script", &command,
> >  			N_("make rebase script"), MAKE_SCRIPT),
> > +		OPT_CMDMODE(0, "shorten-sha1s", &command,
> > +			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
> > +		OPT_CMDMODE(0, "expand-sha1s", &command,
> > +			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
> 
> Since work is being done to convert to `struct object_id` would it
> not be best to use a more generic name instead of 'sha1'?
> maybe something like {shorten,expand}-hashs

Good point. You suggest the use of "ids" later, and I think that is an
even better name: what we try to do here is to expand/reduce the commit
*identifiers*. The fact that they are hexadecimal representations of
hashes is an implementation detail.
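
To make the "identifier" view concrete, here is a toy model of
shortening and expanding, assuming the identifiers are distinct hex
strings of equal length. None of these helpers exist in Git; the real
code goes through short_commit_name() and oid_to_hex() as in the patch
below.

```c
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of two NUL-terminated strings. */
size_t toy_common_prefix(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Shorten: the smallest prefix length (at least min_len) of ids[which]
 * that no other entry of ids[0..nr-1] shares.
 */
size_t toy_abbrev_len(const char **ids, size_t nr, size_t which,
		      size_t min_len)
{
	size_t i, len = min_len;

	for (i = 0; i < nr; i++) {
		size_t common;

		if (i == which)
			continue;
		common = toy_common_prefix(ids[which], ids[i]);
		if (common + 1 > len)
			len = common + 1;
	}
	return len;
}

/*
 * Expand: index of the only id starting with prefix, or -1 if there is
 * no such id or if the prefix is ambiguous.
 */
int toy_expand(const char **ids, size_t nr, const char *prefix)
{
	size_t i, plen = strlen(prefix);
	int found = -1;

	for (i = 0; i < nr; i++) {
		if (strncmp(ids[i], prefix, plen))
			continue;
		if (found >= 0)
			return -1; /* ambiguous */
		found = (int)i;
	}
	return found;
}
```

The toy also hints at why a short identifier is fragile: a prefix that
is unique today can become ambiguous once another id sharing it is
added, whereas the full identifier cannot.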

> > diff --git a/sequencer.c b/sequencer.c
> > index 88819a1a2a9..201d45b1677 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -2437,3 +2437,60 @@ int sequencer_make_script(int keep_empty, FILE *out,
> >  	strbuf_release(&buf);
> >  	return 0;
> >  }
> > +
> > +
> > +int transform_todo_ids(int shorten_sha1s)
> > +{
> > +	const char *todo_file = rebase_path_todo();
> > +	struct todo_list todo_list = TODO_LIST_INIT;
> > +	int fd, res, i;
> > +	FILE *out;
> > +
> > +	strbuf_reset(&todo_list.buf);
> > +	fd = open(todo_file, O_RDONLY);
> > +	if (fd < 0)
> > +		return error_errno(_("could not open '%s'"), todo_file);
> > +	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
> > +		close(fd);
> > +		return error(_("could not read '%s'."), todo_file);
> > +	}
> > +	close(fd);
> > +
> > +	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
> > +	if (res) {
> > +		todo_list_release(&todo_list);
> > +		return error(_("unusable instruction sheet: '%s'"), todo_file);
> 
> As you pointed out last time, the name of the "todo script" can be a
> source of confusion. The migration to C could be a good opportunity to
> clarify this.

True. This was simply a copy-edited part, and I should have caught that.

> I don't know which is the preferred name but we could go with
> "todo list" as it is the most common across the code base.

Yep, my next iteration will use that term.

> > +	}
> > +
> > +	out = fopen(todo_file, "w");
> > +	if (!out) {
> > +		todo_list_release(&todo_list);
> > +		return error(_("unable to open '%s' for writing"), todo_file);
> > +	}
> > +	for (i = 0; i < todo_list.nr; i++) {
> > +		struct todo_item *item = todo_list.items + i;
> > +		int bol = item->offset_in_buf;
> > +		const char *p = todo_list.buf.buf + bol;
> > +		int eol = i + 1 < todo_list.nr ?
> > +			todo_list.items[i + 1].offset_in_buf :
> > +			todo_list.buf.len;
> > +
> > +		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
> > +			fwrite(p, eol - bol, 1, out);
> > +		else {
> > +			const char *sha1 = shorten_sha1s ?
> > +				short_commit_name(item->commit) :
> > +				oid_to_hex(&item->commit->object.oid);
> 
> We could also use 'hash' or 'ids' here instead of 'sha1'.

Absolutely!

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-05-26  3:16         ` Liam Beguin
@ 2017-05-29 11:26           ` Johannes Schindelin
  0 siblings, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-29 11:26 UTC (permalink / raw)
  To: Liam Beguin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Liam,

On Thu, 25 May 2017, Liam Beguin wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> [...]
> > +	if (rearranged) {
> > +		struct strbuf buf = STRBUF_INIT;
> > +
> > +		for (i = 0; i < todo_list.nr; i++) {
> > +			enum todo_command command = todo_list.items[i].command;
> > +			int cur = i;
> > +
> > +			/*
> > +			 * Initially, all commands are 'pick's. If it is a
> > +			 * fixup or a squash now, we have rearranged it.
> > +			 */
> > +			if (is_fixup(command))
> > +				continue;
> > +
> > +			while (cur >= 0) {
> > +				int offset = todo_list.items[cur].offset_in_buf;
> > +				int end_offset = cur + 1 < todo_list.nr ?
> > +					todo_list.items[cur + 1].offset_in_buf :
> > +					todo_list.buf.len;
> > +				char *bol = todo_list.buf.buf + offset;
> > +				char *eol = todo_list.buf.buf + end_offset;
> 
> I got a little confused with these offsets. I know it was part of a
> previous series, but maybe we could add a description to the fields
> of `struct todo_list` and `struct todo_item`.

You mean "offset_in_buf"?

Sure, I can add a comment there. I would like to keep it out of this patch
series, though, as the purpose of it is to accelerate rebase -i by moving
logic from the (slow) shell script code to the (decently fast) C code.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-29 10:51           ` Johannes Schindelin
@ 2017-05-29 12:50             ` Ævar Arnfjörð Bjarmason
  2017-05-30 15:44               ` Johannes Schindelin
  0 siblings, 1 reply; 100+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-29 12:50 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: René Scharfe, Liam Beguin, Git Mailing List, Junio C Hamano,
	Jeff King, Philip Oakley, Phillip Wood

On Mon, May 29, 2017 at 12:51 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi René,
>
> On Sat, 27 May 2017, René Scharfe wrote:
>
>> Am 26.05.2017 um 05:15 schrieb Liam Beguin:
>> > I tried to time the execution on an interactive rebase (on Linux) but
>> > I did not notice a significant change in speed.
>> > Do we have a way to measure performance / speed changes between versions?
>>
>> Well, there's performance test script p3404-rebase-interactive.sh.  You
>> could run it e.g. like this:
>>
>>       $ (cd t/perf && ./run origin/master HEAD ./p3404*.sh)
>>
>> This would compare the performance of master with the current branch
>> you're on.  The results of p3404 are quite noisy for me on master,
>> though (saw 15% difference between runs without any code changes), so
>> take them with a bag of salt.
>
> Indeed. Our performance tests are simply not very meaningful.
>
> Part of it is the use of shell scripting (which defeats performance
> testing pretty well),

Don't the performance tests take long enough that the shellscripting
overhead gets lost in the noise? E.g. on Windows what do you get when
you run this in t/perf:

    $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS="-j6 NO_OPENSSL=Y BLK_SHA1=Y CFLAGS=-O3" \
        ./run v2.10.0 v2.12.0 v2.13.0 p3400-rebase.sh

I get split-index performance improving by 28% in 2.12 and 58% in
2.13, small error bars even with just 3 runs. This is on Linux, but my
sense of fork overhead on Windows is that it isn't so bad as to matter
here.

I'd also be interested to see what sort of results you get for my
"grep: add support for the PCRE v1 JIT API" patch which is in pu now,
assuming you have a PCRE newer than 8.32 or so.

> another part is that we have no performance testing
> experts among us, and failed to attract any, so not only is the repeat
> count ridiculously small, we also have no graphing worth speaking of (and
> therefore it is impossible to even see trends, which is a rather important
> visual way to verify sound performance testing).
>
> Frankly, I have no illusion about this getting fixed, ever.

I have a project on my TODO that I've been meaning to get to which
would address this. I'd be interested to know what people think about
the design:

* Run the perf tests in some mode where the raw runtimes are saved away
* Have some way to dump a static html page from that with graphs over
time (with gnuplot svg?)
* Supply some config file to drive this, so you can e.g. run each
test N times against your repo X for the last 10 versions of git.
* Since it's static HTML it would be trivial for anyone to share such
results, and e.g. set up a cron job to run them and regularly publish
to github pages.

> So yes, in the meantime we need to use those numbers with a considerable
> amount of skepticism.

...however, while the presentation could be improved, I've seen no
reason to think that the underlying numbers are suspect, or that the
perf framework needs to be rewritten as opposed to improved upon. If
you don't think so I'd like to know why.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-29 10:54           ` Johannes Schindelin
@ 2017-05-30  4:57             ` Junio C Hamano
  2017-05-30 14:59               ` Johannes Schindelin
  2017-05-30 15:08               ` revision API design, was " Johannes Schindelin
  0 siblings, 2 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-05-30  4:57 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Mon, 29 May 2017, Junio C Hamano wrote:
>
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> 
>> > diff --git a/sequencer.c b/sequencer.c
>> > index 130cc868e51..88819a1a2a9 100644
>> > --- a/sequencer.c
>> > +++ b/sequencer.c
>> > @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>> >  
>> >  	strbuf_release(&sob);
>> >  }
>> > +
>> > +int sequencer_make_script(int keep_empty, FILE *out,
>> > +		int argc, const char **argv)
>> > +{
>> > +	char *format = NULL;
>> > +	struct pretty_print_context pp = {0};
>> > +	struct strbuf buf = STRBUF_INIT;
>> > +	struct rev_info revs;
>> > +	struct commit *commit;
>> > +
>> > +	init_revisions(&revs, NULL);
>> > +	revs.verbose_header = 1;
>> > +	revs.max_parents = 1;
>> > +	revs.cherry_pick = 1;
>> > +	revs.limited = 1;
>> > +	revs.reverse = 1;
>> > +	revs.right_only = 1;
>> > +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
>> > +	revs.topo_order = 1;
>> 
>> cf. <xmqq4lx5i83q.fsf@gitster.mtv.corp.google.com>
>> 
>> This is still futzing below the API implementation detail
>> unnecessarily.
>
> You still ask me to pass options in plain text that has to be parsed at
> run-time, rather than compile-time-verifiable flags.

Absolutely.  

We do not want to expose these implementation details to code that
does not implement command line parsing.  This one is not parsing
anybody's set of options and should not be mucking with the low
level implementation details.

See 66b2ed09 ("Fix "log" family not to be too agressive about
showing notes", 2010-01-20) and ponder.  Back then, nobody outside
builtin/log.c and revision.c (these are the two primary things that
implement command line parsing; "log.c" needs access to the low
level details because it wants to establish custom default that is
applied before it reads options given by the end-user) mucked
directly with verbose_header, which allowed the addition of
"pretty_given" to be limited only to these places that actually do
the parsing.  If the above patch to sequencer.c existed before
66b2ed09, it would have required unnecessary change to tweak
"pretty_given" in there too when 66b2ed09 was done.  That is forcing
totally unnecessary work.  And there is no reason to expect that
the kind of change 66b2ed09 made to the general command line parsing
will not happen in the future.  Adding more code like the above that
knows the implementation detail is unwarranted, when you can just
ask the existing command line parser to set them for you.
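
To illustrate, here is a minimal sketch of the same traversal spelled
as command-line options, the kind of thing the existing parser could be
handed.  It is not the code from the patch: the throwaway repository,
the branch name "upstream" and the file contents are all made up for
the example.  The mapping assumed is --no-merges for max_parents = 1,
--cherry-pick for cherry_pick, --right-only for right_only, --reverse
for reverse and --topo-order for topo_order.

```shell
#!/bin/sh
# Sketch only: "upstream" and the file contents are hypothetical.
# Build a throwaway repository with one base commit and two topic commits.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "A U Thor"
git config user.email author@example.com
echo base >file && git add file && git commit -q -m base
git branch upstream
echo one >>file && git commit -q -a -m "topic one"
echo two >>file && git commit -q -a -m "topic two"

# Command-line spelling of the traversal: no merges, drop commits whose
# patch is already upstream, right side only, oldest first, topological.
# rev-list prints a "commit <sha>" header before each formatted line,
# so keep only the lines that start with "pick ".
todo=$(git rev-list --no-merges --cherry-pick --right-only \
	--reverse --topo-order --format="pick %H %s" \
	upstream...HEAD | sed -n '/^pick /p')
printf '%s\n' "$todo"
```

The output should be one "pick" line per topic commit, oldest first.
Whether strings or struct fields are the better interface is exactly
what is being argued here; the sketch only shows that the options
exist on the command line.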


* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-30  4:57             ` Junio C Hamano
@ 2017-05-30 14:59               ` Johannes Schindelin
  2017-05-30 15:08               ` revision API design, was " Johannes Schindelin
  1 sibling, 0 replies; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-30 14:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Hi Junio,

On Tue, 30 May 2017, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > You still ask me to pass options in plain text that has to be parsed at
> > run-time, rather than compile-time-verifiable flags.
> 
> Absolutely.  

In other words, you want me to put my name to sloppy code.

Sorry, not interested,
Dscho

* revision API design, was Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-30  4:57             ` Junio C Hamano
  2017-05-30 14:59               ` Johannes Schindelin
@ 2017-05-30 15:08               ` Johannes Schindelin
  2017-05-30 22:53                 ` Junio C Hamano
  1 sibling, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-30 15:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Hi Junio,

On Tue, 30 May 2017, Junio C Hamano wrote:

> We do not want to expose these [revision API] implementation details to
> code that does not implement command line parsing.  This one is not parsing
> anybody's set of options and should not be mucking with the low level
> implementation details.

Just to make sure we are on the same level: you want the argc & argv to be
the free-form API of setup_revisions(), rather than code setting fields
in struct rev_info whose names can be verified at compile time, and whose
names also suggest their intended use.

I am still flabbergasted that any seasoned software developer sincerely
would suggest this.

> See 66b2ed09 ("Fix "log" family not to be too agressive about
> showing notes", 2010-01-20) and ponder.  Back then, nobody outside
> builtin/log.c and revision.c (these are the two primary things that
> implement command line parsing; "log.c" needs access to the low
> level details because it wants to establish custom default that is
> applied before it reads options given by the end-user) mucked
> directly with verbose_header, which allowed the addition of
> "pretty_given" to be limited only to these places that actually do
> the parsing.  If the above patch to sequencer.c existed before
> 66b2ed09, it would have required unnecessary change to tweak
> "pretty_given" in there too when 66b2ed09 was done.  That is forcing
> totally unnecessary work.  And there is no reason to expect that
> the kind of change 66b2ed09 made to the general command line parsing
> will not happen in the future.  Adding more code like the above that
> knows the implementation detail is unwarranted, when you can just
> ask the existing command line parser to set them for you.

This is a very eloquent description of a problem with the API.

The correct suggestion would be to fix the API, of course. Not to declare
an interface intended for command-line parsing the internal API.

Maybe the introduction of `pretty_given` was a strong hint at a design
flaw to begin with, pointing to the fact that user_format is a singleton
(yes, really, you can't have two different user_formats in the same Git
process, what were we thinking)?

Ciao,
Dscho

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-29 12:50             ` Ævar Arnfjörð Bjarmason
@ 2017-05-30 15:44               ` Johannes Schindelin
  2017-05-30 20:22                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 100+ messages in thread
From: Johannes Schindelin @ 2017-05-30 15:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: René Scharfe, Liam Beguin, Git Mailing List, Junio C Hamano,
	Jeff King, Philip Oakley, Phillip Wood

Hi Ævar,

On Mon, 29 May 2017, Ævar Arnfjörð Bjarmason wrote:

> On Mon, May 29, 2017 at 12:51 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > On Sat, 27 May 2017, René Scharfe wrote:
> >> Am 26.05.2017 um 05:15 schrieb Liam Beguin:
> >> > I tried to time the execution on an interactive rebase (on Linux)
> >> > but I did not notice a significant change in speed.  Do we have a
> >> > way to measure performance / speed changes between version?
> >>
> >> Well, there's performance test script p3404-rebase-interactive.sh.
> >> You could run it e.g. like this:
> >>
> >>       $ (cd t/perf && ./run origin/master HEAD ./p3404*.sh)
> >>
> >> This would compare the performance of master with the current branch
> >> you're on.  The results of p3404 are quite noisy for me on master,
> >> though (saw 15% difference between runs without any code changes), so
> >> take them with a bag of salt.
> >
> > Indeed. Our performance tests are simply not very meaningful.
> >
> > Part of it is the use of shell scripting (which defeats performance
> > testing pretty well),
> 
> Don't the performance tests take long enough that the shellscripting
> overhead gets lost in the noise?

Okay, here you go, my weekly (or so) clarification about the POSIX
emulation layer called MSYS2 (which itself is kind of a portable Cygwin).

Whenever Git for Windows has to execute Unix shell scripts (which are not
native to Windows, as the "Unix" in "Unix shell scripts" so clearly
suggests), we resort to calling the Bash from the MSYS2 project, which
spins up a POSIX emulation layer. Git for Windows' own .exe files
(and in particular, git.exe) are *not* affected by the POSIX emulation
layer, as they are real Win32 programs.

Whenever execution has to bridge into, or out of, the POSIX emulation
layer, a few things need to be done. To emulate signal handling, for
example, a completely new process has to be spun up that itself has the
non-MSYS2 process as a child. The environment has to be converted, to
reflect the fact that some things are Unix-y paths (or path lists) inside
the POSIX emulation layer and Windows paths outside.

Even when staying within the POSIX emulation layer, some operations are
not as cheap as "Linux folks" are used to. For example, to spawn a
subshell, due to the absence of a spawn syscall, fork() is called, followed
by exec(). However, fork() itself is not native to Windows and has to be
emulated. The POSIX emulation layer spins up a new process, meticulously
copies the entire memory, tries to reopen the file descriptors, network
connections, etc (to emulate the fork() semantics).

Obviously, this is anything but cheap.

And this is only a glimpse into the entire problem, as I am not aware of
any thorough analysis of what is going on in msys-2.0.dll when shell scripts
run. All I know is that it slows things down dramatically.

As a consequence, even the simple act of creating a repository, or
spawning Win32 processes from within a shell, becomes quite the
contributing factor to the noise of the measurements.
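
A rough way to get a feel for the per-process cost on a given platform
is to spawn a batch of trivial child processes from the shell and look
at the wall time.  This is only a sketch: the function name
count_spawns and the batch size of 200 are arbitrary choices for the
example, and it assumes a date(1) that understands +%s (not strictly
POSIX, though widely available).

```shell
#!/bin/sh
# Sketch only: count_spawns and the batch size are hypothetical choices.
# Spawn N short-lived child shells and report the elapsed wall time.
count_spawns () {
	n=$1
	i=0
	start=$(date +%s)
	while test "$i" -lt "$n"
	do
		# Each iteration forks and execs a new shell that does nothing.
		sh -c : || return 1
		i=$((i + 1))
	done
	end=$(date +%s)
	echo "$i spawns in $((end - start))s"
}

result=$(count_spawns 200)
echo "$result"
```

Running the same script inside the POSIX emulation layer and on a Linux
box gives a crude comparison; the resolution is whole seconds, so the
batch size may need to be raised before the difference shows.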

> E.g. on Windows what do you get when you run this in t/perf:
> 
>     $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS="-j6 NO_OPENSSL=Y
> BLK_SHA1=Y CFLAGS=-O3" ./run v2.10.0 v2.12.0 v2.13.0 p3400-rebase.sh

In my hands, a repeat count of 3 always resulted in quite a bit of noise
previously.

Mind you, I am working on my machine. It has to run two VMs in the
background, has multiple browsers and dozens of tabs open, checks for
mails and Tweets and RSS feeds and a couple of Skypes are open, too.

So yeah, obviously there is a bit of noise involved.

> I get split-index performance improving by 28% in 2.12 and 58% in
> 2.13, small error bars even with just 3 runs. This is on Linux, but my
> sense of fork overhead on Windows is that it isn't so bad as to matter
> here.

Test
------------------------------------------------------
3400.2: rebase on top of a lot of unrelated changes

v2.10.0            v2.12.0                  v2.13.0
------------------------------------------------------------------
60.65(0.01+0.03)   55.75(0.01+0.07) -8.1%   55.97(0.04+0.09) -7.7%

(wrapped myself, as the ./run output is a lot wider than the 80 columns
allowed in plain text email format)

And what does it tell you?

Not much, right? You have no idea about the trend line of the three tests,
not even of the standard deviation (not that it would be meaningful for
N=3). It is not immediately obvious whether the first run is always a tad
slower (or faster), or whether there is no noticeable difference between
the first and any subsequent runs.

In other words, there is no measure of confidence in those results. We
can't say how reliable those numbers are.

And we certainly can't know how much the shell scripting hurts.

Although... let's try something. Let's run the same command in a *Linux
VM* on the same machine! Yes, that should give us an idea. So here goes:

Test
------------------------------------------------------
3400.2: rebase on top of a lot of unrelated changes

v2.10.0           v2.12.0                 v2.13.0
---------------------------------------------------------------
2.08(1.76+0.15)   2.10(1.76+0.15) +1.0%   2.00(1.65+0.15) -3.8%


A ha! Not only does this show a curious *increase* in v2.12.0 (but I'd not
put much stock into that, again N=3 is way too low a repetition number),
it also shows that the Linux VM runs the same thing roughly 30x faster.

I did see a few speed differences between native git.exe on Windows and
the git executable on Linux, but it was barely in the two-digit
*percentage* region. Nowhere near the four-digit percentage region.

So now you know how much shell scripting hurts performance testing.

A lot.

It pretty much renders the entire endeavor of testing performance
completely and utterly useless.

> I'd also be interested to see what sort of results you get for my
> "grep: add support for the PCRE v1 JIT API" patch which is in pu now,
> assuming you have a PCRE newer than 8.32 or so.

pu does not build for me:

2017-05-30T11:38:50.0089681Z libgit.a(grep.o): In function `pcre1match':
2017-05-30T11:38:50.0289250Z .../grep.c:411: undefined reference to `__imp_pcre_jit_exec'
2017-05-30T11:38:50.0329160Z collect2.exe: error: ld returned 1 exit status

> > Frankly, I have no illusion about this getting fixed, ever.
> 
> I have a project on my TODO that I've been meaning to get to which
> would address this. I'd be interested to know what people think about
> the design:
> 
> * Run the perf tests in some mode where the raw runtimes are saved away

You mean a test helper designed to do the timing and the setting up so as
to time *just* the operations that should be timed?

If so: I am all for it.

> * Have some way to dump a static html page from that with graphs over
> time (with gnuplot svg?)

If you already go HTML, it would make much more sense to go d3.js. I would
even prefer to go c3.js (which uses d3.js) right away. Would make
everything so much easier.

Not to mention more portable.

> * Supply some config file to drive this, so you can e.g. run each
> tests N times against your repo X for the last 10 versions of git.

Sure.

> * Since it's static HTML it would be trivial for anyone to share such
> results, and e.g. setup running them in cron to regularly publish to
> github pages.

It does not need to be static. It can use, say, c3.js, for the added
benefit of being able to toggle multiple graphs in the same diagram.

Ciao,
Dscho

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-29 10:59           ` Johannes Schindelin
@ 2017-05-30 15:57             ` liam Beguin
  0 siblings, 0 replies; 100+ messages in thread
From: liam Beguin @ 2017-05-30 15:57 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

On 29/05/17 06:59 AM, Johannes Schindelin wrote:
> Hi Liam,
> 
> On Thu, 25 May 2017, Liam Beguin wrote:
> 
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>
>>> diff --git a/sequencer.c b/sequencer.c
>>> index 130cc868e51..88819a1a2a9 100644
>>> --- a/sequencer.c
>>> +++ b/sequencer.c
>>> @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>>>  
>>>  	strbuf_release(&sob);
>>>  }
>>> +
>>> +int sequencer_make_script(int keep_empty, FILE *out,
>>> +		int argc, const char **argv)
>>> +{
>>> +	char *format = NULL;
>>> +	struct pretty_print_context pp = {0};
>>> +	struct strbuf buf = STRBUF_INIT;
>>> +	struct rev_info revs;
>>> +	struct commit *commit;
>>> +
>>> +	init_revisions(&revs, NULL);
>>> +	revs.verbose_header = 1;
>>> +	revs.max_parents = 1;
>>> +	revs.cherry_pick = 1;
>>> +	revs.limited = 1;
>>> +	revs.reverse = 1;
>>> +	revs.right_only = 1;
>>> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
>>> +	revs.topo_order = 1;
>>> +
>>> +	revs.pretty_given = 1;
>>> +	git_config_get_string("rebase.instructionFormat", &format);
>>> +	if (!format || !*format) {
>>> +		free(format);
>>> +		format = xstrdup("%s");
>>> +	}
>>> +	get_commit_format(format, &revs);
>>> +	free(format);
>>> +	pp.fmt = revs.commit_format;
>>> +	pp.output_encoding = get_log_output_encoding();
>>> +
>>> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
>>> +		return error(_("make_script: unhandled options"));
>>> +
>>> +	if (prepare_revision_walk(&revs) < 0)
>>> +		return error(_("make_script: error preparing revisions"));
>>> +
>>> +	while ((commit = get_revision(&revs))) {
>>> +		strbuf_reset(&buf);
>>> +		if (!keep_empty && is_original_commit_empty(commit))
>>> +			strbuf_addf(&buf, "%c ", comment_line_char);
>>
>> I've never had to use empty commits before, but while testing this, I
>> noticed that `git rebase -i --keep-empty` behaves differently if using
>> the --root option instead of a branch or something like 'HEAD~10'.
>> I also tested this on v2.13.0 and the behavior is the same.
> 
> FWIW the terminology "empty commit" is a pretty poor naming choice. This
> is totally not your fault at all. I just wish we had a much more intuitive
> name to describe a commit that does not introduce any changes to the tree.
> 
> And yes, doing this with --root is a bit of a hack. That's because --root
> is a bit of a hack.
> 
> I am curious, though, as to the exact differences you experienced. I mean,
> it could be buggy behavior that needs to be fixed (independently of this
> patch series, of course).
> 

Sorry, reading this I realized that I didn't give any details!
When using --root, --keep-empty has no effect. The empty commits
do not appear in the todo list and they are removed.
Also, when using --root without --keep-empty, the empty commits
do not show up as comments in the list.

> Ciao,
> Johannes
> 

Liam 

* Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-26  3:15         ` Liam Beguin
  2017-05-29 10:59           ` Johannes Schindelin
@ 2017-05-30 18:19           ` liam Beguin
  1 sibling, 0 replies; 100+ messages in thread
From: liam Beguin @ 2017-05-30 18:19 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: git, gitster, peff, philipoakley, phillip.wood

Hi Johannes,

Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> The first step of an interactive rebase is to generate the so-called "todo
> script", to be stored in the state directory as "git-rebase-todo" and to
> be edited by the user.
>
> Originally, we adjusted the output of `git log <options>` using a simple
> sed script. Over the course of the years, the code became more
> complicated. We now use shell scripting to edit the output of `git log`
> conditionally, depending whether to keep "empty" commits (i.e. commits
> that do not change any files).
>
> On platforms where shell scripting is not native, this can be a serious
> drag. And it opens the door for incompatibilities between platforms when
> it comes to shell scripting or to Unix-y commands.
>
> Let's just re-implement the todo script generation in plain C, using the
> revision machinery directly.
>
> This is substantially faster, improving the speed relative to the
> shell script version of the interactive rebase from 2x to 3x on Windows.
>
> Note that the rearrange_squash() function in git-rebase--interactive
> relied on the fact that we set the "format" variable to the config setting
> rebase.instructionFormat. Relying on a side effect like this is no good,
> hence we explicitly perform that assignment (possibly again) in
> rearrange_squash().
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/rebase--helper.c   |  8 +++++++-
>  git-rebase--interactive.sh | 44 +++++++++++++++++++++--------------------
>  sequencer.c                | 49 ++++++++++++++++++++++++++++++++++++++++++++++
>  sequencer.h                |  3 +++
>  4 files changed, 82 insertions(+), 22 deletions(-)
>
> diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
> index ca1ebb2fa18..821058d452d 100644
> --- a/builtin/rebase--helper.c
> +++ b/builtin/rebase--helper.c
> @@ -11,15 +11,19 @@ static const char * const builtin_rebase_helper_usage[] = {
>  int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  {
>  	struct replay_opts opts = REPLAY_OPTS_INIT;
> +	int keep_empty = 0;
>  	enum {
> -		CONTINUE = 1, ABORT
> +		CONTINUE = 1, ABORT, MAKE_SCRIPT
>  	} command = 0;
>  	struct option options[] = {
>  		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
> +		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
>  		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
>  				CONTINUE),
>  		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
>  				ABORT),
> +		OPT_CMDMODE(0, "make-script", &command,
> +			N_("make rebase script"), MAKE_SCRIPT),
>  		OPT_END()
>  	};

This is probably being picky, but we could also use a different name
here instead of 'rebase script'. This would help keep the name of this
script consistent as you already pointed out.
Maybe something like 'make-todo-list' or just 'make-list'?

>  
> @@ -36,5 +40,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
>  		return !!sequencer_continue(&opts);
>  	if (command == ABORT && argc == 1)
>  		return !!sequencer_remove_state(&opts);
> +	if (command == MAKE_SCRIPT && argc > 1)
> +		return !!sequencer_make_script(keep_empty, stdout, argc, argv);

same here.

>  	usage_with_options(builtin_rebase_helper_usage, options);
>  }
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 2c9c0165b5a..609e150d38f 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -785,6 +785,7 @@ collapse_todo_ids() {
>  # each log message will be re-retrieved in order to normalize the
>  # autosquash arrangement
>  rearrange_squash () {
> +	format=$(git config --get rebase.instructionFormat)
>  	# extract fixup!/squash! lines and resolve any referenced sha1's
>  	while read -r pick sha1 message
>  	do
> @@ -1210,26 +1211,27 @@ else
>  	revisions=$onto...$orig_head
>  	shortrevisions=$shorthead
>  fi
> -format=$(git config --get rebase.instructionFormat)
> -# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
> -git rev-list $merges_option --format="%m%H ${format:-%s}" \
> -	--reverse --left-right --topo-order \
> -	$revisions ${restrict_revision+^$restrict_revision} | \
> -	sed -n "s/^>//p" |
> -while read -r sha1 rest
> -do
> -
> -	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
> -	then
> -		comment_out="$comment_char "
> -	else
> -		comment_out=
> -	fi
> +if test t != "$preserve_merges"
> +then
> +	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
> +		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
> +else
> +	format=$(git config --get rebase.instructionFormat)
> +	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
> +	git rev-list $merges_option --format="%m%H ${format:-%s}" \
> +		--reverse --left-right --topo-order \
> +		$revisions ${restrict_revision+^$restrict_revision} | \
> +		sed -n "s/^>//p" |
> +	while read -r sha1 rest
> +	do
> +
> +		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
> +		then
> +			comment_out="$comment_char "
> +		else
> +			comment_out=
> +		fi
>  
> -	if test t != "$preserve_merges"
> -	then
> -		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
> -	else
>  		if test -z "$rebase_root"
>  		then
>  			preserve=t
> @@ -1248,8 +1250,8 @@ do
>  			touch "$rewritten"/$sha1
>  			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
>  		fi
> -	fi
> -done
> +	done
> +fi
>  
>  # Watch for commits that been dropped by --cherry-pick
>  if test t = "$preserve_merges"
> diff --git a/sequencer.c b/sequencer.c
> index 130cc868e51..88819a1a2a9 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2388,3 +2388,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
>  
>  	strbuf_release(&sob);
>  }
> +
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv)
> +{
> +	char *format = NULL;
> +	struct pretty_print_context pp = {0};
> +	struct strbuf buf = STRBUF_INIT;
> +	struct rev_info revs;
> +	struct commit *commit;
> +
> +	init_revisions(&revs, NULL);
> +	revs.verbose_header = 1;
> +	revs.max_parents = 1;
> +	revs.cherry_pick = 1;
> +	revs.limited = 1;
> +	revs.reverse = 1;
> +	revs.right_only = 1;
> +	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +	revs.topo_order = 1;
> +
> +	revs.pretty_given = 1;
> +	git_config_get_string("rebase.instructionFormat", &format);
> +	if (!format || !*format) {
> +		free(format);
> +		format = xstrdup("%s");
> +	}
> +	get_commit_format(format, &revs);
> +	free(format);
> +	pp.fmt = revs.commit_format;
> +	pp.output_encoding = get_log_output_encoding();
> +
> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> +		return error(_("make_script: unhandled options"));
> +
> +	if (prepare_revision_walk(&revs) < 0)
> +		return error(_("make_script: error preparing revisions"));
> +
> +	while ((commit = get_revision(&revs))) {
> +		strbuf_reset(&buf);
> +		if (!keep_empty && is_original_commit_empty(commit))
> +			strbuf_addf(&buf, "%c ", comment_line_char);
> +		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
> +		pretty_print_commit(&pp, commit, &buf);
> +		strbuf_addch(&buf, '\n');
> +		fputs(buf.buf, out);
> +	}
> +	strbuf_release(&buf);
> +	return 0;
> +}
> diff --git a/sequencer.h b/sequencer.h
> index f885b68395f..83f2943b7a9 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
>  int sequencer_rollback(struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> +int sequencer_make_script(int keep_empty, FILE *out,
> +		int argc, const char **argv);
> +
>  extern const char sign_off_header[];
>  
>  void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
> -- 
> 2.12.2.windows.2.800.gede8f145e06

If you want, I could add to what I did in "rebase -i: add config to
abbreviate command-names"[1] and try to resolve these naming inconsistencies
in a separate patch.

Liam

[1]: https://public-inbox.org/git/20170502040048.9065-1-liambeguin@gmail.com/

* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-30 15:44               ` Johannes Schindelin
@ 2017-05-30 20:22                 ` Ævar Arnfjörð Bjarmason
  2017-05-31 18:46                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 100+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-30 20:22 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: René Scharfe, Liam Beguin, Git Mailing List, Junio C Hamano,
	Jeff King, Philip Oakley, Phillip Wood

On Tue, May 30, 2017 at 5:44 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Ævar,
>
> On Mon, 29 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, May 29, 2017 at 12:51 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>> >
>> > On Sat, 27 May 2017, René Scharfe wrote:
>> >> Am 26.05.2017 um 05:15 schrieb Liam Beguin:
>> >> > I tried to time the execution on an interactive rebase (on Linux)
>> >> > but I did not notice a significant change in speed.  Do we have a
>> >> > way to measure performance / speed changes between version?
>> >>
>> >> Well, there's performance test script p3404-rebase-interactive.sh.
>> >> You could run it e.g. like this:
>> >>
>> >>       $ (cd t/perf && ./run origin/master HEAD ./p3404*.sh)
>> >>
>> >> This would compare the performance of master with the current branch
>> >> you're on.  The results of p3404 are quite noisy for me on master,
>> >> though (saw 15% difference between runs without any code changes), so
>> >> take them with a bag of salt.
>> >
>> > Indeed. Our performance tests are simply not very meaningful.
>> >
>> > Part of it is the use of shell scripting (which defeats performance
>> > testing pretty well),
>>
>> Don't the performance tests take long enough that the shellscripting
>> overhead gets lost in the noise?
>
> Okay, here you go, my weekly (or so) clarification about the POSIX
> emulation layer called MSYS2 (which itself is kind of a portable Cygwin).
>
> Whenever Git for Windows has to execute Unix shell scripts (which are not
> native to Windows, as the "Unix" in "Unix shell scripts" so clearly
> suggests), we resort to calling the Bash from the MSYS2 project, which
> spins up a POSIX emulation layer. Git for Windows' own .exe files
> (and in particular, git.exe) are *not* affected by the POSIX emulation
> layer, as they are real Win32 programs.
>
> Whenever execution has to bridge into, or out of, the POSIX emulation
> layer, a few things need to be done. To emulate signal handling, for
> example, a completely new process has to be spun up that itself has the
> non-MSYS2 process as a child. The environment has to be converted, to
> reflect the fact that some things are Unix-y paths (or path lists) inside
> the POSIX emulation layer and Windows paths outside.
>
> Even when staying within the POSIX emulation layer, some operations are
> not as cheap as "Linux folks" are used to. For example, to spawn a
> subshell, due to the absence of a spawn syscall, fork() is called, followed
> by exec(). However, fork() itself is not native to Windows and has to be
> emulated. The POSIX emulation layer spins up a new process, meticulously
> copies the entire memory, tries to reopen the file descriptors, network
> connections, etc (to emulate the fork() semantics).
>
> Obviously, this is anything but cheap.
>
> And this is only a glimpse into the entire problem, as I am not aware of
> any thorough analysis of what is going on in msys-2.0.dll when shell scripts
> run. All I know is that it slows things down dramatically.
>
> As a consequence, even the simple act of creating a repository, or
> spawning Win32 processes from within a shell, becomes quite the
> contributing factor to the noise of the measurements.
>
>> E.g. on Windows what do you get when you run this in t/perf:
>>
>>     $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS="-j6 NO_OPENSSL=Y
>> BLK_SHA1=Y CFLAGS=-O3" ./run v2.10.0 v2.12.0 v2.13.0 p3400-rebase.sh
>
> In my hands, a repeat count of 3 always resulted in quite a bit of noise
> previously.
>
> Mind you, I am working on my machine. It has to run two VMs in the
> background, has multiple browsers and dozens of tabs open, checks for
> mails and Tweets and RSS feeds and a couple of Skypes are open, too.
>
> So yeah, obviously there is a bit of noise involved.
>
>> I get split-index performance improving by 28% in 2.12 and 58% in
>> 2.13, small error bars even with just 3 runs. This is on Linux, but my
>> sense of fork overhead on Windows is that it isn't so bad as to matter
>> here.
>
> Test
> ------------------------------------------------------
> 3400.2: rebase on top of a lot of unrelated changes
>
> v2.10.0            v2.12.0                  v2.13.0
> ------------------------------------------------------------------
> 60.65(0.01+0.03)   55.75(0.01+0.07) -8.1%   55.97(0.04+0.09) -7.7%
>
> (wrapped myself, as the ./run output is a lot wider than the 80 columns
> allowed in plain text email format)
>
> And what does it tell you?

For all the above: Yes I get that various things are slower on
Windows, but I think that regardless of that by far the majority of
the time in most of the perf tests is spent on the code being tested,
so it doesn't come into play:

$ parallel -k -j4 '(printf "%s\t" {} && (time GIT_PERF_REPO=~/g/linux
GIT_PERF_LARGE_REPO=~/g/linux ./run {}) 2>&1 | grep ^real | grep -o
[0-9].*) | awk "{print \$2 \" \" \$1}"' ::: p[0-9]*sh
0m34.333s p0000-perf-lib-sanity.sh
4m26.415s p0001-rev-list.sh
1m41.647s p0002-read-cache.sh
28m45.001s p0003-delta-base-cache.sh
0m55.767s p0004-lazy-init-name-hash.sh
0m15.891s p0005-status.sh
3m53.143s p0006-read-tree-checkout.sh
3m28.627s p0071-sort.sh
0m49.541s p0100-globbing.sh
14m42.273s p3400-rebase.sh
0m0.573s p3404-rebase-interactive.sh
0m54.103s p4000-diff-algorithms.sh
0m8.102s p4001-diff-no-index.sh
17m34.819s p4211-line-log.sh
4m50.497s p4220-log-grep-engines.sh
3m59.636s p4221-log-grep-engines-fixed.sh

I.e. I'd expect this to come into play with e.g.
p4000-diff-algorithms.sh, but most of the time spent in e.g.
p4220-log-grep-engines.sh on any non-trivially sized repo is hanging
on git-grep to finish.

Also, if you look at test_run_perf_ in perf-lib.sh, we only run
/usr/bin/time on some tiny shell code.  Actually, I can't see now why
we're not doing *only* /usr/bin/time <code to perf test>; right now we
measure some setup code & sourcing the shell library too.

> Not much, right? You have no idea about the trend line of the three tests,
> not even of the standard deviation (not that it would be meaningful for
> N=3). It is not immediately obvious whether the first run is always a tad
> slower (or faster), or whether there is no noticeable difference between
> the first and any subsequent runs.

Right, this could really be improved, but it's purely in the
presentation layer, we log the raw runtimes:

    $ GIT_PERF_REPEAT_COUNT=6 ./p4220-log-grep-engines.sh
    $ cat trash\ directory.p4220-log-grep-engines/test_time.*
    0:00.01 0.01 0.00
    0:00.01 0.01 0.00
    0:00.02 0.01 0.00
    0:00.02 0.02 0.00
    0:00.01 0.01 0.00
    0:00.02 0.01 0.00
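
As a sketch of what the presentation layer could do with those raw
numbers, something like the following computes the mean and the sample
standard deviation per test.  It assumes the three columns are elapsed,
user and system time and, purely for illustration, works on the user
column; the temporary file just replays the six lines shown above.

```shell
#!/bin/sh
# Sketch only: the sample file replays the six raw lines quoted above,
# assumed to be in "elapsed user sys" layout.
times=$(mktemp)
cat >"$times" <<\EOF
0:00.01 0.01 0.00
0:00.01 0.01 0.00
0:00.02 0.01 0.00
0:00.02 0.02 0.00
0:00.01 0.01 0.00
0:00.02 0.01 0.00
EOF

# Column 2 (user time, in seconds): accumulate the sum and the sum of
# squares, then print the mean and the sample standard deviation.
stats=$(awk '{ n++; s += $2; ss += $2 * $2 }
	END {
		mean = s / n
		var = (n > 1) ? (ss - n * mean * mean) / (n - 1) : 0
		if (var < 0) var = 0	# guard against rounding noise
		printf "n=%d mean=%.4f sd=%.4f\n", n, mean, sqrt(var)
	}' "$times")
echo "$stats"
rm -f "$times"
```

With only a handful of runs the standard deviation is not worth much,
but printing it next to the mean at least makes the noise visible.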

> In other words, there is no measure of confidence in those results. We
> can't say how reliable those numbers are.
>
> And we certainly can't know how much the shell scripting hurts.
>
> Although... let's try something. Let's run the same command in a *Linux
> VM* on the same machine! Yes, that should give us an idea. So here goes:
>
> Test
> ------------------------------------------------------
> 3400.2: rebase on top of a lot of unrelated changes
>
> v2.10.0           v2.12.0                 v2.13.0
> ---------------------------------------------------------------
> 2.08(1.76+0.15)   2.10(1.76+0.15) +1.0%   2.00(1.65+0.15) -3.8%
>
>
> A ha! Not only does this show a curious *increase* in v2.12.0 (but I'd not
> put much stock into that, again N=3 is way too low a repetition number),
> it also shows that the Linux VM runs the same thing roughly 30x faster.
>
> I did see a few speed differences between native git.exe on Windows and
> the git executable on Linux, but it was barely in the two-digit
> *percentage* region. Nowhere near the four-digit percentage region.
>
> So now you know how much shell scripting hurts performance testing.
>
> A lot.

Maybe that rebase test is just very fuzzy. Could you try to run this please:

$ cat p0000-perf-lib-sleep.sh
#!/bin/sh

test_description="Tests sleep performance"

. ./perf-lib.sh

test_perf_default_repo

for s in 1 2 5
do
        test_perf "sleep $s" "sleep $s"
done

test_done

I get:

$ ./run p0000-perf-lib-sleep.sh
=== Running 1 tests in this tree ===
perf 1 - sleep 1: 1 2 3 ok
perf 2 - sleep 2: 1 2 3 ok
perf 3 - sleep 5: 1 2 3 ok
# passed all 3 test(s)
1..3
Test              this tree
---------------------------------
0000.1: sleep 1   1.00(0.00+0.00)
0000.2: sleep 2   2.00(0.00+0.00)
0000.3: sleep 5   5.00(0.00+0.00)

I'd be very curious to see if you get any different results on any of
your systems, particularly Windows.

Unless I'm missing something, if those numbers are 1, 2, 5 or close
enough and with small enough error bars, then the perf framework's use
of shellscripting should be fine, if not we've got something...

> It pretty much renders the entire endeavor of testing performance
> completely and utterly useless.
>
>> I'd also be interested to see what sort of results you get for my
>> "grep: add support for the PCRE v1 JIT API" patch which is in pu now,
>> assuming you have a PCRE newer than 8.32 or so.
>
> pu does not build for me:
>
> 2017-05-30T11:38:50.0089681Z libgit.a(grep.o): In function `pcre1match':
> 2017-05-30T11:38:50.0289250Z .../grep.c:411: undefined reference to `__imp_pcre_jit_exec'
> 2017-05-30T11:38:50.0329160Z collect2.exe: error: ld returned 1 exit
> status

Ouch, looks like I've missed some spot in my pcre1 jit series. What's
the PCRE version you have? This is somehow doing the wrong thing with
this bit in grep.h:

    #include <pcre.h>
    #ifdef PCRE_CONFIG_JIT
    #if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
    #define GIT_PCRE1_USE_JIT
    [...]

>> > Frankly, I have no illusion about this getting fixed, ever.
>>
>> I have a project on my TODO that I've been meaning to get to which
>> would address this. I'd be interested to know what people think about
>> the design:
>>
>> * Run the perf tests in some mode where the raw runtimes are saved away
>
> You mean a test helper designed to do the timing and the setting up so as
> to time *just* the operations that should be timed?
>
> If so: I am all for it.

No I just mean save away the raw trash*/test_time.* files, so you can
do any sort of analysis on them later, e.g. render stdev, percentiles
etc. But this is assuming the raw numbers are useful, let's find that
out.
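
As a minimal sketch of the kind of later analysis meant here (not part
of perf-lib.sh), the raw lines could be reduced to a mean and a sample
variance like this. It assumes each test_time.* line has the shape shown
earlier in this message, read as elapsed M:SS.ss followed by user and
system seconds; the names parse_elapsed, sample_mean and sample_variance
are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Parse one raw timing line such as "0:00.01 0.01 0.00".
 * Assumption: the columns are elapsed (M:SS.ss), user and system time,
 * and elapsed never carries an hours field.
 * Returns 0 and stores elapsed seconds in *elapsed, or -1 on bad input.
 */
int parse_elapsed(const char *line, double *elapsed)
{
	int minutes;
	double seconds, user, sys;

	if (sscanf(line, "%d:%lf %lf %lf", &minutes, &seconds, &user, &sys) != 4)
		return -1;
	*elapsed = minutes * 60.0 + seconds;
	return 0;
}

/* Arithmetic mean of n samples; returns 0 for an empty set. */
double sample_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / n : 0.0;
}

/* Sample variance (n - 1 denominator); returns 0 when n < 2. */
double sample_variance(const double *v, size_t n)
{
	double m = sample_mean(v, n), acc = 0.0;
	size_t i;

	if (n < 2)
		return 0.0;
	for (i = 0; i < n; i++)
		acc += (v[i] - m) * (v[i] - m);
	return acc / (n - 1);
}
```

Feeding each saved line through parse_elapsed() and the results through
sample_variance() would give the spread that the summary table hides;
the standard deviation is the square root of that value.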

>> * Have some way to dump a static html page from that with graphs over
>> time (with gnuplot svg?)
>
> If you already go HTML, it would make much more sense to go d3.js. I would
> even prefer to go c3.js (which uses d3.js) right away. Would make
> everything so much easier.
>
> Not to mention more portable.
>
>> * Supply some config file to drive this, so you can e.g. run each
>> tests N times against your repo X for the last 10 versions of git.
>
> Sure.
>
>> * Since it's static HTML it would be trivial for anyone to share such
>> results, and e.g. setup running them in cron to regularly publish to
>> github pages.
>
> It does not need to be static. It can use, say, c3.js, for the added
> benefit of being able to toggle multiple graphs in the same diagram.

Right, I mean "static" in the sense that it wouldn't require some
dynamic backend, it requiring JS is fine, i.e. I'd just like someone
to be able to publish some raw data, point some JS/html at it, and
have it be rendered.


* Re: revision API design, was Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-30 15:08               ` revision API design, was " Johannes Schindelin
@ 2017-05-30 22:53                 ` Junio C Hamano
  2017-06-01  6:48                   ` Junio C Hamano
  0 siblings, 1 reply; 100+ messages in thread
From: Junio C Hamano @ 2017-05-30 22:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> See 66b2ed09 ("Fix "log" family not to be too agressive about
>> ...
>> ask the existing command line parser to set them for you.
>
> This is a very eloquent description of a problem with the API.

Yes, but ...

> The correct suggestion would be to fix the API, of course. Not to declare
> an interface intended for command-line parsing the internal API.
>
> Maybe the introduction of `pretty_given` was a strong hint at a design
> flaw to begin with, pointing to the fact that user_format is a singleton
> (yes, really, you can't have two different user_formats in the same Git
> process, what were we thinking)?

... this tells me that you do not understand the issue.  The
embarrassing but necessary pretty-given field was not about
user_format (let alone singleton-ness of it) at all.  It was about
the show_notes feature that wants to be on by default most of the
time, but needs to be defeated when the end user asked for certain
formats.

Quite frankly, I am not interested in listening to a proposal to
update API by a person who does not understand the issue in the API,
but that is a separate issue.  A more important thing is that the
update to "rebase -i" is important enough and we do not want to
delay it by introducing further dependency.  IOW, I do not want to
spend an extra development cycle or two to update the revision setup
API and have you rebase the series on top after the API update is
done.

The current API to hide such embarrassing but necessary
implementation details from the code that does not need to know
them, i.e. the consumer of rev-info structure with various settings,
is to invoke the command line parser.  Bypassing and going directly
down to the internal implementation detail of rev_info _is_ being
sloppy.  I would strongly prefer to see the current series
written for the API, to allow us to get the "rebase -i" thing over
with, and to think about the revision setup API separately and later.


* Re: [PATCH v4 00/10] The final building block for a faster rebase -i
  2017-05-30 20:22                 ` Ævar Arnfjörð Bjarmason
@ 2017-05-31 18:46                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 100+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-31 18:46 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: René Scharfe, Liam Beguin, Git Mailing List, Junio C Hamano,
	Jeff King, Philip Oakley, Phillip Wood

On Tue, May 30, 2017 at 10:22 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Tue, May 30, 2017 at 5:44 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> pu does not build for me:
>>
>> 2017-05-30T11:38:50.0089681Z libgit.a(grep.o): In function `pcre1match':
>> 2017-05-30T11:38:50.0289250Z .../grep.c:411: undefined reference to `__imp_pcre_jit_exec'
>> 2017-05-30T11:38:50.0329160Z collect2.exe: error: ld returned 1 exit
>> status
>
> Ouch, looks like I've missed some spot in my pcre1 jit series. What's
> the PCRE version you have? This is somehow doing the wrong thing with
> this bit in grep.h:
>
>     #include <pcre.h>
>     #ifdef PCRE_CONFIG_JIT
>     #if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
>     #define GIT_PCRE1_USE_JIT
>     [...]

I've found what the problem is myself. I'll submit a new version of
the series that fixes this.


* Re: revision API design, was Re: [PATCH v4 02/10] rebase -i: generate the script via rebase--helper
  2017-05-30 22:53                 ` Junio C Hamano
@ 2017-06-01  6:48                   ` Junio C Hamano
  0 siblings, 0 replies; 100+ messages in thread
From: Junio C Hamano @ 2017-06-01  6:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Philip Oakley, Jeff King, Phillip Wood

Junio C Hamano <gitster@pobox.com> writes:

> The current API to hide such embarrassing but necessary
> implementation details from the code that does not need to know
> them, i.e. the consumer of rev-info structure with various settings,
> is to invoke the command line parser.  Bypassing and going directly
> down to the internal implementation detail of rev_info _is_ being
> sloppy.  I would strongly prefer to see the current series
> written for the API, to allow us to get the "rebase -i" thing over
> with, and to think about the revision setup API separately and later.

An updated API that hides the implementation details may look like
a vast enhancement of this patch.

I say "vast" because we need to do this for _all_ of the "else if"
cascade you see in revision.c, and probably for fields set by other
helper functions in the same file.  Otherwise, it doesn't have much
value.

Anybody who is tempted to say "We need to do these only for the
complex ones, like the one that needs to set revs->pretty_given
while setting something else" hasn't learned from the example of
66b2ed09.  Interactions between options start happening when new
options are added, and that is when handling of a seemingly
unrelated old option that used to be very simple needs to do more in
the new world order.  And that is why this illustration has a
wrapper for "--first-parent", which happens to be very simple
today.

We do not want revision traversal API's customers to write

	revs.first_parent_only = 1;

because that's mucking with the implementation detail.  The current
API to hide that detail is:

	memset(&revs, 0, sizeof(revs));
	argv_array_pushl(&args, "--first-parent", NULL);
	... may be more options ...
	setup_revisions(args.argc, args.argv, &revs, ...);

and

	memset(&revs, 0, sizeof(revs));
	rev_option_first_parent_only(&revs);
	... may be more options ...
	setup_revisions(0, NULL, &revs, ...);

would be a more statically-checked rewrite, if such an API was
available.
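
To illustrate the difference in checking, here is a self-contained
sketch. struct demo_rev_info and the demo_* functions are hypothetical
stand-ins, not git's real struct rev_info or any existing API; only the
fields touched by the patch below are modelled, as plain ints.

```c
#include <string.h>

/* Hypothetical, reduced stand-in for git's struct rev_info. */
struct demo_rev_info {
	int first_parent_only;
	int simplify_merges;
	int topo_order;
	int rewrite_parents;
	int simplify_history;
	int limited;
};

/* Assumed defaults for the demo: everything off except simplify_history. */
void demo_init(struct demo_rev_info *revs)
{
	memset(revs, 0, sizeof(*revs));
	revs->simplify_history = 1;
}

/* Typed setters: calling a name that does not exist fails at compile time. */
void demo_option_first_parent_only(struct demo_rev_info *revs)
{
	revs->first_parent_only = 1;
}

void demo_option_simplify_merges(struct demo_rev_info *revs)
{
	revs->simplify_merges = 1;
	revs->topo_order = 1;
	revs->rewrite_parents = 1;
	revs->simplify_history = 0;
	revs->limited = 1;
}

/*
 * String-based setup, standing in for the command line parser: an
 * unknown or misspelled option is only noticed at run time.
 * Returns 0 on success, -1 for an unrecognized option.
 */
int demo_handle_opt(struct demo_rev_info *revs, const char *arg)
{
	if (!strcmp(arg, "--first-parent"))
		demo_option_first_parent_only(revs);
	else if (!strcmp(arg, "--simplify-merges"))
		demo_option_simplify_merges(revs);
	else
		return -1;
	return 0;
}
```

Both routes end up setting the same fields; the difference is only when
a mistake is caught. A call to a setter that does not exist is rejected
by the compiler, whereas demo_handle_opt(&revs, "--first-parnt")
compiles and fails only when it runs.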

 revision-internal.h | 18 ++++++++++++++++++
 revision.c          |  9 +++------
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/revision-internal.h b/revision-internal.h
new file mode 100644
index 0000000000..dea4c48412
--- /dev/null
+++ b/revision-internal.h
@@ -0,0 +1,18 @@
+#ifndef REVISION_INTERNAL_H
+#define REVISION_INTERNAL_H 1
+
+static inline void rev_option_first_parent_only(struct rev_info *revs)
+{
+	revs->first_parent_only = 1;
+}
+
+static inline void rev_option_simplify_merges(struct rev_info *revs)
+{
+	revs->simplify_merges = 1;
+	revs->topo_order = 1;
+	revs->rewrite_parents = 1;
+	revs->simplify_history = 0;
+	revs->limited = 1;
+}
+
+#endif
diff --git a/revision.c b/revision.c
index 265611e01f..9418676264 100644
--- a/revision.c
+++ b/revision.c
@@ -20,6 +20,7 @@
 #include "cache-tree.h"
 #include "bisect.h"
 #include "worktree.h"
+#include "revision-internal.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1807,7 +1808,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->min_age = approxidate(optarg);
 		return argcount;
 	} else if (!strcmp(arg, "--first-parent")) {
-		revs->first_parent_only = 1;
+		rev_option_first_parent_only(revs);
 	} else if (!strcmp(arg, "--ancestry-path")) {
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
@@ -1825,11 +1826,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 		revs->topo_order = 1;
 	} else if (!strcmp(arg, "--simplify-merges")) {
-		revs->simplify_merges = 1;
-		revs->topo_order = 1;
-		revs->rewrite_parents = 1;
-		revs->simplify_history = 0;
-		revs->limited = 1;
+		rev_option_simplify_merges(revs);
 	} else if (!strcmp(arg, "--simplify-by-decoration")) {
 		revs->simplify_merges = 1;
 		revs->topo_order = 1;


Thread overview: 100+ messages
2016-09-02 16:22 [PATCH 0/9] The final building block for a faster rebase -i Johannes Schindelin
2016-09-02 16:23 ` [PATCH 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
2016-09-02 16:23 ` [PATCH 2/9] rebase -i: remove useless indentation Johannes Schindelin
2016-09-02 16:23 ` [PATCH 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
2016-09-02 16:23 ` [PATCH 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
2016-09-02 20:56   ` Dennis Kaarsemaker
2016-09-03  7:01     ` Johannes Schindelin
2016-09-02 16:23 ` [PATCH 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
2016-09-02 16:23 ` [PATCH 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
2016-09-02 20:59   ` Dennis Kaarsemaker
2016-09-02 16:23 ` [PATCH 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
2016-09-02 16:23 ` [PATCH 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
2016-09-02 16:23 ` [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
2016-09-03 18:03   ` Josh Triplett
2016-09-04  6:47     ` Johannes Schindelin
2017-04-25 13:51 ` [PATCH v2 0/9] The final building block for a faster rebase -i Johannes Schindelin
2017-04-25 13:51   ` [PATCH v2 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
2017-04-26 10:45     ` Jeff King
2017-04-26 11:34       ` Johannes Schindelin
2017-04-25 13:51   ` [PATCH v2 2/9] rebase -i: remove useless indentation Johannes Schindelin
2017-04-25 13:51   ` [PATCH v2 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
2017-04-25 13:51   ` [PATCH v2 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
2017-04-25 13:52   ` [PATCH v2 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
2017-04-25 13:52   ` [PATCH v2 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
2017-04-25 13:52   ` [PATCH v2 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
2017-04-26 10:55     ` Jeff King
2017-04-26 11:31       ` Johannes Schindelin
2017-04-25 13:52   ` [PATCH v2 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
2017-04-25 13:52   ` [PATCH v2 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
2017-04-26  3:32   ` [PATCH v2 0/9] The final building block for a faster rebase -i Junio C Hamano
2017-04-26 11:59   ` [PATCH v3 " Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 1/9] rebase -i: generate the script via rebase--helper Johannes Schindelin
2017-04-27  4:31       ` Junio C Hamano
2017-04-27 14:18         ` Johannes Schindelin
2017-04-28  0:13           ` Junio C Hamano
2017-04-28  2:36             ` Junio C Hamano
2017-04-28 15:13             ` Johannes Schindelin
2017-05-01  3:11               ` Junio C Hamano
2017-05-01 11:47                 ` Johannes Schindelin
2017-04-28 10:08       ` Phillip Wood
2017-04-28 19:22         ` Johannes Schindelin
2017-05-01 10:06           ` Phillip Wood
2017-05-01 11:58             ` Johannes Schindelin
2017-05-01  0:49         ` Junio C Hamano
2017-05-01 11:06           ` Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 2/9] rebase -i: remove useless indentation Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 3/9] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 4/9] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
2017-04-27  5:00       ` Junio C Hamano
2017-04-27  6:47         ` Junio C Hamano
2017-04-27 21:44         ` Johannes Schindelin
2017-04-28  0:15           ` Junio C Hamano
2017-04-28 15:15             ` Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 5/9] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
2017-04-27  5:05       ` Junio C Hamano
2017-04-27 22:01         ` Johannes Schindelin
2017-04-26 11:59     ` [PATCH v3 6/9] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
2017-04-27  5:32       ` Junio C Hamano
2017-04-28 15:10         ` Johannes Schindelin
2017-04-26 12:00     ` [PATCH v3 7/9] rebase -i: skip unnecessary picks using " Johannes Schindelin
2017-04-26 12:00     ` [PATCH v3 8/9] t3415: test fixup with wrapped oneline Johannes Schindelin
2017-04-26 12:00     ` [PATCH v3 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
2017-04-28 21:30     ` [PATCH v4 00/10] The final building block for a faster rebase -i Johannes Schindelin
2017-04-28 21:31       ` [PATCH v4 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
2017-04-28 21:31       ` [PATCH v4 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
2017-05-26  3:15         ` Liam Beguin
2017-05-29 10:59           ` Johannes Schindelin
2017-05-30 15:57             ` liam Beguin
2017-05-30 18:19           ` liam Beguin
2017-05-29  6:07         ` Junio C Hamano
2017-05-29 10:54           ` Johannes Schindelin
2017-05-30  4:57             ` Junio C Hamano
2017-05-30 14:59               ` Johannes Schindelin
2017-05-30 15:08               ` revision API design, was " Johannes Schindelin
2017-05-30 22:53                 ` Junio C Hamano
2017-06-01  6:48                   ` Junio C Hamano
2017-04-28 21:31       ` [PATCH v4 03/10] rebase -i: remove useless indentation Johannes Schindelin
2017-05-26  3:15         ` Liam Beguin
2017-05-26 17:50           ` Stefan Beller
2017-05-27  3:15             ` liam Beguin
2017-04-28 21:32       ` [PATCH v4 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
2017-04-28 21:32       ` [PATCH v4 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
2017-05-26  3:15         ` Liam Beguin
2017-05-29 11:20           ` Johannes Schindelin
2017-04-28 21:32       ` [PATCH v4 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
2017-04-28 21:32       ` [PATCH v4 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
2017-04-28 21:32       ` [PATCH v4 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
2017-04-28 21:32       ` [PATCH v4 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
2017-04-28 21:33       ` [PATCH v4 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
2017-05-26  3:16         ` Liam Beguin
2017-05-29 11:26           ` Johannes Schindelin
2017-05-26  3:15       ` [PATCH v4 00/10] The final building block for a faster rebase -i Liam Beguin
2017-05-27 16:23         ` René Scharfe
2017-05-29 10:51           ` Johannes Schindelin
2017-05-29 12:50             ` Ævar Arnfjörð Bjarmason
2017-05-30 15:44               ` Johannes Schindelin
2017-05-30 20:22                 ` Ævar Arnfjörð Bjarmason
2017-05-31 18:46                   ` Ævar Arnfjörð Bjarmason
2017-05-29 10:56         ` Johannes Schindelin
2017-05-29  8:30       ` Junio C Hamano
