git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/3] builtin/commit-graph.c: new split/merge options
@ 2020-01-31  0:28 Taylor Blau
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
                   ` (7 more replies)
  0 siblings, 8 replies; 58+ messages in thread
From: Taylor Blau @ 2020-01-31  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster

Hi,

Here are another few patches that came out of working on GitHub's
deployment of incremental commit-graphs. These three patches introduce
two new options: '--split[=<merge-all|no-merge>]' and
'--input=<source>'.

The former controls whether commit-graph's split machinery should write
an incremental commit graph, squash the chain of incrementals, or defer
to the other options.

(This comes from GitHub's desire to have more fine-grained control over
the commit-graph chain's behavior. We run short jobs after every push
that we would like to limit the running time of, and hence we do not
want to ever merge a long chain of incrementals unless we specifically
opt into that.)

The latter of the two new options does two things:

  * It cleans up the many options that specify input sources (e.g.,
    '--stdin-commits', '--stdin-packs', '--reachable' and so on) under
    one unifying name.

  * It allows us to introduce a new argument '--input=none', to prevent
    walking each packfile when neither '--stdin-commits' nor
    '--stdin-packs' was given.

Together, these make it possible to write the following two new
invocations:

  $ git commit-graph write --split=merge-all --input=none

  $ git commit-graph write --split=no-merge --input=stdin-packs

to (1) merge the chain, and (2) write a single new incremental.

Thanks in advance for your review, as always.

Taylor Blau (3):
  builtin/commit-graph.c: support '--split[=<strategy>]'
  builtin/commit-graph.c: introduce '--input=<source>'
  builtin/commit-graph.c: support '--input=none'

 Documentation/git-commit-graph.txt |  55 ++++++++------
 builtin/commit-graph.c             | 113 +++++++++++++++++++++++------
 commit-graph.c                     |  25 ++++---
 commit-graph.h                     |  10 ++-
 t/t5318-commit-graph.sh            |  46 ++++++------
 t/t5324-split-commit-graph.sh      |  85 +++++++++++++++++-----
 6 files changed, 239 insertions(+), 95 deletions(-)

--
2.25.0.dirty


* [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
@ 2020-01-31  0:28 ` Taylor Blau
  2020-01-31 14:19   ` Derrick Stolee
                     ` (2 more replies)
  2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 58+ messages in thread
From: Taylor Blau @ 2020-01-31  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster

With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.

On occasion, callers may want to ensure that 'git commit-graph write
--split' always writes an incremental, and never spends effort
condensing the incremental chain [1]. Previously, this was possible by
passing '--size-multiple=0', but this is no longer the case following
63020f175f (commit-graph: prefer default size_mult when given zero,
2020-01-02).

Reintroduce a less-magical variant of the above with a new pair of
arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
'--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain and will always write a new incremental.
Conversely, if '--split=merge-all' is given, any invocation including it
will always condense a chain if one exists.  If '--split' is given with
no arguments, it behaves as before and defers to '--size-multiple', and
so on.

[1]: This might occur when, for example, a server administrator running
some program after each push wants to ensure that each job's running
time is proportional to the size of the push, and does not "jump" when
the commit-graph machinery decides to trigger a merge.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 18 +++++++++++-----
 builtin/commit-graph.c             | 33 ++++++++++++++++++++++++++----
 commit-graph.c                     | 19 +++++++++--------
 commit-graph.h                     |  7 +++++++
 t/t5324-split-commit-graph.sh      | 25 ++++++++++++++++++++++
 5 files changed, 85 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 28d1fee505..8d61ba9f56 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -57,11 +57,19 @@ or `--stdin-packs`.)
 With the `--append` option, include all commits that are present in the
 existing commit-graph file.
 +
-With the `--split` option, write the commit-graph as a chain of multiple
-commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
-not already in the commit-graph are added in a new "tip" file. This file
-is merged with the existing file if the following merge conditions are
-met:
+With the `--split[=<strategy>]` option, write the commit-graph as a
+chain of multiple commit-graph files stored in
+`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
+strategy and other splitting options. The new commits not already in the
+commit-graph are added in a new "tip" file. This file is merged with the
+existing file if the following merge conditions are met:
+* If `--split=merge-all` is specified, then a merge is always
+conducted, and the remaining options are ignored. Conversely, if
+`--split=no-merge` is specified, a merge is never performed, and the
+remaining options are ignored. A bare `--split` defers to the remaining
+options. (Note that merging a chain of commit graphs replaces the
+existing chain with a length-1 chain where the first and only
+incremental holds the entire graph).
 +
 * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
 tip file would have `N` commits and the previous tip has `M` commits and
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index de321c71ad..f03b46d627 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -9,7 +9,9 @@
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -19,7 +21,9 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -101,6 +105,25 @@ static int graph_verify(int argc, const char **argv)
 extern int read_replace_refs;
 static struct split_commit_graph_opts split_opts;
 
+static int write_option_parse_split(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum commit_graph_split_flags *flags = opt->value;
+
+	opts.split = 1;
+	if (!arg)
+		return 0;
+
+	if (!strcmp(arg, "merge-all"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_REQUIRED;
+	else if (!strcmp(arg, "no-merge"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED;
+	else
+		die(_("unrecognized --split argument, %s"), arg);
+
+	return 0;
+}
+
 static int graph_write(int argc, const char **argv)
 {
 	struct string_list *pack_indexes = NULL;
@@ -123,8 +146,10 @@ static int graph_write(int argc, const char **argv)
 		OPT_BOOL(0, "append", &opts.append,
 			N_("include all commits already in the commit-graph file")),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
-		OPT_BOOL(0, "split", &opts.split,
-			N_("allow writing an incremental commit-graph file")),
+		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
+			N_("allow writing an incremental commit-graph file"),
+			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
+			write_option_parse_split),
 		OPT_INTEGER(0, "max-commits", &split_opts.max_commits,
 			N_("maximum number of commits in a non-base split commit-graph")),
 		OPT_INTEGER(0, "size-multiple", &split_opts.size_multiple,
diff --git a/commit-graph.c b/commit-graph.c
index 6d34829f57..02e6ad9d1f 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 	num_commits = ctx->commits.nr;
 	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
 
-	while (g && (g->num_commits <= size_mult * num_commits ||
-		    (max_commits && num_commits > max_commits))) {
-		if (g->odb != ctx->odb)
-			break;
+	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
+		while (g && (g->num_commits <= size_mult * num_commits ||
+			    (max_commits && num_commits > max_commits) ||
+			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
+			if (g->odb != ctx->odb)
+				break;
 
-		num_commits += g->num_commits;
-		g = g->base_graph;
+			num_commits += g->num_commits;
+			g = g->base_graph;
 
-		ctx->num_commit_graphs_after--;
+			ctx->num_commit_graphs_after--;
+		}
 	}
 
 	ctx->new_base_graph = g;
@@ -1881,7 +1884,7 @@ int write_commit_graph(struct object_directory *odb,
 		goto cleanup;
 	}
 
-	if (!ctx->commits.nr)
+	if (!ctx->commits.nr && (!ctx->split || ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))
 		goto cleanup;
 
 	if (ctx->split) {
diff --git a/commit-graph.h b/commit-graph.h
index 7d9fc9c16a..dadcc03808 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -84,10 +84,17 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
 };
 
+enum commit_graph_split_flags {
+	COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
+	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
+	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
+};
+
 struct split_commit_graph_opts {
 	int size_multiple;
 	int max_commits;
 	timestamp_t expire_time;
+	enum commit_graph_split_flags flags;
 };
 
 /*
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index c24823431f..a165b48afe 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -344,4 +344,29 @@ test_expect_success 'split across alternate where alternate is not split' '
 	test_cmp commit-graph .git/objects/info/commit-graph
 '
 
+test_expect_success '--split=merge-all always merges incrementals' '
+	test_when_finished rm -rf a b c &&
+	rm -rf $graphdir $infodir/commit-graph &&
+	git reset --hard commits/10 &&
+	git rev-list -3 HEAD~4 >a &&
+	git rev-list -2 HEAD~2 >b &&
+	git rev-list -2 HEAD >c &&
+	git commit-graph write --split=no-merge --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --stdin-commits <c &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
+test_expect_success '--split=no-merge always writes an incremental' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list HEAD~1 >a &&
+	git rev-list HEAD >b &&
+	git commit-graph write --split --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.dirty



* [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
@ 2020-01-31  0:28 ` Taylor Blau
  2020-01-31 14:40   ` Derrick Stolee
  2020-01-31 19:34   ` Martin Ågren
  2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 58+ messages in thread
From: Taylor Blau @ 2020-01-31  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster

The 'write' mode of the 'commit-graph' supports input from a number of
different sources: pack indexes over stdin, commits over stdin, commits
reachable from all references, and so on. Each of these sources is
specified with its own option: '--stdin-packs', '--stdin-commits', etc.

Similar to our replacement of 'git config [--<type>]' with 'git config
[--type=<type>]' (cf. fb0dc3bac1 (builtin/config.c: support
`--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
deprecate '[--<input>]' in favor of '[--input=<source>]'.

This makes it clearer how to implement new options that are combinations
of other options (such as, for example, "none", a combination of the old
"--append" and a new sentinel to specify to _not_ look in other packs,
which we will implement in a future patch).

Unfortunately, the new enumerated type is a bitfield, even though it
makes much more sense as '0, 1, 2, ...'. Even though *almost* all
options are pairwise exclusive, '--stdin-{packs,commits}' *is*
compatible with '--append'. For this reason, use a bitfield.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 26 +++++-----
 builtin/commit-graph.c             | 77 ++++++++++++++++++++++--------
 t/t5318-commit-graph.sh            | 46 +++++++++---------
 t/t5324-split-commit-graph.sh      | 44 ++++++++---------
 4 files changed, 114 insertions(+), 79 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 8d61ba9f56..cbf80226e9 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -41,21 +41,21 @@ COMMANDS
 
 Write a commit-graph file based on the commits found in packfiles.
 +
-With the `--stdin-packs` option, generate the new commit graph by
+With the `--input=stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
-with `--stdin-commits` or `--reachable`.)
+with `--input=stdin-commits` or `--input=reachable`.)
 +
-With the `--stdin-commits` option, generate the new commit graph by
-walking commits starting at the commits specified in stdin as a list
+With the `--input=stdin-commits` option, generate the new commit graph
+by walking commits starting at the commits specified in stdin as a list
 of OIDs in hex, one OID per line. (Cannot be combined with
-`--stdin-packs` or `--reachable`.)
+`--input=stdin-packs` or `--input=reachable`.)
 +
-With the `--reachable` option, generate the new commit graph by walking
-commits starting at all refs. (Cannot be combined with `--stdin-commits`
-or `--stdin-packs`.)
+With the `--input=reachable` option, generate the new commit graph by
+walking commits starting at all refs. (Cannot be combined with
+`--input=stdin-commits` or `--input=stdin-packs`.)
 +
-With the `--append` option, include all commits that are present in the
-existing commit-graph file.
+With the `--input=append` option, include all commits that are present
+in the existing commit-graph file.
 +
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
@@ -107,20 +107,20 @@ $ git commit-graph write
   using commits in `<pack-index>`.
 +
 ------------------------------------------------
-$ echo <pack-index> | git commit-graph write --stdin-packs
+$ echo <pack-index> | git commit-graph write --input=stdin-packs
 ------------------------------------------------
 
 * Write a commit-graph file containing all reachable commits.
 +
 ------------------------------------------------
-$ git show-ref -s | git commit-graph write --stdin-commits
+$ git show-ref -s | git commit-graph write --input=stdin-commits
 ------------------------------------------------
 
 * Write a commit-graph file containing all commits in the current
   commit-graph file along with those reachable from `HEAD`.
 +
 ------------------------------------------------
-$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
+$ git rev-parse HEAD | git commit-graph write --input=stdin-commits --input=append
 ------------------------------------------------
 
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index f03b46d627..03d815e652 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -10,7 +10,7 @@
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -22,22 +22,48 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 
 static const char * const builtin_commit_graph_write_usage[] = {
 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
+enum commit_graph_input {
+	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
+	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
+	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+};
+
 static struct opts_commit_graph {
 	const char *obj_dir;
-	int reachable;
-	int stdin_packs;
-	int stdin_commits;
-	int append;
+	enum commit_graph_input input;
 	int split;
 	int shallow;
 	int progress;
 } opts;
 
+static int option_parse_input(const struct option *opt, const char *arg,
+			      int unset)
+{
+	enum commit_graph_input *to = opt->value;
+	if (unset || !strcmp(arg, "packs")) {
+		*to = 0;
+		return 0;
+	}
+
+	if (!strcmp(arg, "reachable"))
+		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
+	else if (!strcmp(arg, "stdin-packs"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
+	else if (!strcmp(arg, "stdin-commits"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
+	else if (!strcmp(arg, "append"))
+		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else
+		die(_("unrecognized --input source, %s"), arg);
+	return 0;
+}
+
 static struct object_directory *find_odb_or_die(struct repository *r,
 						const char *obj_dir)
 {
@@ -137,14 +163,21 @@ static int graph_write(int argc, const char **argv)
 		OPT_STRING(0, "object-dir", &opts.obj_dir,
 			N_("dir"),
 			N_("The object directory to store the graph")),
-		OPT_BOOL(0, "reachable", &opts.reachable,
-			N_("start walk at all refs")),
-		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
-			N_("scan pack-indexes listed by stdin for commits")),
-		OPT_BOOL(0, "stdin-commits", &opts.stdin_commits,
-			N_("start walk at commits listed by stdin")),
-		OPT_BOOL(0, "append", &opts.append,
-			N_("include all commits already in the commit-graph file")),
+		OPT_CALLBACK(0, "input", &opts.input, NULL,
+			N_("include commits from this source in the graph"),
+			option_parse_input),
+		OPT_BIT(0, "reachable", &opts.input,
+			N_("start walk at all refs"),
+			COMMIT_GRAPH_INPUT_REACHABLE),
+		OPT_BIT(0, "stdin-packs", &opts.input,
+			N_("scan pack-indexes listed by stdin for commits"),
+			COMMIT_GRAPH_INPUT_STDIN_PACKS),
+		OPT_BIT(0, "stdin-commits", &opts.input,
+			N_("start walk at commits listed by stdin"),
+			COMMIT_GRAPH_INPUT_STDIN_COMMITS),
+		OPT_BIT(0, "append", &opts.input,
+			N_("include all commits already in the commit-graph file"),
+			COMMIT_GRAPH_INPUT_APPEND),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
 		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
 			N_("allow writing an incremental commit-graph file"),
@@ -170,11 +203,13 @@ static int graph_write(int argc, const char **argv)
 			     builtin_commit_graph_write_options,
 			     builtin_commit_graph_write_usage, 0);
 
-	if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1)
-		die(_("use at most one of --reachable, --stdin-commits, or --stdin-packs"));
+	if ((!!(opts.input & COMMIT_GRAPH_INPUT_REACHABLE) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS)) > 1)
+		die(_("use at most one of --input=reachable, --input=stdin-commits, or --input=stdin-packs"));
 	if (!opts.obj_dir)
 		opts.obj_dir = get_object_directory();
-	if (opts.append)
+	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
@@ -184,22 +219,22 @@ static int graph_write(int argc, const char **argv)
 	read_replace_refs = 0;
 	odb = find_odb_or_die(the_repository, opts.obj_dir);
 
-	if (opts.reachable) {
+	if (opts.input & COMMIT_GRAPH_INPUT_REACHABLE) {
 		if (write_commit_graph_reachable(odb, flags, &split_opts))
 			return 1;
 		return 0;
 	}
 
 	string_list_init(&lines, 0);
-	if (opts.stdin_packs || opts.stdin_commits) {
+	if (opts.input & (COMMIT_GRAPH_INPUT_STDIN_PACKS | COMMIT_GRAPH_INPUT_STDIN_COMMITS)) {
 		struct strbuf buf = STRBUF_INIT;
 
 		while (strbuf_getline(&buf, stdin) != EOF)
 			string_list_append(&lines, strbuf_detach(&buf, NULL));
 
-		if (opts.stdin_packs)
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS)
 			pack_indexes = &lines;
-		if (opts.stdin_commits) {
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS) {
 			commit_hex = &lines;
 			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
 		}
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 0bf98b56ec..6533724bc5 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -23,10 +23,10 @@ test_expect_success 'write graph with no packs' '
 	test_path_is_missing $objdir/info/commit-graph
 '
 
-test_expect_success 'exit with correct error on bad input to --stdin-packs' '
+test_expect_success 'exit with correct error on bad input to --input=stdin-packs' '
 	cd "$TRASH_DIRECTORY/full" &&
 	echo doesnotexist >in &&
-	test_expect_code 1 git commit-graph write --stdin-packs <in 2>stderr &&
+	test_expect_code 1 git commit-graph write --input=stdin-packs <in 2>stderr &&
 	test_i18ngrep "error adding pack" stderr
 '
 
@@ -40,12 +40,12 @@ test_expect_success 'create commits and repack' '
 	git repack
 '
 
-test_expect_success 'exit with correct error on bad input to --stdin-commits' '
+test_expect_success 'exit with correct error on bad input to --input=stdin-commits' '
 	cd "$TRASH_DIRECTORY/full" &&
-	echo HEAD | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	echo HEAD | test_expect_code 1 git commit-graph write --input=stdin-commits 2>stderr &&
 	test_i18ngrep "invalid commit object id" stderr &&
 	# valid tree OID, but not a commit OID
-	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --input=stdin-commits 2>stderr &&
 	test_i18ngrep "invalid commit object id" stderr
 '
 
@@ -227,7 +227,7 @@ graph_git_behavior 'cleared graph, commit 8 vs merge 2' full commits/8 merge/2
 
 test_expect_success 'build graph from latest pack with closure' '
 	cd "$TRASH_DIRECTORY/full" &&
-	cat new-idx | git commit-graph write --stdin-packs &&
+	cat new-idx | git commit-graph write --input=stdin-packs &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "9" "extra_edges"
 '
@@ -240,7 +240,7 @@ test_expect_success 'build graph from commits with closure' '
 	git tag -a -m "merge" tag/merge merge/2 &&
 	git rev-parse tag/merge >commits-in &&
 	git rev-parse merge/1 >>commits-in &&
-	cat commits-in | git commit-graph write --stdin-commits &&
+	cat commits-in | git commit-graph write --input=stdin-commits &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "6"
 '
@@ -250,7 +250,7 @@ graph_git_behavior 'graph from commits, commit 8 vs merge 2' full commits/8 merg
 
 test_expect_success 'build graph from commits with append' '
 	cd "$TRASH_DIRECTORY/full" &&
-	git rev-parse merge/3 | git commit-graph write --stdin-commits --append &&
+	git rev-parse merge/3 | git commit-graph write --input=stdin-commits --input=append &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "10" "extra_edges"
 '
@@ -260,7 +260,7 @@ graph_git_behavior 'append graph, commit 8 vs merge 2' full commits/8 merge/2
 
 test_expect_success 'build graph using --reachable' '
 	cd "$TRASH_DIRECTORY/full" &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "11" "extra_edges"
 '
@@ -301,14 +301,14 @@ test_expect_success 'perform fast-forward merge in full repo' '
 test_expect_success 'check that gc computes commit-graph' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git commit --allow-empty -m "blank" &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	cp $objdir/info/commit-graph commit-graph-before-gc &&
 	git reset --hard HEAD~1 &&
 	git config gc.writeCommitGraph true &&
 	git gc &&
 	cp $objdir/info/commit-graph commit-graph-after-gc &&
 	! test_cmp_bin commit-graph-before-gc commit-graph-after-gc &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_cmp_bin commit-graph-after-gc $objdir/info/commit-graph
 '
 
@@ -318,18 +318,18 @@ test_expect_success 'replace-objects invalidates commit-graph' '
 	git clone full replace &&
 	(
 		cd replace &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_file .git/objects/info/commit-graph &&
 		git replace HEAD~1 HEAD~2 &&
 		git -c core.commitGraph=false log >expect &&
 		git -c core.commitGraph=true log >actual &&
 		test_cmp expect actual &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		git -c core.commitGraph=false --no-replace-objects log >expect &&
 		git -c core.commitGraph=true --no-replace-objects log >actual &&
 		test_cmp expect actual &&
 		rm -rf .git/objects/info/commit-graph &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_file .git/objects/info/commit-graph
 	)
 '
@@ -340,7 +340,7 @@ test_expect_success 'commit grafts invalidate commit-graph' '
 	git clone full graft &&
 	(
 		cd graft &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_file .git/objects/info/commit-graph &&
 		H1=$(git rev-parse --verify HEAD~1) &&
 		H3=$(git rev-parse --verify HEAD~3) &&
@@ -348,12 +348,12 @@ test_expect_success 'commit grafts invalidate commit-graph' '
 		git -c core.commitGraph=false log >expect &&
 		git -c core.commitGraph=true log >actual &&
 		test_cmp expect actual &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		git -c core.commitGraph=false --no-replace-objects log >expect &&
 		git -c core.commitGraph=true --no-replace-objects log >actual &&
 		test_cmp expect actual &&
 		rm -rf .git/objects/info/commit-graph &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_missing .git/objects/info/commit-graph
 	)
 '
@@ -364,10 +364,10 @@ test_expect_success 'replace-objects invalidates commit-graph' '
 	git clone --depth 2 "file://$TRASH_DIRECTORY/full" shallow &&
 	(
 		cd shallow &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_missing .git/objects/info/commit-graph &&
 		git fetch origin --unshallow &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_file .git/objects/info/commit-graph
 	)
 '
@@ -380,7 +380,7 @@ test_expect_success 'replace-objects invalidates commit-graph' '
 
 test_expect_success 'git commit-graph verify' '
 	cd "$TRASH_DIRECTORY/full" &&
-	git rev-parse commits/8 | git commit-graph write --stdin-commits &&
+	git rev-parse commits/8 | git commit-graph write --input=stdin-commits &&
 	git commit-graph verify >output
 '
 
@@ -591,7 +591,7 @@ test_expect_success 'setup non-the_repository tests' '
 	test_commit -C repo two &&
 	git -C repo config core.commitGraph true &&
 	git -C repo rev-parse two | \
-		git -C repo commit-graph write --stdin-commits
+		git -C repo commit-graph write --input=stdin-commits
 '
 
 test_expect_success 'parse_commit_in_graph works for non-the_repository' '
@@ -637,7 +637,7 @@ test_expect_success 'corrupt commit-graph write (broken parent)' '
 		EOF
 		broken="$(git hash-object -w -t commit --literally broken)" &&
 		git commit-tree -p "$broken" -m "good commit" "$empty" >good &&
-		test_must_fail git commit-graph write --stdin-commits \
+		test_must_fail git commit-graph write --input=stdin-commits \
 			<good 2>test_err &&
 		test_i18ngrep "unable to parse commit" test_err
 	)
@@ -658,7 +658,7 @@ test_expect_success 'corrupt commit-graph write (missing tree)' '
 		EOF
 		broken="$(git hash-object -w -t commit --literally broken)" &&
 		git commit-tree -p "$broken" -m "good" "$tree" >good &&
-		test_must_fail git commit-graph write --stdin-commits \
+		test_must_fail git commit-graph write --input=stdin-commits \
 			<good 2>test_err &&
 		test_i18ngrep "unable to parse commit" test_err
 	)
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index a165b48afe..dd74295885 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -35,7 +35,7 @@ test_expect_success 'create commits and write commit-graph' '
 		test_commit $i &&
 		git branch commits/$i || return 1
 	done &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_path_is_file $infodir/commit-graph &&
 	graph_read_expect 3
 '
@@ -87,7 +87,7 @@ test_expect_success 'add more commits, and write a new base graph' '
 	git reset --hard commits/4 &&
 	git merge commits/6 &&
 	git branch merge/2 &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	graph_read_expect 12
 '
 
@@ -99,7 +99,7 @@ test_expect_success 'fork and fail to base a chain on a commit-graph file' '
 		rm .git/objects/info/commit-graph &&
 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
 		test_commit new-commit &&
-		git commit-graph write --reachable --split &&
+		git commit-graph write --input=reachable --split &&
 		test_path_is_file $graphdir/commit-graph-chain &&
 		test_line_count = 1 $graphdir/commit-graph-chain &&
 		verify_chain_files_exist $graphdir
@@ -112,7 +112,7 @@ test_expect_success 'add three more commits, write a tip graph' '
 	git merge commits/5 &&
 	git merge merge/2 &&
 	git branch merge/3 &&
-	git commit-graph write --reachable --split &&
+	git commit-graph write --input=reachable --split &&
 	test_path_is_missing $infodir/commit-graph &&
 	test_path_is_file $graphdir/commit-graph-chain &&
 	ls $graphdir/graph-*.graph >graph-files &&
@@ -125,7 +125,7 @@ graph_git_behavior 'split commit-graph: merge 3 vs 2' merge/3 merge/2
 test_expect_success 'add one commit, write a tip graph' '
 	test_commit 11 &&
 	git branch commits/11 &&
-	git commit-graph write --reachable --split &&
+	git commit-graph write --input=reachable --split &&
 	test_path_is_missing $infodir/commit-graph &&
 	test_path_is_file $graphdir/commit-graph-chain &&
 	ls $graphdir/graph-*.graph >graph-files &&
@@ -138,7 +138,7 @@ graph_git_behavior 'three-layer commit-graph: commit 11 vs 6' commits/11 commits
 test_expect_success 'add one commit, write a merged graph' '
 	test_commit 12 &&
 	git branch commits/12 &&
-	git commit-graph write --reachable --split &&
+	git commit-graph write --input=reachable --split &&
 	test_path_is_file $graphdir/commit-graph-chain &&
 	test_line_count = 2 $graphdir/commit-graph-chain &&
 	ls $graphdir/graph-*.graph >graph-files &&
@@ -157,7 +157,7 @@ test_expect_success 'create fork and chain across alternate' '
 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
 		test_commit 13 &&
 		git branch commits/13 &&
-		git commit-graph write --reachable --split &&
+		git commit-graph write --input=reachable --split &&
 		test_path_is_file $graphdir/commit-graph-chain &&
 		test_line_count = 3 $graphdir/commit-graph-chain &&
 		ls $graphdir/graph-*.graph >graph-files &&
@@ -166,7 +166,7 @@ test_expect_success 'create fork and chain across alternate' '
 		git -c core.commitGraph=false rev-list HEAD >actual &&
 		test_cmp expect actual &&
 		test_commit 14 &&
-		git commit-graph write --reachable --split --object-dir=.git/objects/ &&
+		git commit-graph write --input=reachable --split --object-dir=.git/objects/ &&
 		test_line_count = 3 $graphdir/commit-graph-chain &&
 		ls $graphdir/graph-*.graph >graph-files &&
 		test_line_count = 1 graph-files
@@ -182,7 +182,7 @@ test_expect_success 'test merge stragety constants' '
 		git config core.commitGraph true &&
 		test_line_count = 2 $graphdir/commit-graph-chain &&
 		test_commit 14 &&
-		git commit-graph write --reachable --split --size-multiple=2 &&
+		git commit-graph write --input=reachable --split --size-multiple=2 &&
 		test_line_count = 3 $graphdir/commit-graph-chain
 
 	) &&
@@ -192,7 +192,7 @@ test_expect_success 'test merge stragety constants' '
 		git config core.commitGraph true &&
 		test_line_count = 2 $graphdir/commit-graph-chain &&
 		test_commit 14 &&
-		git commit-graph write --reachable --split --size-multiple=10 &&
+		git commit-graph write --input=reachable --split --size-multiple=10 &&
 		test_line_count = 1 $graphdir/commit-graph-chain &&
 		ls $graphdir/graph-*.graph >graph-files &&
 		test_line_count = 1 graph-files
@@ -203,7 +203,7 @@ test_expect_success 'test merge stragety constants' '
 		git config core.commitGraph true &&
 		test_line_count = 2 $graphdir/commit-graph-chain &&
 		test_commit 15 &&
-		git commit-graph write --reachable --split --size-multiple=10 --expire-time=1980-01-01 &&
+		git commit-graph write --input=reachable --split --size-multiple=10 --expire-time=1980-01-01 &&
 		test_line_count = 1 $graphdir/commit-graph-chain &&
 		ls $graphdir/graph-*.graph >graph-files &&
 		test_line_count = 3 graph-files
@@ -215,7 +215,7 @@ test_expect_success 'test merge stragety constants' '
 		test_line_count = 2 $graphdir/commit-graph-chain &&
 		test_commit 16 &&
 		test_commit 17 &&
-		git commit-graph write --reachable --split --max-commits=1 &&
+		git commit-graph write --input=reachable --split --max-commits=1 &&
 		test_line_count = 1 $graphdir/commit-graph-chain &&
 		ls $graphdir/graph-*.graph >graph-files &&
 		test_line_count = 1 graph-files
@@ -227,7 +227,7 @@ test_expect_success 'remove commit-graph-chain file after flattening' '
 	(
 		cd flatten &&
 		test_line_count = 2 $graphdir/commit-graph-chain &&
-		git commit-graph write --reachable &&
+		git commit-graph write --input=reachable &&
 		test_path_is_missing $graphdir/commit-graph-chain &&
 		ls $graphdir >graph-files &&
 		test_line_count = 0 graph-files
@@ -306,7 +306,7 @@ test_expect_success 'verify across alternates' '
 		echo "$altdir" >.git/objects/info/alternates &&
 		git commit-graph verify --object-dir="$altdir/" &&
 		test_commit extra &&
-		git commit-graph write --reachable --split &&
+		git commit-graph write --input=reachable --split &&
 		tip_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph &&
 		corrupt_file "$tip_file" 100 "\01" &&
 		test_must_fail git commit-graph verify --shallow 2>test_err &&
@@ -319,7 +319,7 @@ test_expect_success 'add octopus merge' '
 	git reset --hard commits/10 &&
 	git merge commits/3 commits/4 &&
 	git branch merge/octopus &&
-	git commit-graph write --reachable --split &&
+	git commit-graph write --input=reachable --split &&
 	git commit-graph verify --progress 2>err &&
 	test_line_count = 3 err &&
 	test_i18ngrep ! warning err &&
@@ -329,7 +329,7 @@ test_expect_success 'add octopus merge' '
 graph_git_behavior 'graph exists' merge/octopus commits/12
 
 test_expect_success 'split across alternate where alternate is not split' '
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_path_is_file .git/objects/info/commit-graph &&
 	cp .git/objects/info/commit-graph . &&
 	git clone --no-hardlinks . alt-split &&
@@ -338,7 +338,7 @@ test_expect_success 'split across alternate where alternate is not split' '
 		rm -f .git/objects/info/commit-graph &&
 		echo "$(pwd)"/../.git/objects >.git/objects/info/alternates &&
 		test_commit 18 &&
-		git commit-graph write --reachable --split &&
+		git commit-graph write --input=reachable --split &&
 		test_line_count = 1 $graphdir/commit-graph-chain
 	) &&
 	test_cmp commit-graph .git/objects/info/commit-graph
@@ -351,10 +351,10 @@ test_expect_success '--split=merge-all always merges incrementals' '
 	git rev-list -3 HEAD~4 >a &&
 	git rev-list -2 HEAD~2 >b &&
 	git rev-list -2 HEAD >c &&
-	git commit-graph write --split=no-merge --stdin-commits <a &&
-	git commit-graph write --split=no-merge --stdin-commits <b &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
 	test_line_count = 2 $graphdir/commit-graph-chain &&
-	git commit-graph write --split=merge-all --stdin-commits <c &&
+	git commit-graph write --split=merge-all --input=stdin-commits <c &&
 	test_line_count = 1 $graphdir/commit-graph-chain
 '
 
@@ -364,8 +364,8 @@ test_expect_success '--split=no-merge always writes an incremental' '
 	git reset --hard commits/2 &&
 	git rev-list HEAD~1 >a &&
 	git rev-list HEAD >b &&
-	git commit-graph write --split --stdin-commits <a &&
-	git commit-graph write --split=no-merge --stdin-commits <b &&
+	git commit-graph write --split --input=stdin-commits <a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
 	test_line_count = 2 $graphdir/commit-graph-chain
 '
 
-- 
2.25.0.dirty



* [PATCH 3/3] builtin/commit-graph.c: support '--input=none'
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
  2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
@ 2020-01-31  0:28 ` Taylor Blau
  2020-01-31 14:40   ` Derrick Stolee
  2020-01-31 19:45   ` Martin Ågren
  2020-01-31  0:32 ` [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 58+ messages in thread
From: Taylor Blau @ 2020-01-31  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster

In an earlier commit, we introduced '--split=<no-merge|merge-all>', and
alluded to the fact that '--split=merge-all' would be useful for callers
who wish to always trigger a merge of an incremental chain.

There is a problem with the above approach, which is that there is no
way to specify to the commit-graph builtin that a caller only wants to
include commits already in the graph. One can specify '--input=append'
to include all commits in the existing graphs, but the absence of
'--input=stdin-{commits,packs}' causes the builtin to call
'fill_oids_from_all_packs()'.

Passing '--input=reachable' (as in 'git commit-graph write
--split=merge-all --input=reachable --input=append') works around this
issue by making '--input=reachable' effectively a no-op, but this can be
prohibitively expensive in large repositories, making it an undesirable
choice for some users.

Teach '--input=none' as an option to behave as if '--input=append' were
given, but to consider no other sources in addition.

This, in conjunction with the '--split=merge-all' option introduced
earlier in this series, offers a convenient way to force the
commit-graph machinery to condense a chain of incrementals without
requiring any new commits:

  $ git commit-graph write --split=merge-all --input=none

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 21 +++++++++++++--------
 builtin/commit-graph.c             | 13 ++++++++++---
 commit-graph.c                     |  6 ++++--
 commit-graph.h                     |  3 ++-
 t/t5324-split-commit-graph.sh      | 26 ++++++++++++++++++++++++++
 5 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index cbf80226e9..d380c42e82 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -39,24 +39,29 @@ COMMANDS
 --------
 'write'::
 
-Write a commit-graph file based on the commits found in packfiles.
+Write a commit-graph file based on the commits specified:
+* With the `--input=stdin-packs` option, generate the new commit graph
+by walking objects only in the specified pack-indexes. (Cannot be
+combined with `--input=stdin-commits` or `--input=reachable`.)
 +
-With the `--input=stdin-packs` option, generate the new commit graph by
-walking objects only in the specified pack-indexes. (Cannot be combined
-with `--input=stdin-commits` or `--input=reachable`.)
-+
-With the `--input=stdin-commits` option, generate the new commit graph
+* With the `--input=stdin-commits` option, generate the new commit graph
 by walking commits starting at the commits specified in stdin as a list
 of OIDs in hex, one OID per line. (Cannot be combined with
 `--input=stdin-packs` or `--input=reachable`.)
 +
-With the `--input=reachable` option, generate the new commit graph by
+* With the `--input=reachable` option, generate the new commit graph by
 walking commits starting at all refs. (Cannot be combined with
 `--input=stdin-commits` or `--input=stdin-packs`.)
 +
-With the `--input=append` option, include all commits that are present
+* With the `--input=append` option, include all commits that are present
 in the existing commit-graph file.
 +
+* With the `--input=none` option, behave as if `--input=append` were
+given, but do not walk other packs to find additional commits.
+
+If none of the above options are given, then commits found in
+packfiles are specified.
++
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
 `<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 03d815e652..937b98e99e 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -10,7 +10,8 @@
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -22,7 +23,8 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 
 static const char * const builtin_commit_graph_write_usage[] = {
 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -31,7 +33,8 @@ enum commit_graph_input {
 	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
 	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
 	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
-	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
+	COMMIT_GRAPH_INPUT_NONE          = (1 << 5)
 };
 
 static struct opts_commit_graph {
@@ -59,6 +62,8 @@ static int option_parse_input(const struct option *opt, const char *arg,
 		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
 	else if (!strcmp(arg, "append"))
 		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else if (!strcmp(arg, "none"))
+		*to |= (COMMIT_GRAPH_INPUT_APPEND | COMMIT_GRAPH_INPUT_NONE);
 	else
 		die(_("unrecognized --input source, %s"), arg);
 	return 0;
@@ -211,6 +216,8 @@ static int graph_write(int argc, const char **argv)
 		opts.obj_dir = get_object_directory();
 	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
+	if (opts.input & COMMIT_GRAPH_INPUT_NONE)
+		flags |= COMMIT_GRAPH_WRITE_NO_INPUT;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
 	if (opts.progress)
diff --git a/commit-graph.c b/commit-graph.c
index 02e6ad9d1f..a5d7624073 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -808,7 +808,8 @@ struct write_commit_graph_context {
 	unsigned append:1,
 		 report_progress:1,
 		 split:1,
-		 check_oids:1;
+		 check_oids:1,
+		 no_input:1;
 
 	const struct split_commit_graph_opts *split_opts;
 };
@@ -1802,6 +1803,7 @@ int write_commit_graph(struct object_directory *odb,
 	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
 	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
 	ctx->split_opts = split_opts;
+	ctx->no_input = flags & COMMIT_GRAPH_WRITE_NO_INPUT ? 1 : 0;
 
 	if (ctx->split) {
 		struct commit_graph *g;
@@ -1860,7 +1862,7 @@ int write_commit_graph(struct object_directory *odb,
 			goto cleanup;
 	}
 
-	if (!pack_indexes && !commit_hex)
+	if (!ctx->no_input && !pack_indexes && !commit_hex)
 		fill_oids_from_all_packs(ctx);
 
 	close_reachable(ctx);
diff --git a/commit-graph.h b/commit-graph.h
index dadcc03808..dd8c00a2d8 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -81,7 +81,8 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
 	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
 	/* Make sure that each OID in the input is a valid commit OID. */
-	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
+	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3),
+	COMMIT_GRAPH_WRITE_NO_INPUT   = (1 << 4)
 };
 
 enum commit_graph_split_flags {
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index dd74295885..296b5a9185 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -369,4 +369,30 @@ test_expect_success '--split=no-merge always writes an incremental' '
 	test_line_count = 2 $graphdir/commit-graph-chain
 '
 
+test_expect_success '--split=no-merge, --input=none writes nothing' '
+	test_when_finished rm -rf a graphs.before graphs.after &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	ls $graphdir/graph-*.graph >graphs.before &&
+	test_line_count = 1 $graphdir/commit-graph-chain &&
+	git commit-graph write --split --input=none &&
+	ls $graphdir/graph-*.graph >graphs.after &&
+	test_cmp graphs.before graphs.after
+'
+
+test_expect_success '--split=merge-all, --input=none merges the chain' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git rev-list -1 HEAD >b &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --input=none &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.dirty


* Re: [PATCH 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (2 preceding siblings ...)
  2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-01-31  0:32 ` Taylor Blau
  2020-01-31 13:26   ` Derrick Stolee
  2020-01-31 14:41 ` Derrick Stolee
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-01-31  0:32 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, gitster

On Thu, Jan 30, 2020 at 04:28:14PM -0800, Taylor Blau wrote:
> Hi,
>
> Here are another few patches that came out of working on GitHub's
> deployment of incremental commit-graphs. These three patches introduce
> two new options: '--split[=<merge-all|no-merge>]' and
> '--input=<source>'.

I should have mentioned: these patches are based on top of my
'tb/commit-graph-use-odb' branch, which I sent to the list in:

  https://lore.kernel.org/git/cover.1580424766.git.me@ttaylorr.com

I think that it makes sense to queue the above other topic first before
this one is applied, at least since these patches are based on that
branch. I can prepare a version that is based on 'master' if that is
preferable; there are only a handful of conflicts to resolve around use
of '->obj_dir' vs '->odb->path' in changing the order.

The two just felt a little too large and disjoint to send as one larger
series.

> [...]

Thanks,
Taylor


* Re: [PATCH 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:32 ` [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
@ 2020-01-31 13:26   ` Derrick Stolee
  0 siblings, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2020-01-31 13:26 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, gitster

On 1/30/2020 7:32 PM, Taylor Blau wrote:
> On Thu, Jan 30, 2020 at 04:28:14PM -0800, Taylor Blau wrote:
>> Hi,
>>
>> Here are another few patches that came out of working on GitHub's
>> deployment of incremental commit-graphs. These three patches introduce
>> two new options: '--split[=<merge-all|no-merge>]' and
>> '--input=<source>'.
> 
> I should have mentioned: these patches are based on top of my
> 'tb/commit-graph-use-odb' branch, which I sent to the list in:
> 
>   https://lore.kernel.org/git/cover.1580424766.git.me@ttaylorr.com
> 
> I think that it makes sense to queue the above other topic first before
> this one is applied, at least since these patches are based on that
> branch. I can prepare a version that is based on 'master' if that is
> preferable, there are only a handful of conflicts to resolve around use
> of '->obj_dir' vs '->odb->path' in changing the order.
> 
> The two just felt a little too large and disjoint to send as one larger
> series.

I think the order you recommend is good, especially because the obj_dir
cleanup is less likely to be controversial. Command-line interface changes
should have more time to cook.

Thanks,
-Stolee



* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
@ 2020-01-31 14:19   ` Derrick Stolee
  2020-02-04  3:47     ` Taylor Blau
  2020-01-31 19:27   ` Martin Ågren
  2020-01-31 23:34   ` SZEDER Gábor
  2 siblings, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2020-01-31 14:19 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, dstolee, gitster

On 1/30/2020 7:28 PM, Taylor Blau wrote:
> With '--split', the commit-graph machinery writes new commits in another
> incremental commit-graph which is part of the existing chain, and
> optionally decides to condense the chain into a single commit-graph.
> This is done to ensure that the asymptotic behavior of looking up a
> commit in an incremental chain is dominated by the number of
> incrementals in that chain. It can be controlled by the '--max-commits'
> and '--size-multiple' options.
> 
> On occasion, callers may want to ensure that 'git commit-graph write
> --split' always writes an incremental, and never spends effort
> condensing the incremental chain [1]. Previously, this was possible by
> passing '--size-multiple=0', but this is no longer the case following
> 63020f175f (commit-graph: prefer default size_mult when given zero,
> 2020-01-02).
> 
> Reintroduce a less-magical variant of the above with a new pair of
> arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
> '--split=no-merge' is given, the commit-graph machinery will never
> condense an existing chain and will always write a new incremental.
> Conversely, if '--split=merge-all' is given, any invocation including it
> will always condense a chain if one exists.  If '--split' is given with
> no arguments, it behaves as before and defers to '--size-multiple', and
> so on.
> 
> [1]: This might occur when, for example, a server administrator running
> some program after each push may want to ensure that each job runs
> proportional in time to the size of the push, and does not "jump" when
> the commit-graph machinery decides to trigger a merge.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  Documentation/git-commit-graph.txt | 18 +++++++++++-----
>  builtin/commit-graph.c             | 33 ++++++++++++++++++++++++++----
>  commit-graph.c                     | 19 +++++++++--------
>  commit-graph.h                     |  7 +++++++
>  t/t5324-split-commit-graph.sh      | 25 ++++++++++++++++++++++
>  5 files changed, 85 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index 28d1fee505..8d61ba9f56 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -57,11 +57,19 @@ or `--stdin-packs`.)
>  With the `--append` option, include all commits that are present in the
>  existing commit-graph file.
>  +
> -With the `--split` option, write the commit-graph as a chain of multiple
> -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> -not already in the commit-graph are added in a new "tip" file. This file
> -is merged with the existing file if the following merge conditions are
> -met:
> +With the `--split[=<strategy>]` option, write the commit-graph as a
> +chain of multiple commit-graph files stored in
> +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> +strategy and other splitting options. The new commits not already in the
> +commit-graph are added in a new "tip" file. This file is merged with the
> +existing file if the following merge conditions are met:
> +* If `--split=merge-all` is specified, then a merge is always
> +conducted, and the remaining options are ignored. Conversely, if
> +`--split=no-merge` is specified, a merge is never performed, and the
> +remaining options are ignored. A bare `--split` defers to the remaining
> +options. (Note that merging a chain of commit graphs replaces the
> +existing chain with a length-1 chain where the first and only
> +incremental holds the entire graph).
>  +
>  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
>  tip file would have `N` commits and the previous tip has `M` commits and
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index de321c71ad..f03b46d627 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -9,7 +9,9 @@
>  
>  static char const * const builtin_commit_graph_usage[] = {
>  	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
> -	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
> +	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> +	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> +	   "[--[no-]progress] <split options>"),
>  	NULL
>  };
>  
> @@ -19,7 +21,9 @@ static const char * const builtin_commit_graph_verify_usage[] = {
>  };
>  
>  static const char * const builtin_commit_graph_write_usage[] = {
> -	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
> +	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> +	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> +	   "[--[no-]progress] <split options>"),
>  	NULL
>  };
>  
> @@ -101,6 +105,25 @@ static int graph_verify(int argc, const char **argv)
>  extern int read_replace_refs;
>  static struct split_commit_graph_opts split_opts;
>  
> +static int write_option_parse_split(const struct option *opt, const char *arg,
> +				    int unset)
> +{
> +	enum commit_graph_split_flags *flags = opt->value;
> +
> +	opts.split = 1;
> +	if (!arg)
> +		return 0;

This allows `--split` to continue working as-is. But should we also
set "*flags = COMMIT_GRAPH_SPLIT_UNSPECIFIED" here? This allows one
to run "git commit-graph write --split=no-merge --split" (which could
happen if "--split=no-merge" is inside an alias).
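
A minimal sketch of the "last one wins" behavior suggested above, written
as a standalone helper rather than as the real parse-options callback.
The enum and function names here (split_strategy, SPLIT_*, parse_split)
are simplified stand-ins invented for illustration and are not the
identifiers used in the patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the series' split-strategy flags. */
enum split_strategy {
	SPLIT_UNSPECIFIED = 0,	/* bare --split: defer to --size-multiple etc. */
	SPLIT_NO_MERGE,		/* --split=no-merge */
	SPLIT_MERGE_ALL		/* --split=merge-all */
};

/*
 * Handle one occurrence of --split[=<strategy>]. 'arg' is NULL for a
 * bare '--split'. Returns 0 on success, -1 on an unknown strategy.
 */
static int parse_split(enum split_strategy *strategy, const char *arg)
{
	assert(strategy != NULL);

	if (!arg) {
		/* Reset, so a later bare --split overrides an earlier value. */
		*strategy = SPLIT_UNSPECIFIED;
		return 0;
	}
	if (!strcmp(arg, "no-merge"))
		*strategy = SPLIT_NO_MERGE;
	else if (!strcmp(arg, "merge-all"))
		*strategy = SPLIT_MERGE_ALL;
	else
		return -1;
	return 0;
}
```

With the reset in place, calling parse_split() with "no-merge" and then
with NULL leaves the strategy unspecified, which is what an alias
expanding to '--split=no-merge' followed by a bare '--split' would need.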

> +test_expect_success '--split=merge-all always merges incrementals' '
> +	test_when_finished rm -rf a b c &&
> +	rm -rf $graphdir $infodir/commit-graph &&
> +	git reset --hard commits/10 &&
> +	git rev-list -3 HEAD~4 >a &&
> +	git rev-list -2 HEAD~2 >b &&
> +	git rev-list -2 HEAD >c &&
> +	git commit-graph write --split=no-merge --stdin-commits <a &&
> +	git commit-graph write --split=no-merge --stdin-commits <b &&
> +	test_line_count = 2 $graphdir/commit-graph-chain &&
> +	git commit-graph write --split=merge-all --stdin-commits <c &&
> +	test_line_count = 1 $graphdir/commit-graph-chain
> +'
> +
> +test_expect_success '--split=no-merge always writes an incremental' '
> +	test_when_finished rm -rf a b &&
> +	rm -rf $graphdir &&
> +	git reset --hard commits/2 &&
> +	git rev-list HEAD~1 >a &&
> +	git rev-list HEAD >b &&
> +	git commit-graph write --split --stdin-commits <a &&
> +	git commit-graph write --split=no-merge --stdin-commits <b &&
> +	test_line_count = 2 $graphdir/commit-graph-chain
> +'
> +
>  test_done

Good tests!

Thanks,
-Stolee



* Re: [PATCH 3/3] builtin/commit-graph.c: support '--input=none'
  2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-01-31 14:40   ` Derrick Stolee
  2020-01-31 19:45   ` Martin Ågren
  1 sibling, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2020-01-31 14:40 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, dstolee, gitster

On 1/30/2020 7:28 PM, Taylor Blau wrote:
> In an earlier commit, we introduced '--split=<no-merge|merge-all>', and
> alluded to the fact that '--split=merge-all' would be useful for callers
> who wish to always trigger a merge of an incremental chain.
> 
> There is a problem with the above approach, which is that there is no
> way to specify to the commit-graph builtin that a caller only wants to
> include commits already in the graph. One can specify '--input=append'
> to include all commits in the existing graphs, but the absence of
> '--input=stdin-{commits,packs}' causes the builtin to call
> 'fill_oids_from_all_packs()'.
> 
> Passing '--input=reachable' (as in 'git commit-graph write
> --split=merge-all --input=reachable --input=append') works around this
> issue by making '--input=reachable' effectively a no-op, but this can be
> prohibitively expensive in large repositories, making it an undesirable
> choice for some users.
> 
> Teach '--input=none' as an option to behave as if '--input=append' were
> given, but to consider no other sources in addition.

The code change looks good.
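
For readers skimming the thread, the effect of the change can be
sketched roughly as follows. This is a simplified model, not the code
from the patch: the INPUT_* names, parse_input() and scans_all_packs()
are hypothetical, and the real check in write_commit_graph() tests
'pack_indexes' and 'commit_hex' rather than a bitmask.

```c
#include <assert.h>
#include <string.h>

/* Illustrative copies of the input bits; the values mirror the patch. */
enum input_source {
	INPUT_REACHABLE     = (1 << 1),
	INPUT_STDIN_PACKS   = (1 << 2),
	INPUT_STDIN_COMMITS = (1 << 3),
	INPUT_APPEND        = (1 << 4),
	INPUT_NONE          = (1 << 5)
};

/* Accumulate one --input=<source> value into 'mask'; -1 if unknown. */
static int parse_input(unsigned *mask, const char *arg)
{
	assert(mask != NULL && arg != NULL);

	if (!strcmp(arg, "reachable"))
		*mask |= INPUT_REACHABLE;
	else if (!strcmp(arg, "stdin-packs"))
		*mask |= INPUT_STDIN_PACKS;
	else if (!strcmp(arg, "stdin-commits"))
		*mask |= INPUT_STDIN_COMMITS;
	else if (!strcmp(arg, "append"))
		*mask |= INPUT_APPEND;
	else if (!strcmp(arg, "none"))
		/* "none" implies "append", plus a bit that blocks the pack scan. */
		*mask |= INPUT_APPEND | INPUT_NONE;
	else
		return -1;
	return 0;
}

/*
 * Model of the fallback: scan every pack only when no explicit commit
 * source was named and "none" was not requested.
 */
static int scans_all_packs(unsigned mask)
{
	if (mask & INPUT_NONE)
		return 0;
	return !(mask & (INPUT_REACHABLE | INPUT_STDIN_PACKS |
			 INPUT_STDIN_COMMITS));
}
```

In this model '--input=append' alone still falls back to scanning all
packs, while '--input=none' keeps the append behavior and skips the
scan, which is the distinction the commit message describes.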

> +test_expect_success '--split=no-merge, --input=none writes nothing' '
> +	test_when_finished rm -rf a graphs.before graphs.after &&
> +	rm -rf $graphdir &&
> +	git reset --hard commits/2 &&
> +	git rev-list -1 HEAD~1 >a &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <a &&
> +	ls $graphdir/graph-*.graph >graphs.before &&
> +	test_line_count = 1 $graphdir/commit-graph-chain &&
> +	git commit-graph write --split --input=none &&
> +	ls $graphdir/graph-*.graph >graphs.after &&
> +	test_cmp graphs.before graphs.after
> +'
> +
> +test_expect_success '--split=merge-all, --input=none merges the chain' '
> +	test_when_finished rm -rf a b &&
> +	rm -rf $graphdir &&
> +	git reset --hard commits/2 &&
> +	git rev-list -1 HEAD~1 >a &&
> +	git rev-list -1 HEAD >b &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <a &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <b &&
> +	test_line_count = 2 $graphdir/commit-graph-chain &&
> +	git commit-graph write --split=merge-all --input=none &&
> +	test_line_count = 1 $graphdir/commit-graph-chain
> +'

And these tests demonstrate the value quite clearly. Thanks!

-Stolee


* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
@ 2020-01-31 14:40   ` Derrick Stolee
  2020-02-04  4:21     ` Taylor Blau
  2020-01-31 19:34   ` Martin Ågren
  1 sibling, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2020-01-31 14:40 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, dstolee, gitster

On 1/30/2020 7:28 PM, Taylor Blau wrote:
> The 'write' mode of the 'commit-graph' supports input from a number of
> different sources: pack indexes over stdin, commits over stdin, commits
> reachable from all references, and so on. Each of these options is
> specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
> 
> Similar to our replacement of 'git config [--<type>]' with 'git config
> [--type=<type>]' (cf. fb0dc3bac1 (builtin/config.c: support
> `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> deprecate '[--<input>]' in favor of '[--input=<source>]'.
> 
> This makes it clearer how to implement new options that are combinations
> of other options (such as, for example, "none", a combination of the old
> "--append" and a new sentinel to specify to _not_ look in other packs,
> which we will implement in a future patch).
> 
> Unfortunately, the new enumerated type is a bitfield, even though it
> makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> compatible with '--append'. For this reason, use a bitfield.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  Documentation/git-commit-graph.txt | 26 +++++-----
>  builtin/commit-graph.c             | 77 ++++++++++++++++++++++--------
>  t/t5318-commit-graph.sh            | 46 +++++++++---------
>  t/t5324-split-commit-graph.sh      | 44 ++++++++---------
>  4 files changed, 114 insertions(+), 79 deletions(-)
> 
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index 8d61ba9f56..cbf80226e9 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -41,21 +41,21 @@ COMMANDS
>  
>  Write a commit-graph file based on the commits found in packfiles.
>  +
> -With the `--stdin-packs` option, generate the new commit graph by
> +With the `--input=stdin-packs` option, generate the new commit graph by
>  walking objects only in the specified pack-indexes. (Cannot be combined
> -with `--stdin-commits` or `--reachable`.)
> +with `--input=stdin-commits` or `--input=reachable`.)
>  +
> -With the `--stdin-commits` option, generate the new commit graph by
> -walking commits starting at the commits specified in stdin as a list
> +With the `--input=stdin-commits` option, generate the new commit graph
> +by walking commits starting at the commits specified in stdin as a list
>  of OIDs in hex, one OID per line. (Cannot be combined with
> -`--stdin-packs` or `--reachable`.)
> +`--input=stdin-packs` or `--input=reachable`.)
>  +
> -With the `--reachable` option, generate the new commit graph by walking
> -commits starting at all refs. (Cannot be combined with `--stdin-commits`
> -or `--stdin-packs`.)
> +With the `--input=reachable` option, generate the new commit graph by
> +walking commits starting at all refs. (Cannot be combined with
> +`--input=stdin-commits` or `--input=stdin-packs`.)
>  +
> -With the `--append` option, include all commits that are present in the
> -existing commit-graph file.
> +With the `--input=append` option, include all commits that are present
> +in the existing commit-graph file.
>  +
>  With the `--split[=<strategy>]` option, write the commit-graph as a
>  chain of multiple commit-graph files stored in
> @@ -107,20 +107,20 @@ $ git commit-graph write
>    using commits in `<pack-index>`.
>  +
>  ------------------------------------------------
> -$ echo <pack-index> | git commit-graph write --stdin-packs
> +$ echo <pack-index> | git commit-graph write --input=stdin-packs
>  ------------------------------------------------
>  
>  * Write a commit-graph file containing all reachable commits.
>  +
>  ------------------------------------------------
> -$ git show-ref -s | git commit-graph write --stdin-commits
> +$ git show-ref -s | git commit-graph write --input=stdin-commits
>  ------------------------------------------------
>  
>  * Write a commit-graph file containing all commits in the current
>    commit-graph file along with those reachable from `HEAD`.
>  +
>  ------------------------------------------------
> -$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
> +$ git rev-parse HEAD | git commit-graph write --input=stdin-commits --input=append
>  ------------------------------------------------
>  
>  
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index f03b46d627..03d815e652 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -10,7 +10,7 @@
>  static char const * const builtin_commit_graph_usage[] = {
>  	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
>  	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> +	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
>  	   "[--[no-]progress] <split options>"),
>  	NULL
>  };
> @@ -22,22 +22,48 @@ static const char * const builtin_commit_graph_verify_usage[] = {
>  
>  static const char * const builtin_commit_graph_write_usage[] = {
>  	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> +	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
>  	   "[--[no-]progress] <split options>"),
>  	NULL
>  };
>  
> +enum commit_graph_input {
> +	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
> +	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
> +	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
> +	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
> +};
> +
>  static struct opts_commit_graph {
>  	const char *obj_dir;
> -	int reachable;
> -	int stdin_packs;
> -	int stdin_commits;
> -	int append;
> +	enum commit_graph_input input;
>  	int split;
>  	int shallow;
>  	int progress;
>  } opts;
>  
> +static int option_parse_input(const struct option *opt, const char *arg,
> +			      int unset)
> +{
> +	enum commit_graph_input *to = opt->value;
> +	if (unset || !strcmp(arg, "packs")) {
> +		*to = 0;
> +		return 0;
> +	}

Here, you _do_ clear the bitfield, allowing "--input=reachable --input"
to do the correct override. Thanks!
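
The accumulate-then-clear behavior can be illustrated in isolation. The following is a minimal sketch, not the patch itself: `parse_input()` is a hypothetical stand-in for `option_parse_input()`, it takes a plain `unsigned` rather than a `struct option`, and it returns -1 where the real callback would `die()`. The `unset` argument models what parse-options passes for the negated `--no-input` form.

```c
#include <assert.h>
#include <string.h>

/* Same bit layout as the enum in the patch above. */
enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
};

/*
 * Hypothetical stand-in for option_parse_input(). Returns 0 on
 * success and -1 for an unknown source (the patch calls die()).
 */
static int parse_input(unsigned *to, const char *arg, int unset)
{
	assert(to);
	if (unset || !strcmp(arg, "packs")) {
		*to = 0;	/* forget every source seen so far */
		return 0;
	}

	if (!strcmp(arg, "reachable"))
		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
	else if (!strcmp(arg, "stdin-packs"))
		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
	else if (!strcmp(arg, "stdin-commits"))
		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
	else if (!strcmp(arg, "append"))
		*to |= COMMIT_GRAPH_INPUT_APPEND;
	else
		return -1;
	return 0;
}
```

Calling it with "reachable" and then with `unset` set leaves the value at 0, while "stdin-commits" followed by "append" leaves both bits set.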

> +
> +	if (!strcmp(arg, "reachable"))
> +		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
> +	else if (!strcmp(arg, "stdin-packs"))
> +		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
> +	else if (!strcmp(arg, "stdin-commits"))
> +		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
> +	else if (!strcmp(arg, "append"))
> +		*to |= COMMIT_GRAPH_INPUT_APPEND;
> +	else
> +		die(_("unrecognized --input source, %s"), arg);
> +	return 0;
> +}
> +
>  static struct object_directory *find_odb_or_die(struct repository *r,
>  						const char *obj_dir)
>  {
> @@ -137,14 +163,21 @@ static int graph_write(int argc, const char **argv)
>  		OPT_STRING(0, "object-dir", &opts.obj_dir,
>  			N_("dir"),
>  			N_("The object directory to store the graph")),
> -		OPT_BOOL(0, "reachable", &opts.reachable,
> -			N_("start walk at all refs")),
> -		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
> -			N_("scan pack-indexes listed by stdin for commits")),
> -		OPT_BOOL(0, "stdin-commits", &opts.stdin_commits,
> -			N_("start walk at commits listed by stdin")),
> -		OPT_BOOL(0, "append", &opts.append,
> -			N_("include all commits already in the commit-graph file")),
> +		OPT_CALLBACK(0, "input", &opts.input, NULL,
> +			N_("include commits from this source in the graph"),
> +			option_parse_input),
> +		OPT_BIT(0, "reachable", &opts.input,
> +			N_("start walk at all refs"),
> +			COMMIT_GRAPH_INPUT_REACHABLE),
> +		OPT_BIT(0, "stdin-packs", &opts.input,
> +			N_("scan pack-indexes listed by stdin for commits"),
> +			COMMIT_GRAPH_INPUT_STDIN_PACKS),
> +		OPT_BIT(0, "stdin-commits", &opts.input,
> +			N_("start walk at commits listed by stdin"),
> +			COMMIT_GRAPH_INPUT_STDIN_COMMITS),
> +		OPT_BIT(0, "append", &opts.input,
> +			N_("include all commits already in the commit-graph file"),
> +			COMMIT_GRAPH_INPUT_APPEND),

Since you are rewriting how we interpret the deprecated options, perhaps we
should keep some tests around that call these versions? It would make the
test diff be a bit smaller. These options can be removed from the tests if/when
we actually remove the options.

> @@ -351,10 +351,10 @@ test_expect_success '--split=merge-all always merges incrementals' '
>  	git rev-list -3 HEAD~4 >a &&
>  	git rev-list -2 HEAD~2 >b &&
>  	git rev-list -2 HEAD >c &&
> -	git commit-graph write --split=no-merge --stdin-commits <a &&
> -	git commit-graph write --split=no-merge --stdin-commits <b &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <a &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <b &&
>  	test_line_count = 2 $graphdir/commit-graph-chain &&
> -	git commit-graph write --split=merge-all --stdin-commits <c &&
> +	git commit-graph write --split=merge-all --input=stdin-commits <c &&
>  	test_line_count = 1 $graphdir/commit-graph-chain
>  '
>  
> @@ -364,8 +364,8 @@ test_expect_success '--split=no-merge always writes an incremental' '
>  	git reset --hard commits/2 &&
>  	git rev-list HEAD~1 >a &&
>  	git rev-list HEAD >b &&
> -	git commit-graph write --split --stdin-commits <a &&
> -	git commit-graph write --split=no-merge --stdin-commits <b &&
> +	git commit-graph write --split --input=stdin-commits <a &&
> +	git commit-graph write --split=no-merge --input=stdin-commits <b &&
>  	test_line_count = 2 $graphdir/commit-graph-chain
>  '

Updating these new tests with the given options is good. Perhaps convert only one
of the old tests for each of the stdin-packs, reachable, "", and "append" options?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (3 preceding siblings ...)
  2020-01-31  0:32 ` [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
@ 2020-01-31 14:41 ` Derrick Stolee
  2020-02-04 23:44 ` Junio C Hamano
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2020-01-31 14:41 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, dstolee, gitster

On 1/30/2020 7:28 PM, Taylor Blau wrote:
> Hi,
> 
> Here are another few patches that came out of working on GitHub's
> deployment of incremental commit-graphs. These three patches introduce
> two new options: '--split[=<merge-all|no-merge>]' and
> '--input=<source>'.
> 
> The former controls whether or not commit-graph's split machinery should
> either write an incremental commit graph, squash the chain of
> incrementals, or defer to the other options.
> 
> (This comes from GitHub's desire to have more fine-grained control over
> the commit-graph chain's behavior. We run short jobs after every push
> that we would like to limit the running time of, and hence we do not
> want to ever merge a long chain of incrementals unless we specifically
> opt into that.)

I can imagine many scenarios that require the amount of work to be
predictable, which the current merge strategies do not guarantee.

> The latter of the two new options does two things:
> 
>   * It cleans up the many options that specify input sources (e.g.,
>     '--stdin-commits', '--stdin-packs', '--reachable' and so on) under
>     one unifying name.
> 
>   * It allows us to introduce a new argument '--input=none', to prevent
>     walking each packfile when neither '--stdin-commits' nor
>     '--stdin-packs' was given.

I'm happy with these goals.

> Together, these have the combined effect of being able to write the
> following two new invocations:
> 
>   $ git commit-graph write --split=merge-all --input=none
> 
>   $ git commit-graph write --split=no-merge --input=stdin-packs
> 
> to (1) merge the chain, and (2) write a single new incremental.

This is much cleaner than adding yet another option to the builtin,
and allows more flexibility in future extensions.

I have a few comments on the patches, but they are minor.

Thanks!
-Stolee


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
  2020-01-31 14:19   ` Derrick Stolee
@ 2020-01-31 19:27   ` Martin Ågren
  2020-02-04  4:06     ` Taylor Blau
  2020-01-31 23:34   ` SZEDER Gábor
  2 siblings, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-01-31 19:27 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, 31 Jan 2020 at 01:29, Taylor Blau <me@ttaylorr.com> wrote:
> With '--split', the commit-graph machinery writes new commits in another
> incremental commit-graph which is part of the existing chain, and
> optionally decides to condense the chain into a single commit-graph.
> This is done to ensure that the aysmptotic behavior of looking up a

asymptotic

> commit in an incremental chain is dominated by the number of
> incrementals in that chain. It can be controlled by the '--max-commits'
> and '--size-multiple' options.
>
> On occasion, callers may want to ensure that 'git commit-graph write
> --split' always writes an incremental, and never spends effort
> condensing the incremental chain [1]. Previously, this was possible by
> passing '--size-multiple=0', but this no longer the case following
> 63020f175f (commit-graph: prefer default size_mult when given zero,
> 2020-01-02).
>
> Reintroduce a less-magical variant of the above with a new pair of
> arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
> '--split=no-merge' is given, the commit-graph machinery will never
> condense an existing chain and will always write a new incremental.
> Conversely, if '--split=merge-all' is given, any invocation including it
> will always condense a chain if one exists.  If '--split' is given with
> no arguments, it behaves as before and defers to '--size-multiple', and
> so on.

I understand your motivation for doing this -- it all seems quite sound
to me. Not being too familiar with this commit-graph splitting and
merging, I had a hard time grokking this terminology though. From what I
understand, before this patch, `--split` means "write the commit-graph
using the 'split' file-format / in a split fashion". Ok, that makes
sense.

From seeing `--split=no-merge`, I have no idea how to even parse that.
Of course I don't want to merge, I want to split! Well not split, but
write split files.

I think it would help me (and others like me) if we could somehow
separate "I want to use 'split' files" from "and here's how I want you
to decide on the merging". That is, which "strategy" to use. Obviously,
talking about a "merge strategy" would be stupid and "split strategy"
also seems a bit odd. "Coalescing strategy"? "Joining strategy"?

Or can you convince me otherwise? From which angle should I look at
this?

> -With the `--split` option, write the commit-graph as a chain of multiple
> -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> -not already in the commit-graph are added in a new "tip" file. This file
> -is merged with the existing file if the following merge conditions are
> -met:
> +With the `--split[=<strategy>]` option, write the commit-graph as a
> +chain of multiple commit-graph files stored in
> +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> +strategy and other splitting options. The new commits not already in the
> +commit-graph are added in a new "tip" file. This file is merged with the
> +existing file if the following merge conditions are met:
> +* If `--split=merge-always` is specified, then a merge is always
> +conducted, and the remaining options are ignored. Conversely, if
> +`--split=no-merge` is specified, a merge is never performed, and the
> +remaining options are ignored. A bare `--split` defers to the remaining
> +options. (Note that merging a chain of commit graphs replaces the
> +existing chain with a length-1 chain where the first and only
> +incremental holds the entire graph).

To better understand the background for this patch, I read the manpage
as it stands today. From the section on `--split`, I got this
impression: Let's say that `--max-commits` is huge, so all that matters
is the `--size-multiple`. Let's say it's two. If the current tip
contains three commits and we're about to write one with two, then 2*2 >
3 so we will merge, i.e., write a tip file with five commits. Unless of
course *that* is more than half the size of the file before. And so on.
We might just merge once, or maybe "many" files in an avalanche effect.
Every now and then, such avalanches will go all the way back to the
first file.
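
That reading can be checked against the pre-patch `while` loop in `split_graph_merge_strategy()`, which is quoted in the diff context elsewhere in this thread. Below is a rough model of that loop only, under stated assumptions: the chain is a plain array of layer sizes (`layers[0]` is the base, `layers[n - 1]` the tip), the object-directory check is left out, and `layers_after_write()` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the pre-patch merge loop. Returns the number of files in
 * the chain after the write; *tip_size (if non-NULL) receives the
 * number of commits in the new tip file.
 */
static size_t layers_after_write(const size_t *layers, size_t n,
				 size_t new_commits, size_t size_mult,
				 size_t max_commits, size_t *tip_size)
{
	size_t num_commits = new_commits;
	size_t after = n + 1;	/* assume a new tip is added ... */
	size_t i = n;

	assert(layers || !n);
	/* ... then absorb layers from the tip downwards while a rule holds */
	while (i > 0 &&
	       (layers[i - 1] <= size_mult * num_commits ||
		(max_commits && num_commits > max_commits))) {
		num_commits += layers[i - 1];
		i--;
		after--;
	}

	if (tip_size)
		*tip_size = num_commits;
	return after;
}
```

With layers of 100 and 3 commits, two new commits and a size multiple of 2, the model merges only the tip (3 <= 2*2) and leaves a two-file chain whose tip holds five commits. With layers of 8, 4 and 3, the same write cascades all the way down to a single file of 17 commits.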

Now this says something different, namely that once we decide to merge,
we do it all the way back, no matter what.

The commit message of 1771be90c8 ("commit-graph: merge commit-graph
chains", 2019-06-18) seems to support my original understanding, at
least for `--size-multiple`, but not `--max-commits`, curiously enough.

Can you clarify?

> -               OPT_BOOL(0, "split", &opts.split,
> -                       N_("allow writing an incremental commit-graph file")),
> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> +                       N_("allow writing an incremental commit-graph file"),

This still sounds very boolean. Cramming in the "strategy" might be hard
-- is this an argument in favor of having two separate options? ;-)

> +enum commit_graph_split_flags {
> +       COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
> +       COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
> +       COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
> +};

I wonder if this should be "MERGE_AUTO" rather than "UNSPECIFIED". This
is related to Stolee's comment, I think.


Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
  2020-01-31 14:40   ` Derrick Stolee
@ 2020-01-31 19:34   ` Martin Ågren
  2020-02-04  4:51     ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-01-31 19:34 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> The 'write' mode of the 'commit-graph' supports input from a number of
> different sources: pack indexes over stdin, commits over stdin, commits
> reachable from all references, and so on. Each of these options are
> specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
>
> Similar to our replacement of 'git config [--<type>]' with 'git config
> [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> deprecate '[--<input>]' in favor of '[--input=<source>]'.
>
> This makes it more clear to implement new options that are combinations
> of other options (such as, for example, "none", a combination of the old
> "--append" and a new sentinel to specify to _not_ look in other packs,
> which we will implement in a future patch).

Makes sense.

> Unfortunately, the new enumerated type is a bitfield, even though it
> makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> compatible with '--append'. For this reason, use a bitfield.

> -With the `--append` option, include all commits that are present in the
> -existing commit-graph file.
> +With the `--input=append` option, include all commits that are present
> +in the existing commit-graph file.

Would it be too crazy to call this `--input=existing` instead, and have
it be the same as `--append`? I find that `--append` makes a lot of
sense (it's a mode we can turn on or off), whereas "input = append"
seems more odd.

From the next commit message, we learn that a lone `--input=append`
triggers `fill_oids_from_all_packs()`, which wouldn't match my
expectation of "--input=existing". So...

Does this hint that we could leave `--append` alone? We'd have lots of
different inputs to choose from using `--input`, and an `--append` mode
on top of that. That would make your inputs truly mutually exclusive and
you don't need the bitfield anymore, as you mention above.  Hmm?

Would that mean that the falling back to `fill_oids_from_all_packs()`
would follow from "is there an --input?", as opposed to from "is there
an --input except --input=append?"?
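
To make the alternative concrete, here is a small sketch of what "exclusive inputs plus a separate append mode" could look like. Everything in it is hypothetical (`enum input_source`, `struct write_opts`, `set_input()`, `scans_all_packs()`); it is not taken from the series and only illustrates that the fallback test reduces to "was any `--input` given?".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: sources are mutually exclusive, so no bitfield. */
enum input_source {
	INPUT_UNSET = 0,
	INPUT_REACHABLE,
	INPUT_STDIN_PACKS,
	INPUT_STDIN_COMMITS
};

struct write_opts {
	enum input_source input;
	int append;	/* independent on/off mode, like --append today */
};

/* Returns 0 on success, -1 on an unknown or conflicting source. */
static int set_input(struct write_opts *o, const char *arg)
{
	enum input_source src;

	assert(o && arg);
	if (!strcmp(arg, "reachable"))
		src = INPUT_REACHABLE;
	else if (!strcmp(arg, "stdin-packs"))
		src = INPUT_STDIN_PACKS;
	else if (!strcmp(arg, "stdin-commits"))
		src = INPUT_STDIN_COMMITS;
	else
		return -1;

	if (o->input != INPUT_UNSET && o->input != src)
		return -1;	/* two different sources were requested */
	o->input = src;
	return 0;
}

/* The fallback no longer needs to special-case "append". */
static int scans_all_packs(const struct write_opts *o)
{
	assert(o);
	return o->input == INPUT_UNSET;
}
```

In this shape, setting `append` never affects `scans_all_packs()`, and asking for a second, different source is rejected rather than OR-ed in.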

(I don't know whether these inputs really *have* to be exclusive, or if
that's more of an implementation detail. That is, even without an
"append" input, might we one day be able to handle more inputs at once?
Maybe this is not the time to worry about that.)


Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 3/3] builtin/commit-graph.c: support '--input=none'
  2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
  2020-01-31 14:40   ` Derrick Stolee
@ 2020-01-31 19:45   ` Martin Ågren
  2020-02-04  5:01     ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-01-31 19:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> In the previous commit, we introduced '--[no-]merge', and alluded to the
> fact that '--merge' would be useful for callers who wish to always
> trigger a merge of an incremental chain.

Hmmm. So it looks like you've already had similar thoughts as I did
about patch 1/3. At some point, you had a separate `--merge=...` option,
then later made that `--split=...`. :-) Could you say something about why
you changed your mind?

> There is a problem with the above approach, which is that there is no
> way to specify to the commit-graph builtin that a caller only wants to
> include commits already in the graph. One can specify '--input=append'
> to include all commits in the existing graphs, but the absence of
> '--input=stdin-{commits,packs}' causes the builtin to call
> 'fill_oids_from_all_packs()'.

(Use one of those options with an empty stdin? Anyway, let's read on.)

> Passing '--input=reachable' (as in 'git commit-graph write
> --split=merge-all --input=reachable --input=append') works around this
> issue by making '--input=reachable' effectively a no-op, but this can be
> prohibitively expensive in large repositories, making it an undesirable
> choice for some users.
>
> Teach '--input=none' as an option to behave as if '--input=append' were
> given, but to consider no other sources in addition.

`--input=none` almost makes me wonder if it should produce an empty
commit-graph. But there wouldn't be much point in that... I guess
another way of defining this would be that it "uses no input, and
implies `--append`".
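
For comparison, the commit message of patch 2/3 describes "none" as the old `--append` combined with a new sentinel meaning "do not look in other packs". A sketch of that reading follows. The sentinel name `COMMIT_GRAPH_INPUT_NONE`, its bit value, and both helper functions are assumptions, since the code of patch 3/3 is not quoted here.

```c
#include <assert.h>

/* The four bits from patch 2/3, plus one assumed sentinel bit. */
enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
	COMMIT_GRAPH_INPUT_NONE          = (1 << 5)	/* hypothetical */
};

/* What "--input=none" might set: append, plus "look nowhere else". */
static unsigned input_none(void)
{
	return COMMIT_GRAPH_INPUT_APPEND | COMMIT_GRAPH_INPUT_NONE;
}

/* Assumed rule: scan every pack unless a bit other than APPEND is set. */
static int falls_back_to_all_packs(unsigned input)
{
	return !(input & ~(unsigned)COMMIT_GRAPH_INPUT_APPEND);
}

/* Usage sketch; call from main() or a test harness. */
static void demo_input_none(void)
{
	assert(falls_back_to_all_packs(0));
	assert(falls_back_to_all_packs(COMMIT_GRAPH_INPUT_APPEND));
	assert(!falls_back_to_all_packs(input_none()));
}
```

Under this rule a lone "append" still triggers the scan of all packs, which is the behavior the commit message complains about, while "none" suppresses it.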

> This, in conjunction with the option introduced in the previous patch
> offers the convenient way to force the commit-graph machinery to
> condense a chain of incrementals without requiring any new commits:
>
>   $ git commit-graph write --split=merge-all --input=none

Right.

> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -39,24 +39,29 @@ COMMANDS
>  --------
>  'write'::
>
> -Write a commit-graph file based on the commits found in packfiles.
> +Write a commit-graph file based on the commits specified:
> +* With the `--input=stdin-packs` option, generate the new commit graph
> +by walking objects only in the specified pack-indexes. (Cannot be
> +combined with `--input=stdin-commits` or `--input=reachable`.)
>  +
> -With the `--input=stdin-packs` option, generate the new commit graph by
> -walking objects only in the specified pack-indexes. (Cannot be combined
> -with `--input=stdin-commits` or `--input=reachable`.)
> -+
> -With the `--input=stdin-commits` option, generate the new commit graph
> +* With the `--input=stdin-commits` option, generate the new commit graph
>  by walking commits starting at the commits specified in stdin as a list
>  of OIDs in hex, one OID per line. (Cannot be combined with
>  `--input=stdin-packs` or `--input=reachable`.)
>  +
> -With the `--input=reachable` option, generate the new commit graph by
> +* With the `--input=reachable` option, generate the new commit graph by
>  walking commits starting at all refs. (Cannot be combined with
>  `--input=stdin-commits` or `--input=stdin-packs`.)
>  +
> -With the `--input=append` option, include all commits that are present
> +* With the `--input=append` option, include all commits that are present
>  in the existing commit-graph file.

Do these changes above really belong in this commit?

> +* With the `--input=none` option, behave as if `input=append` were
> +given, but do not walk other packs to find additional commits.
> +
> +If none of the above options are given, then commits found in
> +packfiles are specified.

"specified"? Plus, that also happens for `--input=append` right? (It
really seems like "append" is an odd one among all the inputs.)

>         N_("git commit-graph write [--object-dir <objdir>] [--append] "
> -          "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
> +          "[--split[=<strategy>]] "
> +          "[--input=<reachable|stdin-packs|stdin-commits|none>] "
>            "[--[no-]progress] <split options>"),

Hmm, you've left "--append" the old way.


Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
  2020-01-31 14:19   ` Derrick Stolee
  2020-01-31 19:27   ` Martin Ågren
@ 2020-01-31 23:34   ` SZEDER Gábor
  2020-02-01 21:25     ` Johannes Schindelin
  2020-02-04  3:59     ` Taylor Blau
  2 siblings, 2 replies; 58+ messages in thread
From: SZEDER Gábor @ 2020-01-31 23:34 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, gitster

On Thu, Jan 30, 2020 at 04:28:17PM -0800, Taylor Blau wrote:
> diff --git a/commit-graph.c b/commit-graph.c
> index 6d34829f57..02e6ad9d1f 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  	num_commits = ctx->commits.nr;
>  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
>  
> -	while (g && (g->num_commits <= size_mult * num_commits ||
> -		    (max_commits && num_commits > max_commits))) {
> -		if (g->odb != ctx->odb)
> -			break;
> +	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {

This line segfaults in the tests 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' in 't5510-fetch.sh', because
'git fetch' doesn't pass a 'split_opts' to the commit-graph functions.

Thread 1 "git" received signal SIGSEGV, Segmentation fault.
0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
    at commit-graph.c:1568
1568            if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
(gdb) p *ctx
$1 = {r = 0x9ae2a0 <the_repo>, odb = 0x9c0df0, graph_name = 0x0, oids = {
    list = 0xa02660, nr = 12, alloc = 1024}, commits = {list = 0x9caa00, 
    nr = 6, alloc = 6}, num_extra_edges = 0, approx_nr_objects = 0, 
  progress = 0x0, progress_done = 0, progress_cnt = 0, base_graph_name = 0x0, 
  num_commit_graphs_before = 0, num_commit_graphs_after = 1, 
  commit_graph_filenames_before = 0x0, commit_graph_filenames_after = 0x0, 
  commit_graph_hash_after = 0x0, new_num_commits_in_base = 0, 
  new_base_graph = 0x0, append = 0, report_progress = 1, split = 1, 
  check_oids = 0, split_opts = 0x0}
                  ^^^^^^^^^^^^^^^^
(gdb) bt
#0  0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
    at commit-graph.c:1568
#1  0x0000000000512446 in write_commit_graph (odb=0x9c0df0, pack_indexes=0x0, 
    commit_hex=0x7fffffffd550, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT), 
    split_opts=0x0) at commit-graph.c:1891
#2  0x000000000050fd86 in write_commit_graph_reachable (odb=0x9c0df0, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT), 
    split_opts=0x0) at commit-graph.c:1174
    ^^^^^^^^^^^^^^
#3  0x0000000000444ea4 in cmd_fetch (argc=0, argv=0x7fffffffd9b8, prefix=0x0)
    at builtin/fetch.c:1873
#4  0x0000000000406154 in run_builtin (p=0x969a88 <commands+840>, argc=2, 
    argv=0x7fffffffd9b0) at git.c:444
#5  0x00000000004064a4 in handle_builtin (argc=2, argv=0x7fffffffd9b0)
    at git.c:674
#6  0x000000000040674c in run_argv (argcp=0x7fffffffd84c, argv=0x7fffffffd840)
    at git.c:741
#7  0x0000000000406bbd in cmd_main (argc=2, argv=0x7fffffffd9b0) at git.c:872
#8  0x00000000004cfd4e in main (argc=5, argv=0x7fffffffd998)
    at common-main.c:52

Note that this function split_graph_merge_strategy() does look at
various fields in 'ctx->split_opts' a bit earlier, but those accesses
are protected by a 'if (ctx->split_opts)' condition.
expire_commit_graphs() does the same.
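
One possible way to keep every read of the flags safe is to route it through a small accessor. This is only a sketch: `split_flags()` is a hypothetical helper, and the two structs are trimmed-down stand-ins that carry just the fields needed here. The enum values are the ones introduced by the patch.

```c
#include <assert.h>
#include <stddef.h>

enum commit_graph_split_flags {
	COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
};

/* Stand-ins for the real structs; only the fields used below. */
struct split_commit_graph_opts {
	enum commit_graph_split_flags flags;
};

struct write_commit_graph_context {
	const struct split_commit_graph_opts *split_opts;
};

/*
 * Hypothetical helper: callers that pass no split_opts (as 'git
 * fetch' does in the backtrace above) are treated as "unspecified".
 */
static enum commit_graph_split_flags
split_flags(const struct write_commit_graph_context *ctx)
{
	assert(ctx);
	return ctx->split_opts ? ctx->split_opts->flags
			       : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
}
```

With such an accessor, a NULL `split_opts` would still go through the size-multiple and max-commits checks, as it did before the patch, instead of dereferencing a NULL pointer.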


> +		while (g && (g->num_commits <= size_mult * num_commits ||
> +			    (max_commits && num_commits > max_commits) ||
> +			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> +			if (g->odb != ctx->odb)
> +				break;
>  
> -		num_commits += g->num_commits;
> -		g = g->base_graph;
> +			num_commits += g->num_commits;
> +			g = g->base_graph;
>  
> -		ctx->num_commit_graphs_after--;
> +			ctx->num_commit_graphs_after--;
> +		}
>  	}
>  

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31 23:34   ` SZEDER Gábor
@ 2020-02-01 21:25     ` Johannes Schindelin
  2020-02-03 10:47       ` SZEDER Gábor
  2020-02-04  3:59       ` Taylor Blau
  2020-02-04  3:59     ` Taylor Blau
  1 sibling, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2020-02-01 21:25 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster

Hi,

On Sat, 1 Feb 2020, SZEDER Gábor wrote:

> On Thu, Jan 30, 2020 at 04:28:17PM -0800, Taylor Blau wrote:
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 6d34829f57..02e6ad9d1f 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> >  	num_commits = ctx->commits.nr;
> >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> >
> > -	while (g && (g->num_commits <= size_mult * num_commits ||
> > -		    (max_commits && num_commits > max_commits))) {
> > -		if (g->odb != ctx->odb)
> > -			break;
> > +	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
>
> This line segfaults in the tests 'fetch.writeCommitGraph' and
> 'fetch.writeCommitGraph with submodules' in 't5510-fetch.sh', because
> 'git fetch' doesn't pass a 'split_opts' to the commit-graph functions.

I noticed the same. This patch seems to fix it for me:

-- snip --
diff --git a/commit-graph.c b/commit-graph.c
index a5d7624073f..af5c58833cf 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1566,7 +1566,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 	num_commits = ctx->commits.nr;
 	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;

-	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
+	if (ctx->split_opts &&
+	    ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
 		while (g && (g->num_commits <= size_mult * num_commits ||
 			    (max_commits && num_commits > max_commits) ||
 			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
-- snap --

For your convenience, I also pushed this up as
`tb/commit-graph-split-merge` to https://github.com/dscho/git

Thanks,
Dscho


>
> Thread 1 "git" received signal SIGSEGV, Segmentation fault.
> 0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
>     at commit-graph.c:1568
> 1568            if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> (gdb) p *ctx
> $1 = {r = 0x9ae2a0 <the_repo>, odb = 0x9c0df0, graph_name = 0x0, oids = {
>     list = 0xa02660, nr = 12, alloc = 1024}, commits = {list = 0x9caa00,
>     nr = 6, alloc = 6}, num_extra_edges = 0, approx_nr_objects = 0,
>   progress = 0x0, progress_done = 0, progress_cnt = 0, base_graph_name = 0x0,
>   num_commit_graphs_before = 0, num_commit_graphs_after = 1,
>   commit_graph_filenames_before = 0x0, commit_graph_filenames_after = 0x0,
>   commit_graph_hash_after = 0x0, new_num_commits_in_base = 0,
>   new_base_graph = 0x0, append = 0, report_progress = 1, split = 1,
>   check_oids = 0, split_opts = 0x0}
>                   ^^^^^^^^^^^^^^^^
> (gdb) bt
> #0  0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
>     at commit-graph.c:1568
> #1  0x0000000000512446 in write_commit_graph (odb=0x9c0df0, pack_indexes=0x0,
>     commit_hex=0x7fffffffd550,
>     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
>     split_opts=0x0) at commit-graph.c:1891
> #2  0x000000000050fd86 in write_commit_graph_reachable (odb=0x9c0df0,
>     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
>     split_opts=0x0) at commit-graph.c:1174
>     ^^^^^^^^^^^^^^
> #3  0x0000000000444ea4 in cmd_fetch (argc=0, argv=0x7fffffffd9b8, prefix=0x0)
>     at builtin/fetch.c:1873
> #4  0x0000000000406154 in run_builtin (p=0x969a88 <commands+840>, argc=2,
>     argv=0x7fffffffd9b0) at git.c:444
> #5  0x00000000004064a4 in handle_builtin (argc=2, argv=0x7fffffffd9b0)
>     at git.c:674
> #6  0x000000000040674c in run_argv (argcp=0x7fffffffd84c, argv=0x7fffffffd840)
>     at git.c:741
> #7  0x0000000000406bbd in cmd_main (argc=2, argv=0x7fffffffd9b0) at git.c:872
> #8  0x00000000004cfd4e in main (argc=5, argv=0x7fffffffd998)
>     at common-main.c:52
>
> Note that this function split_graph_merge_strategy() does look at
> various fields in 'ctx->split_opts' a bit earlier, but those accesses
> are protected by a 'if (ctx->split_opts)' condition.
> expire_commit_graphs() does the same.
>
>
> > +		while (g && (g->num_commits <= size_mult * num_commits ||
> > +			    (max_commits && num_commits > max_commits) ||
> > +			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > +			if (g->odb != ctx->odb)
> > +				break;
> >
> > -		num_commits += g->num_commits;
> > -		g = g->base_graph;
> > +			num_commits += g->num_commits;
> > +			g = g->base_graph;
> >
> > -		ctx->num_commit_graphs_after--;
> > +			ctx->num_commit_graphs_after--;
> > +		}
> >  	}
> >
>


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-01 21:25     ` Johannes Schindelin
@ 2020-02-03 10:47       ` SZEDER Gábor
  2020-02-03 11:11         ` Jeff King
  2020-02-04  3:59       ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: SZEDER Gábor @ 2020-02-03 10:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Taylor Blau, git, peff, dstolee, gitster

On Sat, Feb 01, 2020 at 10:25:36PM +0100, Johannes Schindelin wrote:
> Hi,
> 
> On Sat, 1 Feb 2020, SZEDER Gábor wrote:
> 
> > On Thu, Jan 30, 2020 at 04:28:17PM -0800, Taylor Blau wrote:
> > > diff --git a/commit-graph.c b/commit-graph.c
> > > index 6d34829f57..02e6ad9d1f 100644
> > > --- a/commit-graph.c
> > > +++ b/commit-graph.c
> > > @@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> > >  	num_commits = ctx->commits.nr;
> > >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> > >
> > > -	while (g && (g->num_commits <= size_mult * num_commits ||
> > > -		    (max_commits && num_commits > max_commits))) {
> > > -		if (g->odb != ctx->odb)
> > > -			break;
> > > +	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> >
> > This line segfaults in the tests 'fetch.writeCommitGraph' and
> > 'fetch.writeCommitGraph with submodules' in 't5510-fetch.sh', because
> > 'git fetch' doesn't pass a 'split_opts' to the commit-graph functions.
> 
> I noticed the same. This patch seems to fix it for me:
> 
> -- snip --
> diff --git a/commit-graph.c b/commit-graph.c
> index a5d7624073f..af5c58833cf 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1566,7 +1566,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  	num_commits = ctx->commits.nr;
>  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> 
> -	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> +	if (ctx->split_opts &&
> +	    ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
>  		while (g && (g->num_commits <= size_mult * num_commits ||
>  			    (max_commits && num_commits > max_commits) ||
>  			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> -- snap --

Yeah, that's what I noted below, but I'm not sure that this is the
right solution.  Why doesn't cmd_fetch() call
write_commit_graph_reachable() with a real 'split_opts' parameter in
the first place?  Wouldn't it be better if it did?


> > Thread 1 "git" received signal SIGSEGV, Segmentation fault.
> > 0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
> >     at commit-graph.c:1568
> > 1568            if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> > (gdb) p *ctx
> > $1 = {r = 0x9ae2a0 <the_repo>, odb = 0x9c0df0, graph_name = 0x0, oids = {
> >     list = 0xa02660, nr = 12, alloc = 1024}, commits = {list = 0x9caa00,
> >     nr = 6, alloc = 6}, num_extra_edges = 0, approx_nr_objects = 0,
> >   progress = 0x0, progress_done = 0, progress_cnt = 0, base_graph_name = 0x0,
> >   num_commit_graphs_before = 0, num_commit_graphs_after = 1,
> >   commit_graph_filenames_before = 0x0, commit_graph_filenames_after = 0x0,
> >   commit_graph_hash_after = 0x0, new_num_commits_in_base = 0,
> >   new_base_graph = 0x0, append = 0, report_progress = 1, split = 1,
> >   check_oids = 0, split_opts = 0x0}
> >                   ^^^^^^^^^^^^^^^^
> > (gdb) bt
> > #0  0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
> >     at commit-graph.c:1568
> > #1  0x0000000000512446 in write_commit_graph (odb=0x9c0df0, pack_indexes=0x0,
> >     commit_hex=0x7fffffffd550,
> >     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
> >     split_opts=0x0) at commit-graph.c:1891
> > #2  0x000000000050fd86 in write_commit_graph_reachable (odb=0x9c0df0,
> >     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
> >     split_opts=0x0) at commit-graph.c:1174
> >     ^^^^^^^^^^^^^^
> > #3  0x0000000000444ea4 in cmd_fetch (argc=0, argv=0x7fffffffd9b8, prefix=0x0)
> >     at builtin/fetch.c:1873
> > #4  0x0000000000406154 in run_builtin (p=0x969a88 <commands+840>, argc=2,
> >     argv=0x7fffffffd9b0) at git.c:444
> > #5  0x00000000004064a4 in handle_builtin (argc=2, argv=0x7fffffffd9b0)
> >     at git.c:674
> > #6  0x000000000040674c in run_argv (argcp=0x7fffffffd84c, argv=0x7fffffffd840)
> >     at git.c:741
> > #7  0x0000000000406bbd in cmd_main (argc=2, argv=0x7fffffffd9b0) at git.c:872
> > #8  0x00000000004cfd4e in main (argc=5, argv=0x7fffffffd998)
> >     at common-main.c:52
> >
> > Note that this function split_graph_merge_strategy() does look at
> > various fields in 'ctx->split_opts' a bit earlier, but those accesses
> > are protected by a 'if (ctx->split_opts)' condition.
> > expire_commit_graphs() does the same.
> >
> >
> > > +		while (g && (g->num_commits <= size_mult * num_commits ||
> > > +			    (max_commits && num_commits > max_commits) ||
> > > +			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > > +			if (g->odb != ctx->odb)
> > > +				break;
> > >
> > > -		num_commits += g->num_commits;
> > > -		g = g->base_graph;
> > > +			num_commits += g->num_commits;
> > > +			g = g->base_graph;
> > >
> > > -		ctx->num_commit_graphs_after--;
> > > +			ctx->num_commit_graphs_after--;
> > > +		}
> > >  	}
> > >
> >



* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-03 10:47       ` SZEDER Gábor
@ 2020-02-03 11:11         ` Jeff King
  2020-02-04  3:58           ` Taylor Blau
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2020-02-03 11:11 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Johannes Schindelin, Taylor Blau, git, dstolee, gitster

On Mon, Feb 03, 2020 at 11:47:30AM +0100, SZEDER Gábor wrote:

> > -- snip --
> > diff --git a/commit-graph.c b/commit-graph.c
> > index a5d7624073f..af5c58833cf 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -1566,7 +1566,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> >  	num_commits = ctx->commits.nr;
> >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> > 
> > -	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> > +	if (ctx->split_opts &&
> > +	    ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> >  		while (g && (g->num_commits <= size_mult * num_commits ||
> >  			    (max_commits && num_commits > max_commits) ||
> >  			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > -- snap --
> 
> Yeah, that's what I noted below, but I'm not sure that this is the
> right solution.  Why doesn't cmd_fetch() call
> write_commit_graph_reachable() with a real 'split_opts' parameter in
> the first place?  Wouldn't it be better if it did?

It used to provide a "blank" split_opts until 63020f175f (commit-graph:
prefer default size_mult when given zero, 2020-01-02), which caused a
bug. That bug has since been fixed, but the idea was to keep things
convenient for callers.

Whether that's a good idea or not I guess is up for debate, but it
certainly is what the commit-graph code has tried to provide for some
time. If we're not going to follow that in this new code, then we should
presumably also rip out all of the other "if (ctx->split_opts)" lines.

-Peff


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31 14:19   ` Derrick Stolee
@ 2020-02-04  3:47     ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  3:47 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Taylor Blau, git, peff, dstolee, gitster

On Fri, Jan 31, 2020 at 09:19:19AM -0500, Derrick Stolee wrote:
> On 1/30/2020 7:28 PM, Taylor Blau wrote:
> > With '--split', the commit-graph machinery writes new commits in another
> > incremental commit-graph which is part of the existing chain, and
> > optionally decides to condense the chain into a single commit-graph.
> > This is done to ensure that the aysmptotic behavior of looking up a
> > commit in an incremental chain is dominated by the number of
> > incrementals in that chain. It can be controlled by the '--max-commits'
> > and '--size-multiple' options.
> >
> > On occasion, callers may want to ensure that 'git commit-graph write
> > --split' always writes an incremental, and never spends effort
> > condensing the incremental chain [1]. Previously, this was possible by
> > passing '--size-multiple=0', but this no longer the case following
> > 63020f175f (commit-graph: prefer default size_mult when given zero,
> > 2020-01-02).
> >
> > Reintroduce a less-magical variant of the above with a new pair of
> > arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
> > '--split=no-merge' is given, the commit-graph machinery will never
> > condense an existing chain and will always write a new incremental.
> > Conversely, if '--split=merge-all' is given, any invocation including it
> > will always condense a chain if one exists.  If '--split' is given with
> > no arguments, it behaves as before and defers to '--size-multiple', and
> > so on.
> >
> > [1]: This might occur when, for example, a server administrator running
> > some program after each push may want to ensure that each job runs
> > proportional in time to the size of the push, and does not "jump" when
> > the commit-graph machinery decides to trigger a merge.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> >  Documentation/git-commit-graph.txt | 18 +++++++++++-----
> >  builtin/commit-graph.c             | 33 ++++++++++++++++++++++++++----
> >  commit-graph.c                     | 19 +++++++++--------
> >  commit-graph.h                     |  7 +++++++
> >  t/t5324-split-commit-graph.sh      | 25 ++++++++++++++++++++++
> >  5 files changed, 85 insertions(+), 17 deletions(-)
> >
> > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> > index 28d1fee505..8d61ba9f56 100644
> > --- a/Documentation/git-commit-graph.txt
> > +++ b/Documentation/git-commit-graph.txt
> > @@ -57,11 +57,19 @@ or `--stdin-packs`.)
> >  With the `--append` option, include all commits that are present in the
> >  existing commit-graph file.
> >  +
> > -With the `--split` option, write the commit-graph as a chain of multiple
> > -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> > -not already in the commit-graph are added in a new "tip" file. This file
> > -is merged with the existing file if the following merge conditions are
> > -met:
> > +With the `--split[=<strategy>]` option, write the commit-graph as a
> > +chain of multiple commit-graph files stored in
> > +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> > +strategy and other splitting options. The new commits not already in the
> > +commit-graph are added in a new "tip" file. This file is merged with the
> > +existing file if the following merge conditions are met:
> > +* If `--split=merge-always` is specified, then a merge is always
> > +conducted, and the remaining options are ignored. Conversely, if
> > +`--split=no-merge` is specified, a merge is never performed, and the
> > +remaining options are ignored. A bare `--split` defers to the remaining
> > +options. (Note that merging a chain of commit graphs replaces the
> > +existing chain with a length-1 chain where the first and only
> > +incremental holds the entire graph).
> >  +
> >  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
> >  tip file would have `N` commits and the previous tip has `M` commits and
> > diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> > index de321c71ad..f03b46d627 100644
> > --- a/builtin/commit-graph.c
> > +++ b/builtin/commit-graph.c
> > @@ -9,7 +9,9 @@
> >
> >  static char const * const builtin_commit_graph_usage[] = {
> >  	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
> > -	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
> > +	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> > +	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> > +	   "[--[no-]progress] <split options>"),
> >  	NULL
> >  };
> >
> > @@ -19,7 +21,9 @@ static const char * const builtin_commit_graph_verify_usage[] = {
> >  };
> >
> >  static const char * const builtin_commit_graph_write_usage[] = {
> > -	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
> > +	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> > +	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> > +	   "[--[no-]progress] <split options>"),
> >  	NULL
> >  };
> >
> > @@ -101,6 +105,25 @@ static int graph_verify(int argc, const char **argv)
> >  extern int read_replace_refs;
> >  static struct split_commit_graph_opts split_opts;
> >
> > +static int write_option_parse_split(const struct option *opt, const char *arg,
> > +				    int unset)
> > +{
> > +	enum commit_graph_split_flags *flags = opt->value;
> > +
> > +	opts.split = 1;
> > +	if (!arg)
> > +		return 0;
>
> This allows `--split` to continue working as-is. But should we also
> set "*flags = COMMIT_GRAPH_SPLIT_UNSPECIFIED" here? This allows one
> to run "git commit-graph write --split=no-merge --split" (which could
> happen if "--split=no-merge" is inside an alias).

Yeah, this is an oversight on my part. I think that we should set the
split option to 'COMMIT_GRAPH_SPLIT_UNSPECIFIED' when '--split' is
given, for exactly the reason you outlined above. Thanks for the
suggestion!
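
To make the intended behavior concrete, here is a minimal sketch of the
callback logic, assuming a simplified enum and a standalone helper
rather than the real parse-options callback. The names
'parse_split_arg' and 'enum split_flags' are hypothetical stand-ins for
illustration only; the strategy strings are the ones from this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'enum commit_graph_split_flags'. */
enum split_flags {
	SPLIT_UNSPECIFIED = 0,
	SPLIT_MERGE_PROHIBITED,
	SPLIT_MERGE_REQUIRED
};

/*
 * Hypothetical helper: 'arg' is the text after '--split=', or NULL for
 * a bare '--split'. Returns 0 on success, -1 on an unknown strategy.
 */
static int parse_split_arg(enum split_flags *flags, const char *arg)
{
	if (!arg) {
		/*
		 * A bare '--split' resets any earlier strategy, so that
		 * '--split=no-merge --split' ends up deferring to the
		 * other splitting options again.
		 */
		*flags = SPLIT_UNSPECIFIED;
		return 0;
	}
	if (!strcmp(arg, "merge-all"))
		*flags = SPLIT_MERGE_REQUIRED;
	else if (!strcmp(arg, "no-merge"))
		*flags = SPLIT_MERGE_PROHIBITED;
	else
		return -1;	/* unknown strategy; '*flags' is left untouched */
	return 0;
}
```

With this shape, the last '--split' on the command line wins, which is
the usual expectation when one of them comes from an alias.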

> > +test_expect_success '--split=merge-all always merges incrementals' '
> > +	test_when_finished rm -rf a b c &&
> > +	rm -rf $graphdir $infodir/commit-graph &&
> > +	git reset --hard commits/10 &&
> > +	git rev-list -3 HEAD~4 >a &&
> > +	git rev-list -2 HEAD~2 >b &&
> > +	git rev-list -2 HEAD >c &&
> > +	git commit-graph write --split=no-merge --stdin-commits <a &&
> > +	git commit-graph write --split=no-merge --stdin-commits <b &&
> > +	test_line_count = 2 $graphdir/commit-graph-chain &&
> > +	git commit-graph write --split=merge-all --stdin-commits <c &&
> > +	test_line_count = 1 $graphdir/commit-graph-chain
> > +'
> > +
> > +test_expect_success '--split=no-merge always writes an incremental' '
> > +	test_when_finished rm -rf a b &&
> > +	rm -rf $graphdir &&
> > +	git reset --hard commits/2 &&
> > +	git rev-list HEAD~1 >a &&
> > +	git rev-list HEAD >b &&
> > +	git commit-graph write --split --stdin-commits <a &&
> > +	git commit-graph write --split=no-merge --stdin-commits <b &&
> > +	test_line_count = 2 $graphdir/commit-graph-chain
> > +'
> > +
> >  test_done
>
> Good tests!

Thanks :-).

> Thanks,
> -Stolee

Thanks,
Taylor


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-03 11:11         ` Jeff King
@ 2020-02-04  3:58           ` Taylor Blau
  2020-02-04 14:14             ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  3:58 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Johannes Schindelin, Taylor Blau, git,
	dstolee, gitster

On Mon, Feb 03, 2020 at 06:11:17AM -0500, Jeff King wrote:
> On Mon, Feb 03, 2020 at 11:47:30AM +0100, SZEDER Gábor wrote:
>
> > > -- snip --
> > > diff --git a/commit-graph.c b/commit-graph.c
> > > index a5d7624073f..af5c58833cf 100644
> > > --- a/commit-graph.c
> > > +++ b/commit-graph.c
> > > @@ -1566,7 +1566,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> > >  	num_commits = ctx->commits.nr;
> > >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> > >
> > > -	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> > > +	if (ctx->split_opts &&
> > > +	    ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> > >  		while (g && (g->num_commits <= size_mult * num_commits ||
> > >  			    (max_commits && num_commits > max_commits) ||
> > >  			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > > -- snap --
> >
> > Yeah, that's what I noted below, but I'm not sure that this is the
> > right solution.  Why doesn't cmd_fetch() call
> > write_commit_graph_reachable() with a real 'split_opts' parameter in
> > the first place?  Wouldn't it be better if it did?
>
> It used to provide a "blank" split_opts until 63020f175f (commit-graph:
> prefer default size_mult when given zero, 2020-01-02), which caused a
> bug. That bug was since fixed, but the idea was to keep things
> convenient for callers.
>
> Whether that's a good idea or not I guess is up for debate, but it
> certainly is what the commit-graph code has tried to provide for some
> time. If we're not going to follow that in this new code, then we should
> presumably also rip out all of the other "if (ctx->split_opts)" lines.

I think that this is a good step that we should probably take, but I
don't think that it's necessary for the series at hand. The pattern in
this function is to define a local variable which has the same value as
in split_opts, or a reasonable default if split_opts is NULL (cf.
'max_commits' and 'size_mult').

So, I think that a safe thing to do which prevents the segv and doesn't
change the pattern too much is to write:

  enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
  if (ctx->split_opts) {
    /* ... */
    flags = ctx->split_opts->flags;
  }

  /* ... */

  if (flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
    while ( ... ) { ... }
  }

This is adding another local variable, which seems like an odd thing to
do *every* time that we add another member to split_opts. So for that
reason it seems like in the longer term we should either force the
caller to pass in a blank split_opts, or do something else that doesn't
require this, but I think that the intermediate cost isn't too bad.
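
For anyone who wants to experiment with the pattern outside of git.git,
here is a self-contained sketch under simplifying assumptions. The
types and the function 'layers_after_write' are hypothetical stand-ins
(not the real commit-graph structs), the 'g->odb != ctx->odb' check is
left out, and the default size multiple of 2 is taken from the
documentation quoted earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the types discussed in this thread. */
enum split_flags {
	SPLIT_MERGE_AUTO = 0,
	SPLIT_MERGE_REQUIRED,
	SPLIT_MERGE_PROHIBITED
};

struct split_opts {
	int size_multiple;
	int max_commits;
	enum split_flags flags;
};

struct layer {
	int num_commits;
	struct layer *base;	/* next-older layer in the chain, or NULL */
};

/*
 * Return how many layers the chain would have after writing
 * 'new_commits' on top of 'tip'. 'opts' may be NULL, in which case
 * every setting falls back to its local default.
 */
static int layers_after_write(const struct layer *tip, int layers_before,
			      int new_commits, const struct split_opts *opts)
{
	int size_mult = 2;
	int max_commits = 0;
	enum split_flags flags = SPLIT_MERGE_AUTO;
	int num_commits = new_commits;
	int layers_after = layers_before + 1;
	const struct layer *g = tip;

	if (opts) {
		if (opts->size_multiple)
			size_mult = opts->size_multiple;
		max_commits = opts->max_commits;
		flags = opts->flags;
	}

	/* 'opts' is never dereferenced below this point. */
	if (flags != SPLIT_MERGE_PROHIBITED) {
		while (g && (g->num_commits <= size_mult * num_commits ||
			     (max_commits && num_commits > max_commits) ||
			     flags == SPLIT_MERGE_REQUIRED)) {
			num_commits += g->num_commits;
			g = g->base;
			layers_after--;
		}
	}
	return layers_after;
}
```

Given a two-layer chain with a 100-commit base and a 10-commit tip,
writing 3 new commits with a NULL 'opts' leaves three layers, while the
same write with SPLIT_MERGE_REQUIRED collapses the chain to one. The
NULL case no longer crashes because only the local copies are read
inside the loop.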

> -Peff

Thanks,
Taylor


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31 23:34   ` SZEDER Gábor
  2020-02-01 21:25     ` Johannes Schindelin
@ 2020-02-04  3:59     ` Taylor Blau
  1 sibling, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  3:59 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster

On Sat, Feb 01, 2020 at 12:34:34AM +0100, SZEDER Gábor wrote:
> On Thu, Jan 30, 2020 at 04:28:17PM -0800, Taylor Blau wrote:
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 6d34829f57..02e6ad9d1f 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> >  	num_commits = ctx->commits.nr;
> >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> >
> > -	while (g && (g->num_commits <= size_mult * num_commits ||
> > -		    (max_commits && num_commits > max_commits))) {
> > -		if (g->odb != ctx->odb)
> > -			break;
> > +	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
>
> This line segfaults in the tests 'fetch.writeCommitGraph' and
> 'fetch.writeCommitGraph with submodules' in 't5510-fetch.sh', because
> 'git fetch' doesn't pass a 'split_opts' to the commit-graph functions.

Thanks for pointing it out. I responded in more detail lower in the
thread with the fix, but I appreciate your testing each patch. Clearly,
I just should have run the full suite myself before sending!

> Thread 1 "git" received signal SIGSEGV, Segmentation fault.
> 0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
>     at commit-graph.c:1568
> 1568            if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> (gdb) p *ctx
> $1 = {r = 0x9ae2a0 <the_repo>, odb = 0x9c0df0, graph_name = 0x0, oids = {
>     list = 0xa02660, nr = 12, alloc = 1024}, commits = {list = 0x9caa00,
>     nr = 6, alloc = 6}, num_extra_edges = 0, approx_nr_objects = 0,
>   progress = 0x0, progress_done = 0, progress_cnt = 0, base_graph_name = 0x0,
>   num_commit_graphs_before = 0, num_commit_graphs_after = 1,
>   commit_graph_filenames_before = 0x0, commit_graph_filenames_after = 0x0,
>   commit_graph_hash_after = 0x0, new_num_commits_in_base = 0,
>   new_base_graph = 0x0, append = 0, report_progress = 1, split = 1,
>   check_oids = 0, split_opts = 0x0}
>                   ^^^^^^^^^^^^^^^^
> (gdb) bt
> #0  0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
>     at commit-graph.c:1568
> #1  0x0000000000512446 in write_commit_graph (odb=0x9c0df0, pack_indexes=0x0,
>     commit_hex=0x7fffffffd550,
>     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
>     split_opts=0x0) at commit-graph.c:1891
> #2  0x000000000050fd86 in write_commit_graph_reachable (odb=0x9c0df0,
>     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
>     split_opts=0x0) at commit-graph.c:1174
>     ^^^^^^^^^^^^^^
> #3  0x0000000000444ea4 in cmd_fetch (argc=0, argv=0x7fffffffd9b8, prefix=0x0)
>     at builtin/fetch.c:1873
> #4  0x0000000000406154 in run_builtin (p=0x969a88 <commands+840>, argc=2,
>     argv=0x7fffffffd9b0) at git.c:444
> #5  0x00000000004064a4 in handle_builtin (argc=2, argv=0x7fffffffd9b0)
>     at git.c:674
> #6  0x000000000040674c in run_argv (argcp=0x7fffffffd84c, argv=0x7fffffffd840)
>     at git.c:741
> #7  0x0000000000406bbd in cmd_main (argc=2, argv=0x7fffffffd9b0) at git.c:872
> #8  0x00000000004cfd4e in main (argc=5, argv=0x7fffffffd998)
>     at common-main.c:52
>
> Note that this function split_graph_merge_strategy() does look at
> various fields in 'ctx->split_opts' a bit earlier, but those accesses
> are protected by a 'if (ctx->split_opts)' condition.
> expire_commit_graphs() does the same.
>
>
> > +		while (g && (g->num_commits <= size_mult * num_commits ||
> > +			    (max_commits && num_commits > max_commits) ||
> > +			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > +			if (g->odb != ctx->odb)
> > +				break;
> >
> > -		num_commits += g->num_commits;
> > -		g = g->base_graph;
> > +			num_commits += g->num_commits;
> > +			g = g->base_graph;
> >
> > -		ctx->num_commit_graphs_after--;
> > +			ctx->num_commit_graphs_after--;
> > +		}
> >  	}
> >

Thanks,
Taylor


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-01 21:25     ` Johannes Schindelin
  2020-02-03 10:47       ` SZEDER Gábor
@ 2020-02-04  3:59       ` Taylor Blau
  1 sibling, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  3:59 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: SZEDER Gábor, Taylor Blau, git, peff, dstolee, gitster

On Sat, Feb 01, 2020 at 10:25:36PM +0100, Johannes Schindelin wrote:
> Hi,
>
> On Sat, 1 Feb 2020, SZEDER Gábor wrote:
>
> > On Thu, Jan 30, 2020 at 04:28:17PM -0800, Taylor Blau wrote:
> > > diff --git a/commit-graph.c b/commit-graph.c
> > > index 6d34829f57..02e6ad9d1f 100644
> > > --- a/commit-graph.c
> > > +++ b/commit-graph.c
> > > @@ -1565,15 +1565,18 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
> > >  	num_commits = ctx->commits.nr;
> > >  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
> > >
> > > -	while (g && (g->num_commits <= size_mult * num_commits ||
> > > -		    (max_commits && num_commits > max_commits))) {
> > > -		if (g->odb != ctx->odb)
> > > -			break;
> > > +	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> >
> > This line segfaults in the tests 'fetch.writeCommitGraph' and
> > 'fetch.writeCommitGraph with submodules' in 't5510-fetch.sh', because
> > 'git fetch' doesn't pass a 'split_opts' to the commit-graph functions.
>
> I noticed the same. This patch seems to fix it for me:
>
> -- snip --
> diff --git a/commit-graph.c b/commit-graph.c
> index a5d7624073f..af5c58833cf 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1566,7 +1566,8 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  	num_commits = ctx->commits.nr;
>  	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
>
> -	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> +	if (ctx->split_opts &&
> +	    ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
>  		while (g && (g->num_commits <= size_mult * num_commits ||
>  			    (max_commits && num_commits > max_commits) ||
>  			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> -- snap --
>
> For your convenience, I also pushed this up as
> `tb/commit-graph-split-merge` to https://github.com/dscho/git

Thanks, Dscho. I also published it under the same name on my fork at
'https://github.com/ttaylorr/git'.

> Thanks,
> Dscho
>
>
> >
> > Thread 1 "git" received signal SIGSEGV, Segmentation fault.
> > 0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
> >     at commit-graph.c:1568
> > 1568            if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
> > (gdb) p *ctx
> > $1 = {r = 0x9ae2a0 <the_repo>, odb = 0x9c0df0, graph_name = 0x0, oids = {
> >     list = 0xa02660, nr = 12, alloc = 1024}, commits = {list = 0x9caa00,
> >     nr = 6, alloc = 6}, num_extra_edges = 0, approx_nr_objects = 0,
> >   progress = 0x0, progress_done = 0, progress_cnt = 0, base_graph_name = 0x0,
> >   num_commit_graphs_before = 0, num_commit_graphs_after = 1,
> >   commit_graph_filenames_before = 0x0, commit_graph_filenames_after = 0x0,
> >   commit_graph_hash_after = 0x0, new_num_commits_in_base = 0,
> >   new_base_graph = 0x0, append = 0, report_progress = 1, split = 1,
> >   check_oids = 0, split_opts = 0x0}
> >                   ^^^^^^^^^^^^^^^^
> > (gdb) bt
> > #0  0x00000000005113dd in split_graph_merge_strategy (ctx=0x9ca930)
> >     at commit-graph.c:1568
> > #1  0x0000000000512446 in write_commit_graph (odb=0x9c0df0, pack_indexes=0x0,
> >     commit_hex=0x7fffffffd550,
> >     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
> >     split_opts=0x0) at commit-graph.c:1891
> > #2  0x000000000050fd86 in write_commit_graph_reachable (odb=0x9c0df0,
> >     flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_SPLIT),
> >     split_opts=0x0) at commit-graph.c:1174
> >     ^^^^^^^^^^^^^^
> > #3  0x0000000000444ea4 in cmd_fetch (argc=0, argv=0x7fffffffd9b8, prefix=0x0)
> >     at builtin/fetch.c:1873
> > #4  0x0000000000406154 in run_builtin (p=0x969a88 <commands+840>, argc=2,
> >     argv=0x7fffffffd9b0) at git.c:444
> > #5  0x00000000004064a4 in handle_builtin (argc=2, argv=0x7fffffffd9b0)
> >     at git.c:674
> > #6  0x000000000040674c in run_argv (argcp=0x7fffffffd84c, argv=0x7fffffffd840)
> >     at git.c:741
> > #7  0x0000000000406bbd in cmd_main (argc=2, argv=0x7fffffffd9b0) at git.c:872
> > #8  0x00000000004cfd4e in main (argc=5, argv=0x7fffffffd998)
> >     at common-main.c:52
> >
> > Note that this function split_graph_merge_strategy() does look at
> > various fields in 'ctx->split_opts' a bit earlier, but those accesses
> > are protected by a 'if (ctx->split_opts)' condition.
> > expire_commit_graphs() does the same.
> >
> >
> > > +		while (g && (g->num_commits <= size_mult * num_commits ||
> > > +			    (max_commits && num_commits > max_commits) ||
> > > +			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
> > > +			if (g->odb != ctx->odb)
> > > +				break;
> > >
> > > -		num_commits += g->num_commits;
> > > -		g = g->base_graph;
> > > +			num_commits += g->num_commits;
> > > +			g = g->base_graph;
> > >
> > > -		ctx->num_commit_graphs_after--;
> > > +			ctx->num_commit_graphs_after--;
> > > +		}
> > >  	}
> > >
> >

Thanks,
Taylor


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-01-31 19:27   ` Martin Ågren
@ 2020-02-04  4:06     ` Taylor Blau
  2020-02-06 19:15       ` Martin Ågren
  0 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  4:06 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, Jan 31, 2020 at 08:27:27PM +0100, Martin Ågren wrote:
> On Fri, 31 Jan 2020 at 01:29, Taylor Blau <me@ttaylorr.com> wrote:
> > With '--split', the commit-graph machinery writes new commits in another
> > incremental commit-graph which is part of the existing chain, and
> > optionally decides to condense the chain into a single commit-graph.
> > This is done to ensure that the aysmptotic behavior of looking up a
>
> asymptotic

Thanks, fixed.

> > commit in an incremental chain is dominated by the number of
> > incrementals in that chain. It can be controlled by the '--max-commits'
> > and '--size-multiple' options.
> >
> > On occasion, callers may want to ensure that 'git commit-graph write
> > --split' always writes an incremental, and never spends effort
> > condensing the incremental chain [1]. Previously, this was possible by
> > passing '--size-multiple=0', but this no longer the case following
> > 63020f175f (commit-graph: prefer default size_mult when given zero,
> > 2020-01-02).
> >
> > Reintroduce a less-magical variant of the above with a new pair of
> > arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
> > '--split=no-merge' is given, the commit-graph machinery will never
> > condense an existing chain and will always write a new incremental.
> > Conversely, if '--split=merge-all' is given, any invocation including it
> > will always condense a chain if one exists.  If '--split' is given with
> > no arguments, it behaves as before and defers to '--size-multiple', and
> > so on.
>
> I understand your motivation for doing this -- it all seems quite sound
> to me. Not being too familiar with this commit-graph splitting and
> merging, I had a hard time groking this terminology though. From what I
> understand, before this patch, `--split` means "write the commit-graph
> using the 'split' file-format / in a split fashion". Ok, that makes
> sense.
>
> From seeing `--split=no-merge`, I have no idea how to even parse that.
> Of course I don't want to merge, I want to split! Well not split, but
> write split files.
>
> I think it would help me (and others like me) if we could somehow
> separate "I want to use 'split' files" from "and here's how I want you
> to decide on the merging". That is, which "strategy" to use. Obviously,
> talking about a "merge strategy" would be stupid and "split strategy"
> also seems a bit odd. "Coalescing strategy"? "Joining strategy"?
>
> Or can you convince me otherwise? From which angle should I look at
> this?

Heh. This is all very reminiscent of an off-list discussion that I had
with Peff and Stolee before sending this upstream. Originally, I had
implemented this as:

  $ git commit-graph write --split --[no-]merge

but we decided that this '--merge' and '--no-merge' requiring '--split'
seemed to indicate that this was better off as an argument to '--split'.
Of course, there's no getting around that it is... odd to say
'--split=no-merge' for exactly the reason you suggest.

Here's another way of looking at it: the presence of '--split' means
"work with split graph files" and the '[no-]merge' argument means:
"always/never condense multiple layers".

For me, this not only makes the new option language jibe, but makes it
clearer to me than the combination of '--split', '--split --no-merge'
and '--split --merge', where the third one is truly bizarre. At least
condensing the second '--' and making 'merge' an argument to 'split'
makes it clear that the two work together somehow.

If you have a different suggestion, I'd certainly love to hear about it
and discuss. But, at least as far as our internal discussions have gone,
this is by far the best option that we have been able to come up with.

> > -With the `--split` option, write the commit-graph as a chain of multiple
> > -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> > -not already in the commit-graph are added in a new "tip" file. This file
> > -is merged with the existing file if the following merge conditions are
> > -met:
> > +With the `--split[=<strategy>]` option, write the commit-graph as a
> > +chain of multiple commit-graph files stored in
> > +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> > +strategy and other splitting options. The new commits not already in the
> > +commit-graph are added in a new "tip" file. This file is merged with the
> > +existing file if the following merge conditions are met:
> > +* If `--split=merge-all` is specified, then a merge is always
> > +conducted, and the remaining options are ignored. Conversely, if
> > +`--split=no-merge` is specified, a merge is never performed, and the
> > +remaining options are ignored. A bare `--split` defers to the remaining
> > +options. (Note that merging a chain of commit graphs replaces the
> > +existing chain with a length-1 chain where the first and only
> > +incremental holds the entire graph).
>
> To better understand the background for this patch, I read the manpage
> as it stands today. From the section on `--split`, I got this
> impression: Let's say that `--max-commits` is huge, so all that matters
> is the `--size-multiple`. Let's say it's two. If the current tip
> contains three commits and we're about to write one with two, then 2*2 >
> 3 so we will merge, i.e., write a tip file with five commits. Unless of
> course *that* is more than half the size of the file before. And so on.
> We might just merge once, or maybe "many" files in an avalanche effect.
> Every now and then, such avalanches will go all the way back to the
> first file.
>
> Now this says something different, namely that once we decide to merge,
> we do it all the way back, no matter what.
>
> The commit message of 1771be90c8 ("commit-graph: merge commit-graph
> chains", 2019-06-18) seems to support my original understanding, at
> least for `--size-multiple`, but not `--max-commits`, curiously enough.
>
> Can you clarify?

1771be90c8 is right, and this documentation is wrong. Upon re-reading
it, I found the contents of this documentation between those two
parentheses to be confusing rather than helpful. For that reason, I
simply removed it.
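
For what it's worth, here is a minimal standalone sketch of the cascading
behavior described in 1771be90c8, as I understand it. The helper name
'layers_after_write' and the array representation of the chain are
hypothetical; the loop condition follows the one in this patch. It ignores
the 'g->odb != ctx->odb' check and simply sums commit counts when a layer
is absorbed, so treat it as an illustration rather than the real code:

```c
#include <stddef.h>

/* Mirrors the names in the patch; re-declared here only for illustration. */
enum commit_graph_split_flags {
	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
};

/*
 * Hypothetical helper: 'layers' holds the commit count of each existing
 * layer, base first and tip last. Returns how many layers the chain has
 * after writing 'num_commits' new commits.
 */
static size_t layers_after_write(const size_t *layers, size_t nr,
				 size_t num_commits, size_t size_mult,
				 size_t max_commits,
				 enum commit_graph_split_flags flags)
{
	size_t after = nr + 1;	/* start with a new tip, then absorb layers */
	size_t i = nr;

	if (flags == COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED)
		return after;

	while (i > 0 &&
	       (layers[i - 1] <= size_mult * num_commits ||
		(max_commits && num_commits > max_commits) ||
		flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED)) {
		num_commits += layers[i - 1];	/* the new tip absorbs this layer */
		after--;
		i--;
	}
	return after;
}
```

With an existing chain of 20 and 3 commits, writing 2 new commits with a
size multiple of 2 absorbs only the old tip (3 <= 2*2) and leaves two
layers. With a chain of 8 and 3 commits, the merge cascades (8 <= 2*5) and
leaves a single layer.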

> > -               OPT_BOOL(0, "split", &opts.split,
> > -                       N_("allow writing an incremental commit-graph file")),
> > +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> > +                       N_("allow writing an incremental commit-graph file"),
>
> This still sounds very boolean. Cramming in the "strategy" might be hard
> -- is this an argument in favor of having two separate options? ;-)

Heh. Exactly how we started these patches when I originally wrote
them...

> > +enum commit_graph_split_flags {
> > +       COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
> > +       COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
> > +       COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
> > +};
>
> I wonder if this should be "MERGE_AUTO" rather than "UNSPECIFIED". This
> is related to Stolee's comment, I think.

I think you're right. I changed it in my local v2.
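
As a rough sketch of how the strategy argument would map onto the renamed
enum (the function name 'parse_split_strategy' and its return convention
are hypothetical; the real code is a parse-options callback that would
die() on an unknown value):

```c
#include <string.h>

/* Mirrors the enum from the patch, with the MERGE_AUTO rename applied. */
enum commit_graph_split_flags {
	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
};

/*
 * Hypothetical stand-in for the callback: 'arg' is NULL for a bare
 * '--split'. Returns 0 on success, -1 on an unrecognized strategy.
 */
static int parse_split_strategy(const char *arg,
				enum commit_graph_split_flags *flags)
{
	if (!arg) {
		/* bare '--split': defer to --size-multiple and friends */
		*flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
		return 0;
	}

	if (!strcmp(arg, "merge-all"))
		*flags = COMMIT_GRAPH_SPLIT_MERGE_REQUIRED;
	else if (!strcmp(arg, "no-merge"))
		*flags = COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED;
	else
		return -1;
	return 0;
}
```
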
>
> Martin

Thanks,
Taylor


* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-01-31 14:40   ` Derrick Stolee
@ 2020-02-04  4:21     ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  4:21 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Taylor Blau, git, peff, dstolee, gitster

On Fri, Jan 31, 2020 at 09:40:23AM -0500, Derrick Stolee wrote:
> On 1/30/2020 7:28 PM, Taylor Blau wrote:
> > The 'write' mode of the 'commit-graph' supports input from a number of
> > different sources: pack indexes over stdin, commits over stdin, commits
> > reachable from all references, and so on. Each of these options is
> > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
> >
> > Similar to our replacement of 'git config [--<type>]' with 'git config
> > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> > deprecate '[--<input>]' in favor of '[--input=<source>]'.
> >
> > This makes it more clear to implement new options that are combinations
> > of other options (such as, for example, "none", a combination of the old
> > "--append" and a new sentinel to specify to _not_ look in other packs,
> > which we will implement in a future patch).
> >
> > Unfortunately, the new enumerated type is a bitfield, even though it
> > makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> > options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> > compatible with '--append'. For this reason, use a bitfield.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> >  Documentation/git-commit-graph.txt | 26 +++++-----
> >  builtin/commit-graph.c             | 77 ++++++++++++++++++++++--------
> >  t/t5318-commit-graph.sh            | 46 +++++++++---------
> >  t/t5324-split-commit-graph.sh      | 44 ++++++++---------
> >  4 files changed, 114 insertions(+), 79 deletions(-)
> >
> > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> > index 8d61ba9f56..cbf80226e9 100644
> > --- a/Documentation/git-commit-graph.txt
> > +++ b/Documentation/git-commit-graph.txt
> > @@ -41,21 +41,21 @@ COMMANDS
> >
> >  Write a commit-graph file based on the commits found in packfiles.
> >  +
> > -With the `--stdin-packs` option, generate the new commit graph by
> > +With the `--input=stdin-packs` option, generate the new commit graph by
> >  walking objects only in the specified pack-indexes. (Cannot be combined
> > -with `--stdin-commits` or `--reachable`.)
> > +with `--input=stdin-commits` or `--input=reachable`.)
> >  +
> > -With the `--stdin-commits` option, generate the new commit graph by
> > -walking commits starting at the commits specified in stdin as a list
> > +With the `--input=stdin-commits` option, generate the new commit graph
> > +by walking commits starting at the commits specified in stdin as a list
> >  of OIDs in hex, one OID per line. (Cannot be combined with
> > -`--stdin-packs` or `--reachable`.)
> > +`--input=stdin-packs` or `--input=reachable`.)
> >  +
> > -With the `--reachable` option, generate the new commit graph by walking
> > -commits starting at all refs. (Cannot be combined with `--stdin-commits`
> > -or `--stdin-packs`.)
> > +With the `--input=reachable` option, generate the new commit graph by
> > +walking commits starting at all refs. (Cannot be combined with
> > +`--input=stdin-commits` or `--input=stdin-packs`.)
> >  +
> > -With the `--append` option, include all commits that are present in the
> > -existing commit-graph file.
> > +With the `--input=append` option, include all commits that are present
> > +in the existing commit-graph file.
> >  +
> >  With the `--split[=<strategy>]` option, write the commit-graph as a
> >  chain of multiple commit-graph files stored in
> > @@ -107,20 +107,20 @@ $ git commit-graph write
> >    using commits in `<pack-index>`.
> >  +
> >  ------------------------------------------------
> > -$ echo <pack-index> | git commit-graph write --stdin-packs
> > +$ echo <pack-index> | git commit-graph write --input=stdin-packs
> >  ------------------------------------------------
> >
> >  * Write a commit-graph file containing all reachable commits.
> >  +
> >  ------------------------------------------------
> > -$ git show-ref -s | git commit-graph write --stdin-commits
> > +$ git show-ref -s | git commit-graph write --input=stdin-commits
> >  ------------------------------------------------
> >
> >  * Write a commit-graph file containing all commits in the current
> >    commit-graph file along with those reachable from `HEAD`.
> >  +
> >  ------------------------------------------------
> > -$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
> > +$ git rev-parse HEAD | git commit-graph write --input=stdin-commits --input=append
> >  ------------------------------------------------
> >
> >
> > diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> > index f03b46d627..03d815e652 100644
> > --- a/builtin/commit-graph.c
> > +++ b/builtin/commit-graph.c
> > @@ -10,7 +10,7 @@
> >  static char const * const builtin_commit_graph_usage[] = {
> >  	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
> >  	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> > -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> > +	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
> >  	   "[--[no-]progress] <split options>"),
> >  	NULL
> >  };
> > @@ -22,22 +22,48 @@ static const char * const builtin_commit_graph_verify_usage[] = {
> >
> >  static const char * const builtin_commit_graph_write_usage[] = {
> >  	N_("git commit-graph write [--object-dir <objdir>] [--append] "
> > -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
> > +	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
> >  	   "[--[no-]progress] <split options>"),
> >  	NULL
> >  };
> >
> > +enum commit_graph_input {
> > +	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
> > +	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
> > +	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
> > +	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
> > +};
> > +
> >  static struct opts_commit_graph {
> >  	const char *obj_dir;
> > -	int reachable;
> > -	int stdin_packs;
> > -	int stdin_commits;
> > -	int append;
> > +	enum commit_graph_input input;
> >  	int split;
> >  	int shallow;
> >  	int progress;
> >  } opts;
> >
> > +static int option_parse_input(const struct option *opt, const char *arg,
> > +			      int unset)
> > +{
> > +	enum commit_graph_input *to = opt->value;
> > +	if (unset || !strcmp(arg, "packs")) {
> > +		*to = 0;
> > +		return 0;
> > +	}
>
> Here, you _do_ clear the bitfield, allowing "--input=reachable --input"
> to do the correct override. Thanks!
>
> > +
> > +	if (!strcmp(arg, "reachable"))
> > +		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
> > +	else if (!strcmp(arg, "stdin-packs"))
> > +		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
> > +	else if (!strcmp(arg, "stdin-commits"))
> > +		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
> > +	else if (!strcmp(arg, "append"))
> > +		*to |= COMMIT_GRAPH_INPUT_APPEND;
> > +	else
> > +		die(_("unrecognized --input source, %s"), arg);
> > +	return 0;
> > +}
> > +
> >  static struct object_directory *find_odb_or_die(struct repository *r,
> >  						const char *obj_dir)
> >  {
> > @@ -137,14 +163,21 @@ static int graph_write(int argc, const char **argv)
> >  		OPT_STRING(0, "object-dir", &opts.obj_dir,
> >  			N_("dir"),
> >  			N_("The object directory to store the graph")),
> > -		OPT_BOOL(0, "reachable", &opts.reachable,
> > -			N_("start walk at all refs")),
> > -		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
> > -			N_("scan pack-indexes listed by stdin for commits")),
> > -		OPT_BOOL(0, "stdin-commits", &opts.stdin_commits,
> > -			N_("start walk at commits listed by stdin")),
> > -		OPT_BOOL(0, "append", &opts.append,
> > -			N_("include all commits already in the commit-graph file")),
> > +		OPT_CALLBACK(0, "input", &opts.input, NULL,
> > +			N_("include commits from this source in the graph"),
> > +			option_parse_input),
> > +		OPT_BIT(0, "reachable", &opts.input,
> > +			N_("start walk at all refs"),
> > +			COMMIT_GRAPH_INPUT_REACHABLE),
> > +		OPT_BIT(0, "stdin-packs", &opts.input,
> > +			N_("scan pack-indexes listed by stdin for commits"),
> > +			COMMIT_GRAPH_INPUT_STDIN_PACKS),
> > +		OPT_BIT(0, "stdin-commits", &opts.input,
> > +			N_("start walk at commits listed by stdin"),
> > +			COMMIT_GRAPH_INPUT_STDIN_COMMITS),
> > +		OPT_BIT(0, "append", &opts.input,
> > +			N_("include all commits already in the commit-graph file"),
> > +			COMMIT_GRAPH_INPUT_APPEND),
>
> Since you are rewriting how we interpret the deprecated options, perhaps we
> should keep some tests around that call these versions? It would make the
> test diff be a bit smaller. These options can be removed from the tests if/when
> we actually remove the options.

That sounds good. I thought about doing this in the original round, but
I talked myself out of it because it wasn't clear to me which tests were
the ones worth converting and which should be left alone.

But, since you think it's good, so do I. I picked the ones to convert
mostly at random, and left the new ones as-is using the '--input=' form.
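
On the bitfield itself: a minimal sketch of how the "pairwise exclusive,
except '--append'" rule from the commit message could be checked once both
the old and new spellings land in the same field. The helper name
'input_sources_compatible' is hypothetical and is not part of the patch:

```c
/* Same values as the enum in the patch. */
enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
};

/*
 * Hypothetical check: returns 1 when at most one of the mutually
 * exclusive sources is set. COMMIT_GRAPH_INPUT_APPEND is masked out
 * first because it may accompany any other source.
 */
static int input_sources_compatible(unsigned int input)
{
	unsigned int exclusive =
		input & ~(unsigned int)COMMIT_GRAPH_INPUT_APPEND;

	/* x & (x - 1) clears the lowest set bit; zero means <= 1 bit set */
	return (exclusive & (exclusive - 1)) == 0;
}
```

So '--input=stdin-commits --input=append' passes, while
'--input=reachable --input=stdin-packs' does not.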

> > @@ -351,10 +351,10 @@ test_expect_success '--split=merge-all always merges incrementals' '
> >  	git rev-list -3 HEAD~4 >a &&
> >  	git rev-list -2 HEAD~2 >b &&
> >  	git rev-list -2 HEAD >c &&
> > -	git commit-graph write --split=no-merge --stdin-commits <a &&
> > -	git commit-graph write --split=no-merge --stdin-commits <b &&
> > +	git commit-graph write --split=no-merge --input=stdin-commits <a &&
> > +	git commit-graph write --split=no-merge --input=stdin-commits <b &&
> >  	test_line_count = 2 $graphdir/commit-graph-chain &&
> > -	git commit-graph write --split=merge-all --stdin-commits <c &&
> > +	git commit-graph write --split=merge-all --input=stdin-commits <c &&
> >  	test_line_count = 1 $graphdir/commit-graph-chain
> >  '
> >
> > @@ -364,8 +364,8 @@ test_expect_success '--split=no-merge always writes an incremental' '
> >  	git reset --hard commits/2 &&
> >  	git rev-list HEAD~1 >a &&
> >  	git rev-list HEAD >b &&
> > -	git commit-graph write --split --stdin-commits <a &&
> > -	git commit-graph write --split=no-merge --stdin-commits <b &&
> > +	git commit-graph write --split --input=stdin-commits <a &&
> > +	git commit-graph write --split=no-merge --input=stdin-commits <b &&
> >  	test_line_count = 2 $graphdir/commit-graph-chain
> >  '
>
> Updating these new tests with the given options is good. Perhaps convert only one
> of the old tests for each of the stdin-packs, reachable, "", and "append" options?

Yup, thanks.

> Thanks,
> -Stolee

Thanks,
Taylor


* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-01-31 19:34   ` Martin Ågren
@ 2020-02-04  4:51     ` Taylor Blau
  2020-02-13 11:33       ` SZEDER Gábor
  0 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  4:51 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:
> On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> > The 'write' mode of the 'commit-graph' supports input from a number of
> > different sources: pack indexes over stdin, commits over stdin, commits
> > reachable from all references, and so on. Each of these options is
> > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
> >
> > Similar to our replacement of 'git config [--<type>]' with 'git config
> > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> > deprecate '[--<input>]' in favor of '[--input=<source>]'.
> >
> > This makes it more clear to implement new options that are combinations
> > of other options (such as, for example, "none", a combination of the old
> > "--append" and a new sentinel to specify to _not_ look in other packs,
> > which we will implement in a future patch).
>
> Makes sense.
>
> > Unfortunately, the new enumerated type is a bitfield, even though it
> > makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> > options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> > compatible with '--append'. For this reason, use a bitfield.
>
> > -With the `--append` option, include all commits that are present in the
> > -existing commit-graph file.
> > +With the `--input=append` option, include all commits that are present
> > +in the existing commit-graph file.
>
> Would it be too crazy to call this `--input=existing` instead, and have
> it be the same as `--append`? I find that `--append` makes a lot of
> sense (it's a mode we can turn on or off), whereas "input = append"
> seems more odd.

Hmm. When I wrote this, I was thinking of introducing equivalent options
that are identical in name and functionality as '--input=<mode>' instead
of '--<mode>'. So, I guess that is to say that I didn't spend an awful
amount of time thinking about whether or not '--input=append' made sense
given anything else.

So, I don't think that '--input=existing' is a bad idea at all, but I do
worry about advertising this deprecation as "'--<mode>' becomes
'--input=<mode>', except when your mode is 'append', in which case it
becomes '--input=existing'".

I suppose that, on the other hand, if we *were* to introduce such a
change, now would be the time to do it, before '--input=<mode>' is on
master and tagged in a release, but I'm not sure that '--input=append'
is so much worse.

I'm inclined to leave it as is, unless there are others that feel
strongly, in which case we can/should come back to it before this moves
towards being queued.

> From the next commit message, we learn that a lone `--input=append`
> triggers `fill_oids_from_all_packs()`, which wouldn't match my
> expectation from "--input=existing". So...
>
> Does this hint that we could leave `--append` alone? We'd have lots of
> different inputs to choose from using `--input`, and an `--append` mode
> on top of that. That would make your inputs truly mutually exclusive and
> you don't need the bitfield anymore, as you mention above.  Hmm?
>
> Would that mean that the falling back to `fill_oids_from_all_packs()`
> would follow from "is there an --input?", as opposed to from "is there
> an --input except --input=append?"?
>
> (I don't know whether these inputs really *have* to be exclusive, or if
> that's more of an implementation detail. That is, even without an
> "append" input, might we one day be able to handle more inputs at once?
> Maybe this is not the time to worry about that.)
>
> Martin

Thanks,
Taylor


* Re: [PATCH 3/3] builtin/commit-graph.c: support '--input=none'
  2020-01-31 19:45   ` Martin Ågren
@ 2020-02-04  5:01     ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-04  5:01 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, Jan 31, 2020 at 08:45:59PM +0100, Martin Ågren wrote:
> On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> > In the previous commit, we introduced '--[no-]merge', and alluded to the
> > fact that '--merge' would be useful for callers who wish to always
> > trigger a merge of an incremental chain.
>
> Hmmm. So it looks like you've already had similar thoughts as I did
> about patch 1/3. At some point, you had a separate `--merge=...` option,
> then later made that `--split=...`. :-) Could you say something about why
> you changed your mind?

Heh :-). Leftovers from an earlier version of this series. I think that
I already talked about why this was changed further up in the thread.

> > There is a problem with the above approach, which is that there is no
> > way to specify to the commit-graph builtin that a caller only wants to
> > include commits already in the graph. One can specify '--input=append'
> > to include all commits in the existing graphs, but the absence of
> > '--input=stdin-{commits,packs}' causes the builtin to call
> > 'fill_oids_from_all_packs()'.
>
> (Use one of those options with an empty stdin? Anyway, let's read on.)
>
> > Passing '--input=reachable' (as in 'git commit-graph write
> > --split=merge-all --input=reachable --input=append') works around this
> > issue by making '--input=reachable' effectively a no-op, but this can be
> > prohibitively expensive in large repositories, making it an undesirable
> > choice for some users.
> >
> > Teach '--input=none' as an option to behave as if '--input=append' were
> > given, but to consider no other sources in addition.
>
> `--input=none` almost makes me wonder if it should produce an empty
> commit-graph. But there wouldn't be much point in that... I guess
> another way of defining this would be that it "uses no input, and
> implies `--append`".

I suppose, although (like you) I can't imagine why anybody would want to
do that.
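
To make "uses no input, and implies `--append`" concrete, here is a small
sketch of the two decisions involved. The value of
COMMIT_GRAPH_INPUT_NONE and both helper names are assumptions made for
illustration; only the behavior is taken from the commit message:

```c
enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
	COMMIT_GRAPH_INPUT_NONE          = (1 << 5) /* assumed value */
};

/*
 * Hypothetical: returns 1 when the writer would fall back to scanning
 * every packfile (i.e. fill_oids_from_all_packs()). A lone 'append'
 * does not count as a source, so it still triggers the fallback.
 */
static int falls_back_to_all_packs(unsigned int input)
{
	unsigned int sources = COMMIT_GRAPH_INPUT_REACHABLE |
			       COMMIT_GRAPH_INPUT_STDIN_PACKS |
			       COMMIT_GRAPH_INPUT_STDIN_COMMITS |
			       COMMIT_GRAPH_INPUT_NONE;

	return !(input & sources);
}

/* Hypothetical: returns 1 when commits already in the graph are kept. */
static int keeps_existing_commits(unsigned int input)
{
	return !!(input & (COMMIT_GRAPH_INPUT_APPEND |
			   COMMIT_GRAPH_INPUT_NONE));
}
```

In this model '--input=none' is the only value that both keeps the
existing commits and avoids walking anything else.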

> > This, in conjunction with the option introduced in the previous patch,
> > offers a convenient way to force the commit-graph machinery to
> > condense a chain of incrementals without requiring any new commits:
> >
> >   $ git commit-graph write --split=merge-all --input=none
>
> Right.
>
> > --- a/Documentation/git-commit-graph.txt
> > +++ b/Documentation/git-commit-graph.txt
> > @@ -39,24 +39,29 @@ COMMANDS
> >  --------
> >  'write'::
> >
> > -Write a commit-graph file based on the commits found in packfiles.
> > +Write a commit-graph file based on the commits specified:
> > +* With the `--input=stdin-packs` option, generate the new commit graph
> > +by walking objects only in the specified pack-indexes. (Cannot be
> > +combined with `--input=stdin-commits` or `--input=reachable`.)
> >  +
> > -With the `--input=stdin-packs` option, generate the new commit graph by
> > -walking objects only in the specified pack-indexes. (Cannot be combined
> > -with `--input=stdin-commits` or `--input=reachable`.)
> > -+
> > -With the `--input=stdin-commits` option, generate the new commit graph
> > +* With the `--input=stdin-commits` option, generate the new commit graph
> >  by walking commits starting at the commits specified in stdin as a list
> >  of OIDs in hex, one OID per line. (Cannot be combined with
> >  `--input=stdin-packs` or `--input=reachable`.)
> >  +
> > -With the `--input=reachable` option, generate the new commit graph by
> > +* With the `--input=reachable` option, generate the new commit graph by
> >  walking commits starting at all refs. (Cannot be combined with
> >  `--input=stdin-commits` or `--input=stdin-packs`.)
> >  +
> > -With the `--input=append` option, include all commits that are present
> > +* With the `--input=append` option, include all commits that are present
> >  in the existing commit-graph file.
>
> Do these changes above really belong in this commit?

I think so. My thought here was to leave this documentation as-is until
this patch, when adding '--input=none' would... somehow change this, but
trying to construct a reply, I can't seem to come up with why I thought
that this was a good idea in the first place ;-).

> > +* With the `--input=none` option, behave as if `--input=append` were
> > +given, but do not walk other packs to find additional commits.
> > +
> > +If none of the above options are given, then commits found in
> > +packfiles are specified.
>
> "specified"? Plus, that also happens for `--input=append` right? (It
> really seems like "append" is an odd one among all the inputs.)

I reworded this slightly to not use "specified", which I agree is indeed
weird.

>
> >         N_("git commit-graph write [--object-dir <objdir>] [--append] "
> > -          "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
> > +          "[--split[=<strategy>]] "
> > +          "[--input=<reachable|stdin-packs|stdin-commits|none>] "
> >            "[--[no-]progress] <split options>"),
>
> Hmm, you've left "--append" the old way.

Fixed, and thanks for noticing.
>
> Martin

Thanks,
Taylor


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-04  3:58           ` Taylor Blau
@ 2020-02-04 14:14             ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2020-02-04 14:14 UTC (permalink / raw)
  To: Taylor Blau; +Cc: SZEDER Gábor, Johannes Schindelin, git, dstolee, gitster

On Mon, Feb 03, 2020 at 07:58:21PM -0800, Taylor Blau wrote:

> I think that this seems like a good step that we should probably take,
> but I don't think that it's necessary for the series at hand. The
> pattern in this function is to define a local variable which has the
> same value as in split_opts, or a reasonable default if split_opts is
> NULL (c.f., 'max_commits' and 'size_mult').
> 
> So, I think that a safe thing to do which prevents the segv and doesn't
> change the pattern too much is to write:
> 
>   enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
>   if (ctx->split_opts) {
>     /* ... */
>     flags = ctx->split_opts->flags;
>   }
> 
>   /* ... */
> 
>   if (flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
>     while ( ... ) { ... }
>   }
> 
> This is adding another local variable, which seems like an odd thing to
> do *every* time that we add another member to split_opts. So for that
> reason it seems like in the longer-term we should either force the
> caller to pass in a blank, or do something else that doesn't require
> this, but I think that the intermediate cost isn't too bad.

It would perhaps be simpler to turn NULL into a _single_ default-filled
split_opts variable, and then just use it throughout the function. And
since presumably zero-initialization would yield good defaults (or we'd
define an INIT macro for the convenience of callers), it would be a
one-liner that we'd only have to do once.

But I think that can wait; the "if" solution discussed seems like a
straightforward way to make this patch correct both on top of master,
and when merged with next.
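
Roughly what I have in mind for the longer term, as an untested sketch.
The INIT macro and both helpers are hypothetical names, and the struct is
trimmed to the fields discussed in this thread:

```c
#include <stddef.h>

enum commit_graph_split_flags {
	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
};

/* Re-declared here for illustration; only the fields discussed above. */
struct split_commit_graph_opts {
	int size_multiple;
	int max_commits;
	enum commit_graph_split_flags flags;
};

/* Hypothetical INIT macro; zero-initialization means "use the defaults". */
#define SPLIT_COMMIT_GRAPH_OPTS_INIT { 0 }

static const struct split_commit_graph_opts default_split_opts =
	SPLIT_COMMIT_GRAPH_OPTS_INIT;

/* Hypothetical helper: never returns NULL, so later code needs no checks. */
static const struct split_commit_graph_opts *
split_opts_or_default(const struct split_commit_graph_opts *opts)
{
	return opts ? opts : &default_split_opts;
}

/* Effective size multiple: zero falls back to the default of 2. */
static int effective_size_mult(const struct split_commit_graph_opts *opts)
{
	opts = split_opts_or_default(opts);
	return opts->size_multiple ? opts->size_multiple : 2;
}
```

The NULL handling then happens once at the top of the function, and adding
a new member to split_opts does not require another local variable.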

-Peff


* Re: [PATCH 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (4 preceding siblings ...)
  2020-01-31 14:41 ` Derrick Stolee
@ 2020-02-04 23:44 ` Junio C Hamano
  2020-02-05  0:30   ` Taylor Blau
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
  7 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2020-02-04 23:44 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee

As the topic this depends on has been updated, I tried to rebase
this on top, but I am seeing segfaults in tests.  We'd probably need
a fresh round of this one to replace it.

Thanks.



* [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (5 preceding siblings ...)
  2020-02-04 23:44 ` Junio C Hamano
@ 2020-02-05  0:28 ` " Taylor Blau
  2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
                     ` (3 more replies)
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
  7 siblings, 4 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-05  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

Hi,

Here is an updated 'v2' of my series to introduce new splitting and
merging options to the 'commit-graph write' builtin.

These patches are updated to be based on the latest changes in the
series upon which this is based (tb/commit-graph-use-odb), and contain
some other fixes that I picked up during the last round of review. For
convenience, I included a range-diff against 'v1' below.

Thanks as always for your review.

Taylor

Taylor Blau (3):
  builtin/commit-graph.c: support '--split[=<strategy>]'
  builtin/commit-graph.c: introduce '--input=<source>'
  builtin/commit-graph.c: support '--input=none'

 Documentation/git-commit-graph.txt |  50 ++++++++-----
 builtin/commit-graph.c             | 115 +++++++++++++++++++++++------
 commit-graph.c                     |  28 ++++---
 commit-graph.h                     |  10 ++-
 t/t5318-commit-graph.sh            |   4 +-
 t/t5324-split-commit-graph.sh      |  53 ++++++++++++-
 6 files changed, 204 insertions(+), 56 deletions(-)

Range-diff against v1:
1:  470bbe3cef ! 1:  3e19d50148 builtin/commit-graph.c: support '--split[=<strategy>]'
    @@ Commit message
         With '--split', the commit-graph machinery writes new commits in another
         incremental commit-graph which is part of the existing chain, and
         optionally decides to condense the chain into a single commit-graph.
    -    This is done to ensure that the aysmptotic behavior of looking up a
    +    This is done to ensure that the asymptotic behavior of looking up a
         commit in an incremental chain is dominated by the number of
         incrementals in that chain. It can be controlled by the '--max-commits'
         and '--size-multiple' options.
    @@ Documentation/git-commit-graph.txt: or `--stdin-packs`.)
     +conducted, and the remaining options are ignored. Conversely, if
     +`--split=no-merge` is specified, a merge is never performed, and the
     +remaining options are ignored. A bare `--split` defers to the remaining
    -+options. (Note that merging a chain of commit graphs replaces the
    -+existing chain with a length-1 chain where the first and only
    -+incremental holds the entire graph).
    ++options.
      +
      * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
      tip file would have `N` commits and the previous tip has `M` commits and
    @@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv)
     +	enum commit_graph_split_flags *flags = opt->value;
     +
     +	opts.split = 1;
    -+	if (!arg)
    ++	if (!arg) {
    ++		*flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
     +		return 0;
    ++	}
     +
     +	if (!strcmp(arg, "merge-all"))
     +		*flags = COMMIT_GRAPH_SPLIT_MERGE_REQUIRED;
    @@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)

      ## commit-graph.c ##
     @@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
    +
    + 	int max_commits = 0;
    + 	int size_mult = 2;
    ++	enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
    +
    + 	if (ctx->split_opts) {
    + 		max_commits = ctx->split_opts->max_commits;
    +
    + 		if (ctx->split_opts->size_multiple)
    + 			size_mult = ctx->split_opts->size_multiple;
    ++
    ++		flags = ctx->split_opts->flags;
    + 	}
    +
    + 	g = ctx->r->objects->commit_graph;
      	num_commits = ctx->commits.nr;
      	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;

    @@ commit-graph.c: static void split_graph_merge_strategy(struct write_commit_graph
     -		    (max_commits && num_commits > max_commits))) {
     -		if (g->odb != ctx->odb)
     -			break;
    -+	if (ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
    ++	if (flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
     +		while (g && (g->num_commits <= size_mult * num_commits ||
     +			    (max_commits && num_commits > max_commits) ||
    -+			    (ctx->split_opts->flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
    ++			    (flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
     +			if (g->odb != ctx->odb)
     +				break;

    @@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
      	}

     -	if (!ctx->commits.nr)
    -+	if (!ctx->commits.nr && (!ctx->split || ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))
    ++	if (!ctx->commits.nr && (!ctx->split_opts || ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))
      		goto cleanup;

      	if (ctx->split) {
    @@ commit-graph.h: enum commit_graph_write_flags {
      };

     +enum commit_graph_split_flags {
    -+	COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
    ++	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
     +	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
     +	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
     +};
2:  1b585fe4e7 ! 2:  1589bc1d69 builtin/commit-graph.c: introduce '--input=<source>'
    @@ Documentation/git-commit-graph.txt: $ git commit-graph write

      ## builtin/commit-graph.c ##
     @@
    +
      static char const * const builtin_commit_graph_usage[] = {
      	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
    - 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
    +-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
     -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
    -+	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
    ++	N_("git commit-graph write [--object-dir <objdir>] "
    ++	   "[--split[=<strategy>]] "
    ++	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
      	   "[--[no-]progress] <split options>"),
      	NULL
      };
     @@ builtin/commit-graph.c: static const char * const builtin_commit_graph_verify_usage[] = {
    + };

      static const char * const builtin_commit_graph_write_usage[] = {
    - 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
    +-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
     -	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
    -+	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
    ++	N_("git commit-graph write [--object-dir <objdir>] "
    ++	   "[--split[=<strategy>]] "
    ++	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
      	   "[--[no-]progress] <split options>"),
      	NULL
      };
    @@ builtin/commit-graph.c: static int graph_write(int argc, const char **argv)
      		}

      ## t/t5318-commit-graph.sh ##
    -@@ t/t5318-commit-graph.sh: test_expect_success 'write graph with no packs' '
    - 	test_path_is_missing $objdir/info/commit-graph
    - '
    -
    --test_expect_success 'exit with correct error on bad input to --stdin-packs' '
    -+test_expect_success 'exit with correct error on bad input to --input=stdin-packs' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    - 	echo doesnotexist >in &&
    --	test_expect_code 1 git commit-graph write --stdin-packs <in 2>stderr &&
    -+	test_expect_code 1 git commit-graph write --input=stdin-packs <in 2>stderr &&
    - 	test_i18ngrep "error adding pack" stderr
    - '
    -
    -@@ t/t5318-commit-graph.sh: test_expect_success 'create commits and repack' '
    - 	git repack
    - '
    -
    --test_expect_success 'exit with correct error on bad input to --stdin-commits' '
    -+test_expect_success 'exit with correct error on bad input to --input=stdin-commits' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    --	echo HEAD | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
    -+	echo HEAD | test_expect_code 1 git commit-graph write --input=stdin-commits 2>stderr &&
    - 	test_i18ngrep "invalid commit object id" stderr &&
    - 	# valid tree OID, but not a commit OID
    --	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
    -+	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --input=stdin-commits 2>stderr &&
    - 	test_i18ngrep "invalid commit object id" stderr
    - '
    -
     @@ t/t5318-commit-graph.sh: graph_git_behavior 'cleared graph, commit 8 vs merge 2' full commits/8 merge/2

      test_expect_success 'build graph from latest pack with closure' '
    @@ t/t5318-commit-graph.sh: test_expect_success 'build graph from commits with clos
      	test_path_is_file $objdir/info/commit-graph &&
      	graph_read_expect "6"
      '
    -@@ t/t5318-commit-graph.sh: graph_git_behavior 'graph from commits, commit 8 vs merge 2' full commits/8 merg
    -
    - test_expect_success 'build graph from commits with append' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    --	git rev-parse merge/3 | git commit-graph write --stdin-commits --append &&
    -+	git rev-parse merge/3 | git commit-graph write --input=stdin-commits --input=append &&
    - 	test_path_is_file $objdir/info/commit-graph &&
    - 	graph_read_expect "10" "extra_edges"
    - '
    -@@ t/t5318-commit-graph.sh: graph_git_behavior 'append graph, commit 8 vs merge 2' full commits/8 merge/2
    -
    - test_expect_success 'build graph using --reachable' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    --	git commit-graph write --reachable &&
    -+	git commit-graph write --input=reachable &&
    - 	test_path_is_file $objdir/info/commit-graph &&
    - 	graph_read_expect "11" "extra_edges"
    - '
    -@@ t/t5318-commit-graph.sh: test_expect_success 'perform fast-forward merge in full repo' '
    - test_expect_success 'check that gc computes commit-graph' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    - 	git commit --allow-empty -m "blank" &&
    --	git commit-graph write --reachable &&
    -+	git commit-graph write --input=reachable &&
    - 	cp $objdir/info/commit-graph commit-graph-before-gc &&
    - 	git reset --hard HEAD~1 &&
    - 	git config gc.writeCommitGraph true &&
    - 	git gc &&
    - 	cp $objdir/info/commit-graph commit-graph-after-gc &&
    - 	! test_cmp_bin commit-graph-before-gc commit-graph-after-gc &&
    --	git commit-graph write --reachable &&
    -+	git commit-graph write --input=reachable &&
    - 	test_cmp_bin commit-graph-after-gc $objdir/info/commit-graph
    - '
    -
    -@@ t/t5318-commit-graph.sh: test_expect_success 'replace-objects invalidates commit-graph' '
    - 	git clone full replace &&
    - 	(
    - 		cd replace &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_file .git/objects/info/commit-graph &&
    - 		git replace HEAD~1 HEAD~2 &&
    - 		git -c core.commitGraph=false log >expect &&
    - 		git -c core.commitGraph=true log >actual &&
    - 		test_cmp expect actual &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		git -c core.commitGraph=false --no-replace-objects log >expect &&
    - 		git -c core.commitGraph=true --no-replace-objects log >actual &&
    - 		test_cmp expect actual &&
    - 		rm -rf .git/objects/info/commit-graph &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_file .git/objects/info/commit-graph
    - 	)
    - '
    -@@ t/t5318-commit-graph.sh: test_expect_success 'commit grafts invalidate commit-graph' '
    - 	git clone full graft &&
    - 	(
    - 		cd graft &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_file .git/objects/info/commit-graph &&
    - 		H1=$(git rev-parse --verify HEAD~1) &&
    - 		H3=$(git rev-parse --verify HEAD~3) &&
    -@@ t/t5318-commit-graph.sh: test_expect_success 'commit grafts invalidate commit-graph' '
    - 		git -c core.commitGraph=false log >expect &&
    - 		git -c core.commitGraph=true log >actual &&
    - 		test_cmp expect actual &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		git -c core.commitGraph=false --no-replace-objects log >expect &&
    - 		git -c core.commitGraph=true --no-replace-objects log >actual &&
    - 		test_cmp expect actual &&
    - 		rm -rf .git/objects/info/commit-graph &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_missing .git/objects/info/commit-graph
    - 	)
    - '
    -@@ t/t5318-commit-graph.sh: test_expect_success 'replace-objects invalidates commit-graph' '
    - 	git clone --depth 2 "file://$TRASH_DIRECTORY/full" shallow &&
    - 	(
    - 		cd shallow &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_missing .git/objects/info/commit-graph &&
    - 		git fetch origin --unshallow &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_file .git/objects/info/commit-graph
    - 	)
    - '
    -@@ t/t5318-commit-graph.sh: test_expect_success 'replace-objects invalidates commit-graph' '
    -
    - test_expect_success 'git commit-graph verify' '
    - 	cd "$TRASH_DIRECTORY/full" &&
    --	git rev-parse commits/8 | git commit-graph write --stdin-commits &&
    -+	git rev-parse commits/8 | git commit-graph write --input=stdin-commits &&
    - 	git commit-graph verify >output
    - '
    -
    -@@ t/t5318-commit-graph.sh: test_expect_success 'setup non-the_repository tests' '
    - 	test_commit -C repo two &&
    - 	git -C repo config core.commitGraph true &&
    - 	git -C repo rev-parse two | \
    --		git -C repo commit-graph write --stdin-commits
    -+		git -C repo commit-graph write --input=stdin-commits
    - '
    -
    - test_expect_success 'parse_commit_in_graph works for non-the_repository' '
    -@@ t/t5318-commit-graph.sh: test_expect_success 'corrupt commit-graph write (broken parent)' '
    - 		EOF
    - 		broken="$(git hash-object -w -t commit --literally broken)" &&
    - 		git commit-tree -p "$broken" -m "good commit" "$empty" >good &&
    --		test_must_fail git commit-graph write --stdin-commits \
    -+		test_must_fail git commit-graph write --input=stdin-commits \
    - 			<good 2>test_err &&
    - 		test_i18ngrep "unable to parse commit" test_err
    - 	)
    -@@ t/t5318-commit-graph.sh: test_expect_success 'corrupt commit-graph write (missing tree)' '
    - 		EOF
    - 		broken="$(git hash-object -w -t commit --literally broken)" &&
    - 		git commit-tree -p "$broken" -m "good" "$tree" >good &&
    --		test_must_fail git commit-graph write --stdin-commits \
    -+		test_must_fail git commit-graph write --input=stdin-commits \
    - 			<good 2>test_err &&
    - 		test_i18ngrep "unable to parse commit" test_err
    - 	)

      ## t/t5324-split-commit-graph.sh ##
     @@ t/t5324-split-commit-graph.sh: test_expect_success 'create commits and write commit-graph' '
    @@ t/t5324-split-commit-graph.sh: test_expect_success 'create commits and write com
      	test_path_is_file $infodir/commit-graph &&
      	graph_read_expect 3
      '
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'add more commits, and write a new base graph' '
    - 	git reset --hard commits/4 &&
    - 	git merge commits/6 &&
    - 	git branch merge/2 &&
    --	git commit-graph write --reachable &&
    -+	git commit-graph write --input=reachable &&
    - 	graph_read_expect 12
    - '
    -
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'fork and fail to base a chain on a commit-graph file' '
    - 		rm .git/objects/info/commit-graph &&
    - 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
    - 		test_commit new-commit &&
    --		git commit-graph write --reachable --split &&
    -+		git commit-graph write --input=reachable --split &&
    - 		test_path_is_file $graphdir/commit-graph-chain &&
    - 		test_line_count = 1 $graphdir/commit-graph-chain &&
    - 		verify_chain_files_exist $graphdir
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'add three more commits, write a tip graph' '
    - 	git merge commits/5 &&
    - 	git merge merge/2 &&
    - 	git branch merge/3 &&
    --	git commit-graph write --reachable --split &&
    -+	git commit-graph write --input=reachable --split &&
    - 	test_path_is_missing $infodir/commit-graph &&
    - 	test_path_is_file $graphdir/commit-graph-chain &&
    - 	ls $graphdir/graph-*.graph >graph-files &&
    -@@ t/t5324-split-commit-graph.sh: graph_git_behavior 'split commit-graph: merge 3 vs 2' merge/3 merge/2
    - test_expect_success 'add one commit, write a tip graph' '
    - 	test_commit 11 &&
    - 	git branch commits/11 &&
    --	git commit-graph write --reachable --split &&
    -+	git commit-graph write --input=reachable --split &&
    - 	test_path_is_missing $infodir/commit-graph &&
    - 	test_path_is_file $graphdir/commit-graph-chain &&
    - 	ls $graphdir/graph-*.graph >graph-files &&
    -@@ t/t5324-split-commit-graph.sh: graph_git_behavior 'three-layer commit-graph: commit 11 vs 6' commits/11 commits
    - test_expect_success 'add one commit, write a merged graph' '
    - 	test_commit 12 &&
    - 	git branch commits/12 &&
    --	git commit-graph write --reachable --split &&
    -+	git commit-graph write --input=reachable --split &&
    - 	test_path_is_file $graphdir/commit-graph-chain &&
    - 	test_line_count = 2 $graphdir/commit-graph-chain &&
    - 	ls $graphdir/graph-*.graph >graph-files &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'create fork and chain across alternate' '
    - 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates &&
    - 		test_commit 13 &&
    - 		git branch commits/13 &&
    --		git commit-graph write --reachable --split &&
    -+		git commit-graph write --input=reachable --split &&
    - 		test_path_is_file $graphdir/commit-graph-chain &&
    - 		test_line_count = 3 $graphdir/commit-graph-chain &&
    - 		ls $graphdir/graph-*.graph >graph-files &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'create fork and chain across alternate' '
    - 		git -c core.commitGraph=false rev-list HEAD >actual &&
    - 		test_cmp expect actual &&
    - 		test_commit 14 &&
    --		git commit-graph write --reachable --split --object-dir=.git/objects/ &&
    -+		git commit-graph write --input=reachable --split --object-dir=.git/objects/ &&
    - 		test_line_count = 3 $graphdir/commit-graph-chain &&
    - 		ls $graphdir/graph-*.graph >graph-files &&
    - 		test_line_count = 1 graph-files
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'test merge stragety constants' '
    - 		git config core.commitGraph true &&
    - 		test_line_count = 2 $graphdir/commit-graph-chain &&
    - 		test_commit 14 &&
    --		git commit-graph write --reachable --split --size-multiple=2 &&
    -+		git commit-graph write --input=reachable --split --size-multiple=2 &&
    - 		test_line_count = 3 $graphdir/commit-graph-chain
    -
    - 	) &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'test merge stragety constants' '
    - 		git config core.commitGraph true &&
    - 		test_line_count = 2 $graphdir/commit-graph-chain &&
    - 		test_commit 14 &&
    --		git commit-graph write --reachable --split --size-multiple=10 &&
    -+		git commit-graph write --input=reachable --split --size-multiple=10 &&
    - 		test_line_count = 1 $graphdir/commit-graph-chain &&
    - 		ls $graphdir/graph-*.graph >graph-files &&
    - 		test_line_count = 1 graph-files
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'test merge stragety constants' '
    - 		git config core.commitGraph true &&
    - 		test_line_count = 2 $graphdir/commit-graph-chain &&
    - 		test_commit 15 &&
    --		git commit-graph write --reachable --split --size-multiple=10 --expire-time=1980-01-01 &&
    -+		git commit-graph write --input=reachable --split --size-multiple=10 --expire-time=1980-01-01 &&
    - 		test_line_count = 1 $graphdir/commit-graph-chain &&
    - 		ls $graphdir/graph-*.graph >graph-files &&
    - 		test_line_count = 3 graph-files
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'test merge stragety constants' '
    - 		test_line_count = 2 $graphdir/commit-graph-chain &&
    - 		test_commit 16 &&
    - 		test_commit 17 &&
    --		git commit-graph write --reachable --split --max-commits=1 &&
    -+		git commit-graph write --input=reachable --split --max-commits=1 &&
    - 		test_line_count = 1 $graphdir/commit-graph-chain &&
    - 		ls $graphdir/graph-*.graph >graph-files &&
    - 		test_line_count = 1 graph-files
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'remove commit-graph-chain file after flattening' '
    - 	(
    - 		cd flatten &&
    - 		test_line_count = 2 $graphdir/commit-graph-chain &&
    --		git commit-graph write --reachable &&
    -+		git commit-graph write --input=reachable &&
    - 		test_path_is_missing $graphdir/commit-graph-chain &&
    - 		ls $graphdir >graph-files &&
    - 		test_line_count = 0 graph-files
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'verify across alternates' '
    - 		echo "$altdir" >.git/objects/info/alternates &&
    - 		git commit-graph verify --object-dir="$altdir/" &&
    - 		test_commit extra &&
    --		git commit-graph write --reachable --split &&
    -+		git commit-graph write --input=reachable --split &&
    - 		tip_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph &&
    - 		corrupt_file "$tip_file" 100 "\01" &&
    - 		test_must_fail git commit-graph verify --shallow 2>test_err &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'add octopus merge' '
    - 	git reset --hard commits/10 &&
    - 	git merge commits/3 commits/4 &&
    - 	git branch merge/octopus &&
    --	git commit-graph write --reachable --split &&
    -+	git commit-graph write --input=reachable --split &&
    - 	git commit-graph verify --progress 2>err &&
    - 	test_line_count = 3 err &&
    - 	test_i18ngrep ! warning err &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'add octopus merge' '
    - graph_git_behavior 'graph exists' merge/octopus commits/12
    -
    - test_expect_success 'split across alternate where alternate is not split' '
    --	git commit-graph write --reachable &&
    -+	git commit-graph write --input=reachable &&
    - 	test_path_is_file .git/objects/info/commit-graph &&
    - 	cp .git/objects/info/commit-graph . &&
    - 	git clone --no-hardlinks . alt-split &&
    -@@ t/t5324-split-commit-graph.sh: test_expect_success 'split across alternate where alternate is not split' '
    - 		rm -f .git/objects/info/commit-graph &&
    - 		echo "$(pwd)"/../.git/objects >.git/objects/info/alternates &&
    - 		test_commit 18 &&
    --		git commit-graph write --reachable --split &&
    -+		git commit-graph write --input=reachable --split &&
    - 		test_line_count = 1 $graphdir/commit-graph-chain
    - 	) &&
    - 	test_cmp commit-graph .git/objects/info/commit-graph
    -@@ t/t5324-split-commit-graph.sh: test_expect_success '--split=merge-all always merges incrementals' '
    - 	git rev-list -3 HEAD~4 >a &&
    - 	git rev-list -2 HEAD~2 >b &&
    - 	git rev-list -2 HEAD >c &&
    --	git commit-graph write --split=no-merge --stdin-commits <a &&
    --	git commit-graph write --split=no-merge --stdin-commits <b &&
    -+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
    -+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
    - 	test_line_count = 2 $graphdir/commit-graph-chain &&
    --	git commit-graph write --split=merge-all --stdin-commits <c &&
    -+	git commit-graph write --split=merge-all --input=stdin-commits <c &&
    - 	test_line_count = 1 $graphdir/commit-graph-chain
    - '
    -
    -@@ t/t5324-split-commit-graph.sh: test_expect_success '--split=no-merge always writes an incremental' '
    - 	git reset --hard commits/2 &&
    - 	git rev-list HEAD~1 >a &&
    - 	git rev-list HEAD >b &&
    --	git commit-graph write --split --stdin-commits <a &&
    --	git commit-graph write --split=no-merge --stdin-commits <b &&
    -+	git commit-graph write --split --input=stdin-commits <a &&
    -+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
    - 	test_line_count = 2 $graphdir/commit-graph-chain
    - '
    -
3:  a5d3367788 ! 3:  4c6425f0da builtin/commit-graph.c: support '--input=none'
    @@ Metadata
      ## Commit message ##
         builtin/commit-graph.c: support '--input=none'

    -    In the previous commit, we introduced '--[no-]merge', and alluded to the
    -    fact that '--merge' would be useful for callers who wish to always
    -    trigger a merge of an incremental chain.
    +    In the previous commit, we introduced '--split=<no-merge|merge-all>',
    +    and alluded to the fact that '--split=merge-all' would be useful for
    +    callers who wish to always trigger a merge of an incremental chain.

         There is a problem with the above approach, which is that there is no
         way to specify to the commit-graph builtin that a caller only wants to
    @@ Documentation/git-commit-graph.txt: COMMANDS
      'write'::

     -Write a commit-graph file based on the commits found in packfiles.
    -+Write a commit-graph file based on the commits specified:
    -+* With the `--input=stdin-packs` option, generate the new commit graph
    -+by walking objects only in the specified pack-indexes. (Cannot be
    -+combined with `--input=stdin-commits` or `--input=reachable`.)
    ++Write a commit-graph file based on the specified sources of input:
      +
    --With the `--input=stdin-packs` option, generate the new commit graph by
    --walking objects only in the specified pack-indexes. (Cannot be combined
    --with `--input=stdin-commits` or `--input=reachable`.)
    --+
    --With the `--input=stdin-commits` option, generate the new commit graph
    -+* With the `--input=stdin-commits` option, generate the new commit graph
    - by walking commits starting at the commits specified in stdin as a list
    - of OIDs in hex, one OID per line. (Cannot be combined with
    - `--input=stdin-packs` or `--input=reachable`.)
    - +
    --With the `--input=reachable` option, generate the new commit graph by
    -+* With the `--input=reachable` option, generate the new commit graph by
    - walking commits starting at all refs. (Cannot be combined with
    - `--input=stdin-commits` or `--input=stdin-packs`.)
    - +
    --With the `--input=append` option, include all commits that are present
    -+* With the `--input=append` option, include all commits that are present
    + With the `--input=stdin-packs` option, generate the new commit graph by
    + walking objects only in the specified pack-indexes. (Cannot be combined
    +@@ Documentation/git-commit-graph.txt: walking commits starting at all refs. (Cannot be combined with
    + With the `--input=append` option, include all commits that are present
      in the existing commit-graph file.
      +
    -+* With the `--input=none` option, behave as if `input=append` were
    ++With the `--input=none` option, behave as if `--input=append` were
     +given, but do not walk other packs to find additional commits.
     +
    -+If none of the above options are given, then commits found in
    -+packfiles are specified.
    ++If none of the above options are given, then generate the new
    ++commit-graph by walking over all pack-indexes.
     ++
      With the `--split[=<strategy>]` option, write the commit-graph as a
      chain of multiple commit-graph files stored in
      `<dir>/info/commit-graphs`. Commit-graph layers are merged based on the

      ## builtin/commit-graph.c ##
    -@@
    - static char const * const builtin_commit_graph_usage[] = {
    +@@ builtin/commit-graph.c: static char const * const builtin_commit_graph_usage[] = {
      	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
    - 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
    --	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
    -+	   "[--split[=<strategy>]] "
    -+	   "[--input=<reachable|stdin-packs|stdin-commits|none>] "
    + 	N_("git commit-graph write [--object-dir <objdir>] "
    + 	   "[--split[=<strategy>]] "
    +-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
    ++	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
      	   "[--[no-]progress] <split options>"),
      	NULL
      };
     @@ builtin/commit-graph.c: static const char * const builtin_commit_graph_verify_usage[] = {
    -
      static const char * const builtin_commit_graph_write_usage[] = {
    - 	N_("git commit-graph write [--object-dir <objdir>] [--append] "
    --	   "[--split[=<strategy>]] [--input=<reachable|stdin-packs|stdin-commits>] "
    -+	   "[--split[=<strategy>]] "
    -+	   "[--input=<reachable|stdin-packs|stdin-commits|none>] "
    + 	N_("git commit-graph write [--object-dir <objdir>] "
    + 	   "[--split[=<strategy>]] "
    +-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
    ++	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
      	   "[--[no-]progress] <split options>"),
      	NULL
      };
--
2.25.0.119.gaa12b7378b

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
@ 2020-02-05  0:28   ` Taylor Blau
  2020-02-06 19:41     ` Martin Ågren
  2020-02-05  0:28   ` [PATCH v2 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-02-05  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.

On occasion, callers may want to ensure that 'git commit-graph write
--split' always writes an incremental, and never spends effort
condensing the incremental chain [1]. Previously, this was possible by
passing '--size-multiple=0', but this is no longer the case following
63020f175f (commit-graph: prefer default size_mult when given zero,
2020-01-02).
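
To illustrate why '--size-multiple=0' used to suppress merging, here is
a minimal sketch that is not part of this patch. It assumes the size
test has the same shape as the one in split_graph_merge_strategy()
(tip commits <= size_mult * new commits); the helper names
size_test_wants_merge() and effective_size_mult() are hypothetical and
exist only for this illustration.

```c
/*
 * Hypothetical helper: mirrors only the size comparison used when
 * deciding whether to fold the current tip into the new layer.
 */
static int size_test_wants_merge(int tip_commits, int new_commits,
				 int size_mult)
{
	return tip_commits <= size_mult * new_commits;
}

/*
 * Hypothetical helper: models the behavior after 63020f175f, where a
 * size multiple of zero falls back to the default of 2.
 */
static int effective_size_mult(int requested)
{
	return requested ? requested : 2;
}
```

With a multiple of zero taken literally, the right-hand side is zero,
so the size test can never hold for a non-empty tip. Once zero is mapped
to the default of 2, the same input can trigger a merge again, which is
why an explicit strategy is needed.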

Reintroduce a less-magical variant of the above with a new pair of
arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
'--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain and will always write a new incremental.
Conversely, if '--split=merge-all' is given, any invocation including it
will always condense a chain if one exists.  If '--split' is given with
no arguments, it behaves as before and defers to '--size-multiple', and
so on.
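
As a rough illustration of the three behaviors, the sketch below models
the chain as a plain array of per-layer commit counts. It is a
simplification under stated assumptions, not the code in this patch:
layers_after_write() is a hypothetical name, the chain representation
is invented for the example, and the check that stops merging across
object directories (g->odb != ctx->odb) is left out. Only the enum
names and values are taken from the patch.

```c
#include <stddef.h>

/* Same names and values as the enum this patch adds to commit-graph.h. */
enum commit_graph_split_flags {
	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
};

/*
 * Hypothetical stand-in for the merge loop. layer_commits[0] is the
 * base layer and layer_commits[nr - 1] is the current tip. Returns the
 * number of layers the chain would have after writing new_commits.
 */
static size_t layers_after_write(const int *layer_commits, size_t nr,
				 int new_commits, int size_mult,
				 int max_commits,
				 enum commit_graph_split_flags flags)
{
	size_t after = nr + 1;	/* start by assuming one new layer */
	int num_commits = new_commits;

	/* no-merge: always write a new incremental on top */
	if (flags == COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED)
		return after;

	/* fold layers into the new one, from the tip towards the base */
	while (nr > 0 &&
	       (layer_commits[nr - 1] <= size_mult * num_commits ||
		(max_commits && num_commits > max_commits) ||
		flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED)) {
		num_commits += layer_commits[nr - 1];
		nr--;
		after--;
	}

	return after;
}
```

For a two-layer chain holding 100 and 10 commits and one new commit,
this model yields three layers for a bare '--split' and for
'--split=no-merge', and a single layer for '--split=merge-all'.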

[1]: This might occur when, for example, a server administrator running
some program after each push wants each job to take time proportional
to the size of the push, and does not want the running time to "jump"
when the commit-graph machinery decides to trigger a merge.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 16 +++++++++-----
 builtin/commit-graph.c             | 35 ++++++++++++++++++++++++++----
 commit-graph.c                     | 22 ++++++++++++-------
 commit-graph.h                     |  7 ++++++
 t/t5324-split-commit-graph.sh      | 25 +++++++++++++++++++++
 5 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 28d1fee505..b7fe65ef21 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -57,11 +57,17 @@ or `--stdin-packs`.)
 With the `--append` option, include all commits that are present in the
 existing commit-graph file.
 +
-With the `--split` option, write the commit-graph as a chain of multiple
-commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
-not already in the commit-graph are added in a new "tip" file. This file
-is merged with the existing file if the following merge conditions are
-met:
+With the `--split[=<strategy>]` option, write the commit-graph as a
+chain of multiple commit-graph files stored in
+`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
+strategy and other splitting options. The new commits not already in the
+commit-graph are added in a new "tip" file. This file is merged with the
+existing file if the following merge conditions are met:
+* If `--split=merge-all` is specified, then a merge is always
+conducted, and the remaining options are ignored. Conversely, if
+`--split=no-merge` is specified, a merge is never performed, and the
+remaining options are ignored. A bare `--split` defers to the remaining
+options.
 +
 * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
 tip file would have `N` commits and the previous tip has `M` commits and
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 4a70b33fb5..4d3c1c46c2 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -9,7 +9,9 @@
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -19,7 +21,9 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -111,6 +115,27 @@ static int graph_verify(int argc, const char **argv)
 extern int read_replace_refs;
 static struct split_commit_graph_opts split_opts;
 
+static int write_option_parse_split(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum commit_graph_split_flags *flags = opt->value;
+
+	opts.split = 1;
+	if (!arg) {
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
+		return 0;
+	}
+
+	if (!strcmp(arg, "merge-all"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_REQUIRED;
+	else if (!strcmp(arg, "no-merge"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED;
+	else
+		die(_("unrecognized --split argument, %s"), arg);
+
+	return 0;
+}
+
 static int graph_write(int argc, const char **argv)
 {
 	struct string_list *pack_indexes = NULL;
@@ -133,8 +158,10 @@ static int graph_write(int argc, const char **argv)
 		OPT_BOOL(0, "append", &opts.append,
 			N_("include all commits already in the commit-graph file")),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
-		OPT_BOOL(0, "split", &opts.split,
-			N_("allow writing an incremental commit-graph file")),
+		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
+			N_("allow writing an incremental commit-graph file"),
+			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
+			write_option_parse_split),
 		OPT_INTEGER(0, "max-commits", &split_opts.max_commits,
 			N_("maximum number of commits in a non-base split commit-graph")),
 		OPT_INTEGER(0, "size-multiple", &split_opts.size_multiple,
diff --git a/commit-graph.c b/commit-graph.c
index 656dd647d5..3a5cb23cd7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1533,27 +1533,33 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 
 	int max_commits = 0;
 	int size_mult = 2;
+	enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
 
 	if (ctx->split_opts) {
 		max_commits = ctx->split_opts->max_commits;
 
 		if (ctx->split_opts->size_multiple)
 			size_mult = ctx->split_opts->size_multiple;
+
+		flags = ctx->split_opts->flags;
 	}
 
 	g = ctx->r->objects->commit_graph;
 	num_commits = ctx->commits.nr;
 	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
 
-	while (g && (g->num_commits <= size_mult * num_commits ||
-		    (max_commits && num_commits > max_commits))) {
-		if (g->odb != ctx->odb)
-			break;
+	if (flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
+		while (g && (g->num_commits <= size_mult * num_commits ||
+			    (max_commits && num_commits > max_commits) ||
+			    (flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
+			if (g->odb != ctx->odb)
+				break;
 
-		num_commits += g->num_commits;
-		g = g->base_graph;
+			num_commits += g->num_commits;
+			g = g->base_graph;
 
-		ctx->num_commit_graphs_after--;
+			ctx->num_commit_graphs_after--;
+		}
 	}
 
 	ctx->new_base_graph = g;
@@ -1861,7 +1867,7 @@ int write_commit_graph(struct object_directory *odb,
 		goto cleanup;
 	}
 
-	if (!ctx->commits.nr)
+	if (!ctx->commits.nr && (!ctx->split_opts || ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))
 		goto cleanup;
 
 	if (ctx->split) {
diff --git a/commit-graph.h b/commit-graph.h
index e87a6f6360..65a7d2edae 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -82,10 +82,17 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
 };
 
+enum commit_graph_split_flags {
+	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
+	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
+	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
+};
+
 struct split_commit_graph_opts {
 	int size_multiple;
 	int max_commits;
 	timestamp_t expire_time;
+	enum commit_graph_split_flags flags;
 };
 
 /*
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index c24823431f..a165b48afe 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -344,4 +344,29 @@ test_expect_success 'split across alternate where alternate is not split' '
 	test_cmp commit-graph .git/objects/info/commit-graph
 '
 
+test_expect_success '--split=merge-all always merges incrementals' '
+	test_when_finished rm -rf a b c &&
+	rm -rf $graphdir $infodir/commit-graph &&
+	git reset --hard commits/10 &&
+	git rev-list -3 HEAD~4 >a &&
+	git rev-list -2 HEAD~2 >b &&
+	git rev-list -2 HEAD >c &&
+	git commit-graph write --split=no-merge --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --stdin-commits <c &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
+test_expect_success '--split=no-merge always writes an incremental' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list HEAD~1 >a &&
+	git rev-list HEAD >b &&
+	git commit-graph write --split --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.119.gaa12b7378b
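
For readers following the change to split_graph_merge_strategy() in the
patch above, here is a minimal standalone sketch of the merge decision.
The names `struct layer`, `enum split_flags` and `layers_after_write()`
are hypothetical stand-ins for Git's commit-graph chain and write
context, and the check that a layer lives in the same object directory
is left out for brevity.

```c
#include <stddef.h>

enum split_flags {
	SPLIT_MERGE_AUTO = 0,
	SPLIT_MERGE_REQUIRED = 1,
	SPLIT_MERGE_PROHIBITED = 2
};

/* Hypothetical stand-in for one layer of the commit-graph chain. */
struct layer {
	int num_commits;
	struct layer *base; /* next-older layer, or NULL at the base */
};

/*
 * Return how many layers the chain would have after writing
 * `new_commits` on top of `tip`, given `layers_before` existing layers.
 */
int layers_after_write(struct layer *tip, int layers_before,
		       int new_commits, int size_mult,
		       int max_commits, enum split_flags flags)
{
	struct layer *g = tip;
	int num_commits = new_commits;
	int layers_after = layers_before + 1;

	/* "no-merge": always write a new incremental on top. */
	if (flags == SPLIT_MERGE_PROHIBITED)
		return layers_after;

	/*
	 * Fold older layers into the new one while the size heuristics
	 * ask for it, or unconditionally when "merge-all" was requested.
	 */
	while (g && (g->num_commits <= size_mult * num_commits ||
		     (max_commits && num_commits > max_commits) ||
		     flags == SPLIT_MERGE_REQUIRED)) {
		num_commits += g->num_commits;
		g = g->base;
		layers_after--;
	}
	return layers_after;
}
```

With a two-layer chain of 100 and 10 commits and 2 new commits, this
sketch yields three layers under the automatic strategy, one layer
under "merge-all", and three layers under "no-merge".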



* [PATCH v2 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
  2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
@ 2020-02-05  0:28   ` Taylor Blau
  2020-02-05  0:28   ` [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
  2020-02-05 20:07   ` [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
  3 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-05  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

The 'write' mode of the 'commit-graph' builtin supports input from a
number of different sources: pack indexes over stdin, commits over
stdin, commits reachable from all references, and so on. Each of these
sources is specified with its own option: '--stdin-packs',
'--stdin-commits', etc.

Similar to our replacement of 'git config [--<type>]' with 'git config
[--type=<type>]' (cf. fb0dc3bac1 (builtin/config.c: support
`--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
deprecate '[--<input>]' in favor of '[--input=<source>]'.

This makes it clearer how to implement new options that are combinations
of other options (such as, for example, "none", a combination of the old
"--append" and a new sentinel to specify to _not_ look in other packs,
which we will implement in a future patch).

Unfortunately, the new enumerated type is a bitfield, even though it
makes much more sense as '0, 1, 2, ...'. Even though *almost* all
options are pairwise exclusive, '--stdin-{packs,commits}' *is*
compatible with '--append'. For this reason, use a bitfield.
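
To illustrate why bit flags fit here, the following is a minimal sketch
of the exclusivity check, not the patch's actual code. The shortened
names (`enum input_source`, `input_sources_valid()`) are hypothetical;
the bit positions mirror the enum introduced below.

```c
enum input_source {
	INPUT_REACHABLE     = (1 << 1),
	INPUT_STDIN_PACKS   = (1 << 2),
	INPUT_STDIN_COMMITS = (1 << 3),
	INPUT_APPEND        = (1 << 4)
};

/* Return 1 if the combination of sources is allowed, 0 otherwise. */
int input_sources_valid(unsigned input)
{
	/* Count how many of the mutually exclusive sources are set. */
	int exclusive = !!(input & INPUT_REACHABLE) +
			!!(input & INPUT_STDIN_PACKS) +
			!!(input & INPUT_STDIN_COMMITS);

	/* INPUT_APPEND may accompany any single source, or none. */
	return exclusive <= 1;
}
```

A plain '0, 1, 2, ...' enumeration could not represent
'stdin-commits' together with 'append' in a single value, which is the
case the sketch accepts.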

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 26 +++++-----
 builtin/commit-graph.c             | 83 +++++++++++++++++++++---------
 t/t5318-commit-graph.sh            |  4 +-
 t/t5324-split-commit-graph.sh      |  2 +-
 4 files changed, 76 insertions(+), 39 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index b7fe65ef21..2ae9de679a 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -41,21 +41,21 @@ COMMANDS
 
 Write a commit-graph file based on the commits found in packfiles.
 +
-With the `--stdin-packs` option, generate the new commit graph by
+With the `--input=stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
-with `--stdin-commits` or `--reachable`.)
+with `--input=stdin-commits` or `--input=reachable`.)
 +
-With the `--stdin-commits` option, generate the new commit graph by
-walking commits starting at the commits specified in stdin as a list
+With the `--input=stdin-commits` option, generate the new commit graph
+by walking commits starting at the commits specified in stdin as a list
 of OIDs in hex, one OID per line. (Cannot be combined with
-`--stdin-packs` or `--reachable`.)
+`--input=stdin-packs` or `--input=reachable`.)
 +
-With the `--reachable` option, generate the new commit graph by walking
-commits starting at all refs. (Cannot be combined with `--stdin-commits`
-or `--stdin-packs`.)
+With the `--input=reachable` option, generate the new commit graph by
+walking commits starting at all refs. (Cannot be combined with
+`--input=stdin-commits` or `--input=stdin-packs`.)
 +
-With the `--append` option, include all commits that are present in the
-existing commit-graph file.
+With the `--input=append` option, include all commits that are present
+in the existing commit-graph file.
 +
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
@@ -105,20 +105,20 @@ $ git commit-graph write
   using commits in `<pack-index>`.
 +
 ------------------------------------------------
-$ echo <pack-index> | git commit-graph write --stdin-packs
+$ echo <pack-index> | git commit-graph write --input=stdin-packs
 ------------------------------------------------
 
 * Write a commit-graph file containing all reachable commits.
 +
 ------------------------------------------------
-$ git show-ref -s | git commit-graph write --stdin-commits
+$ git show-ref -s | git commit-graph write --input=stdin-commits
 ------------------------------------------------
 
 * Write a commit-graph file containing all commits in the current
   commit-graph file along with those reachable from `HEAD`.
 +
 ------------------------------------------------
-$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
+$ git rev-parse HEAD | git commit-graph write --input=stdin-commits --input=append
 ------------------------------------------------
 
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 4d3c1c46c2..0ff25896d0 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -9,8 +9,9 @@
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	N_("git commit-graph write [--object-dir <objdir>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -21,18 +22,23 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	N_("git commit-graph write [--object-dir <objdir>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
+enum commit_graph_input {
+	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
+	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
+	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+};
+
 static struct opts_commit_graph {
 	const char *obj_dir;
-	int reachable;
-	int stdin_packs;
-	int stdin_commits;
-	int append;
+	enum commit_graph_input input;
 	int split;
 	int shallow;
 	int progress;
@@ -57,6 +63,28 @@ static struct object_directory *find_odb(struct repository *r,
 	return odb;
 }
 
+static int option_parse_input(const struct option *opt, const char *arg,
+			      int unset)
+{
+	enum commit_graph_input *to = opt->value;
+	if (unset || !strcmp(arg, "packs")) {
+		*to = 0;
+		return 0;
+	}
+
+	if (!strcmp(arg, "reachable"))
+		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
+	else if (!strcmp(arg, "stdin-packs"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
+	else if (!strcmp(arg, "stdin-commits"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
+	else if (!strcmp(arg, "append"))
+		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else
+		die(_("unrecognized --input source, %s"), arg);
+	return 0;
+}
+
 static int graph_verify(int argc, const char **argv)
 {
 	struct commit_graph *graph = NULL;
@@ -149,14 +177,21 @@ static int graph_write(int argc, const char **argv)
 		OPT_STRING(0, "object-dir", &opts.obj_dir,
 			N_("dir"),
 			N_("The object directory to store the graph")),
-		OPT_BOOL(0, "reachable", &opts.reachable,
-			N_("start walk at all refs")),
-		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
-			N_("scan pack-indexes listed by stdin for commits")),
-		OPT_BOOL(0, "stdin-commits", &opts.stdin_commits,
-			N_("start walk at commits listed by stdin")),
-		OPT_BOOL(0, "append", &opts.append,
-			N_("include all commits already in the commit-graph file")),
+		OPT_CALLBACK(0, "input", &opts.input, NULL,
+			N_("include commits from this source in the graph"),
+			option_parse_input),
+		OPT_BIT(0, "reachable", &opts.input,
+			N_("start walk at all refs"),
+			COMMIT_GRAPH_INPUT_REACHABLE),
+		OPT_BIT(0, "stdin-packs", &opts.input,
+			N_("scan pack-indexes listed by stdin for commits"),
+			COMMIT_GRAPH_INPUT_STDIN_PACKS),
+		OPT_BIT(0, "stdin-commits", &opts.input,
+			N_("start walk at commits listed by stdin"),
+			COMMIT_GRAPH_INPUT_STDIN_COMMITS),
+		OPT_BIT(0, "append", &opts.input,
+			N_("include all commits already in the commit-graph file"),
+			COMMIT_GRAPH_INPUT_APPEND),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
 		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
 			N_("allow writing an incremental commit-graph file"),
@@ -182,11 +217,13 @@ static int graph_write(int argc, const char **argv)
 			     builtin_commit_graph_write_options,
 			     builtin_commit_graph_write_usage, 0);
 
-	if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1)
-		die(_("use at most one of --reachable, --stdin-commits, or --stdin-packs"));
+	if ((!!(opts.input & COMMIT_GRAPH_INPUT_REACHABLE) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS)) > 1)
+		die(_("use at most one of --input=reachable, --input=stdin-commits, or --input=stdin-packs"));
 	if (!opts.obj_dir)
 		opts.obj_dir = get_object_directory();
-	if (opts.append)
+	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
@@ -196,22 +233,22 @@ static int graph_write(int argc, const char **argv)
 	read_replace_refs = 0;
 	odb = find_odb(the_repository, opts.obj_dir);
 
-	if (opts.reachable) {
+	if (opts.input & COMMIT_GRAPH_INPUT_REACHABLE) {
 		if (write_commit_graph_reachable(odb, flags, &split_opts))
 			return 1;
 		return 0;
 	}
 
 	string_list_init(&lines, 0);
-	if (opts.stdin_packs || opts.stdin_commits) {
+	if (opts.input & (COMMIT_GRAPH_INPUT_STDIN_PACKS | COMMIT_GRAPH_INPUT_STDIN_COMMITS)) {
 		struct strbuf buf = STRBUF_INIT;
 
 		while (strbuf_getline(&buf, stdin) != EOF)
 			string_list_append(&lines, strbuf_detach(&buf, NULL));
 
-		if (opts.stdin_packs)
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS)
 			pack_indexes = &lines;
-		if (opts.stdin_commits) {
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS) {
 			commit_hex = &lines;
 			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
 		}
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 0bf98b56ec..786b5f73ef 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -227,7 +227,7 @@ graph_git_behavior 'cleared graph, commit 8 vs merge 2' full commits/8 merge/2
 
 test_expect_success 'build graph from latest pack with closure' '
 	cd "$TRASH_DIRECTORY/full" &&
-	cat new-idx | git commit-graph write --stdin-packs &&
+	cat new-idx | git commit-graph write --input=stdin-packs &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "9" "extra_edges"
 '
@@ -240,7 +240,7 @@ test_expect_success 'build graph from commits with closure' '
 	git tag -a -m "merge" tag/merge merge/2 &&
 	git rev-parse tag/merge >commits-in &&
 	git rev-parse merge/1 >>commits-in &&
-	cat commits-in | git commit-graph write --stdin-commits &&
+	cat commits-in | git commit-graph write --input=stdin-commits &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "6"
 '
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index a165b48afe..353523eca4 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -35,7 +35,7 @@ test_expect_success 'create commits and write commit-graph' '
 		test_commit $i &&
 		git branch commits/$i || return 1
 	done &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_path_is_file $infodir/commit-graph &&
 	graph_read_expect 3
 '
-- 
2.25.0.119.gaa12b7378b



* [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
  2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
  2020-02-05  0:28   ` [PATCH v2 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
@ 2020-02-05  0:28   ` Taylor Blau
  2020-02-06 19:50     ` Martin Ågren
  2020-02-05 20:07   ` [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
  3 siblings, 1 reply; 58+ messages in thread
From: Taylor Blau @ 2020-02-05  0:28 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

Earlier in this series, we introduced '--split=<no-merge|merge-all>',
and alluded to the fact that '--split=merge-all' would be useful for
callers who wish to always trigger a merge of an incremental chain.

There is a problem with the above approach, which is that there is no
way to specify to the commit-graph builtin that a caller only wants to
include commits already in the graph. One can specify '--input=append'
to include all commits in the existing graphs, but the absence of
'--input=stdin-{commits,packs}' causes the builtin to call
'fill_oids_from_all_packs()'.

Passing '--input=reachable' (as in 'git commit-graph write
--split=merge-all --input=reachable --input=append') works around this
issue by making '--input=reachable' effectively a no-op, but this can be
prohibitively expensive in large repositories, making it an undesirable
choice for some users.

Teach '--input=none' as an option to behave as if '--input=append' were
given, but to consider no other sources in addition.

This, in conjunction with the '--split=merge-all' option introduced
earlier in this series, offers a convenient way to force the
commit-graph machinery to condense a chain of incrementals without
requiring any new commits:

  $ git commit-graph write --split=merge-all --input=none
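
As a rough sketch of the gating described above (assuming simplified,
hypothetical names; the flag values are illustrative and the real logic
lives in write_commit_graph()), '--input=none' implies the append flag
plus a new "no input" flag, and the latter suppresses the fallback scan
of all packs:

```c
enum write_flags {
	WRITE_APPEND   = (1 << 0),
	WRITE_NO_INPUT = (1 << 4)
};

/* Flags that '--input=none' is described as implying. */
unsigned flags_for_input_none(void)
{
	return WRITE_APPEND | WRITE_NO_INPUT;
}

/*
 * Return 1 if the writer should fall back to scanning every pack,
 * i.e. no explicit source was given and the scan was not suppressed.
 */
int should_scan_all_packs(unsigned flags, int have_pack_indexes,
			  int have_commit_hex)
{
	return !(flags & WRITE_NO_INPUT) &&
	       !have_pack_indexes && !have_commit_hex;
}
```

Note that plain '--input=append' on its own still falls through to the
full scan in this model, which is the cost that '--input=none' avoids.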

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt |  8 +++++++-
 builtin/commit-graph.c             | 11 ++++++++---
 commit-graph.c                     |  6 ++++--
 commit-graph.h                     |  3 ++-
 t/t5324-split-commit-graph.sh      | 26 ++++++++++++++++++++++++++
 5 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 2ae9de679a..633cfbe023 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -39,7 +39,7 @@ COMMANDS
 --------
 'write'::
 
-Write a commit-graph file based on the commits found in packfiles.
+Write a commit-graph file based on the specified sources of input:
 +
 With the `--input=stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
@@ -57,6 +57,12 @@ walking commits starting at all refs. (Cannot be combined with
 With the `--input=append` option, include all commits that are present
 in the existing commit-graph file.
 +
+With the `--input=none` option, behave as if `--input=append` were
+given, but do not walk other packs to find additional commits.
+
+If none of the above options are given, then generate the new
+commit-graph by walking over all pack-indexes.
++
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
 `<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 0ff25896d0..a71af88815 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -11,7 +11,7 @@ static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
 	N_("git commit-graph write [--object-dir <objdir>] "
 	   "[--split[=<strategy>]] "
-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -24,7 +24,7 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 static const char * const builtin_commit_graph_write_usage[] = {
 	N_("git commit-graph write [--object-dir <objdir>] "
 	   "[--split[=<strategy>]] "
-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -33,7 +33,8 @@ enum commit_graph_input {
 	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
 	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
 	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
-	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
+	COMMIT_GRAPH_INPUT_NONE          = (1 << 5)
 };
 
 static struct opts_commit_graph {
@@ -80,6 +81,8 @@ static int option_parse_input(const struct option *opt, const char *arg,
 		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
 	else if (!strcmp(arg, "append"))
 		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else if (!strcmp(arg, "none"))
+		*to |= (COMMIT_GRAPH_INPUT_APPEND | COMMIT_GRAPH_INPUT_NONE);
 	else
 		die(_("unrecognized --input source, %s"), arg);
 	return 0;
@@ -225,6 +228,8 @@ static int graph_write(int argc, const char **argv)
 		opts.obj_dir = get_object_directory();
 	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
+	if (opts.input & COMMIT_GRAPH_INPUT_NONE)
+		flags |= COMMIT_GRAPH_WRITE_NO_INPUT;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
 	if (opts.progress)
diff --git a/commit-graph.c b/commit-graph.c
index 3a5cb23cd7..417b7eac9c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -788,7 +788,8 @@ struct write_commit_graph_context {
 	unsigned append:1,
 		 report_progress:1,
 		 split:1,
-		 check_oids:1;
+		 check_oids:1,
+		 no_input:1;
 
 	const struct split_commit_graph_opts *split_opts;
 };
@@ -1785,6 +1786,7 @@ int write_commit_graph(struct object_directory *odb,
 	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
 	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
 	ctx->split_opts = split_opts;
+	ctx->no_input = flags & COMMIT_GRAPH_WRITE_NO_INPUT ? 1 : 0;
 
 	if (ctx->split) {
 		struct commit_graph *g;
@@ -1843,7 +1845,7 @@ int write_commit_graph(struct object_directory *odb,
 			goto cleanup;
 	}
 
-	if (!pack_indexes && !commit_hex)
+	if (!ctx->no_input && !pack_indexes && !commit_hex)
 		fill_oids_from_all_packs(ctx);
 
 	close_reachable(ctx);
diff --git a/commit-graph.h b/commit-graph.h
index 65a7d2edae..df7f3f5961 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -79,7 +79,8 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
 	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
 	/* Make sure that each OID in the input is a valid commit OID. */
-	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
+	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3),
+	COMMIT_GRAPH_WRITE_NO_INPUT   = (1 << 4)
 };
 
 enum commit_graph_split_flags {
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 353523eca4..e3f317a1f4 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -369,4 +369,30 @@ test_expect_success '--split=no-merge always writes an incremental' '
 	test_line_count = 2 $graphdir/commit-graph-chain
 '
 
+test_expect_success '--split=no-merge, --input=none writes nothing' '
+	test_when_finished rm -rf a graphs.before graphs.after &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	ls $graphdir/graph-*.graph >graphs.before &&
+	test_line_count = 1 $graphdir/commit-graph-chain &&
+	git commit-graph write --split --input=none &&
+	ls $graphdir/graph-*.graph >graphs.after &&
+	test_cmp graphs.before graphs.after
+'
+
+test_expect_success '--split=merge-all, --input=none merges the chain' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git rev-list -1 HEAD >b &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --input=none &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.119.gaa12b7378b


* Re: [PATCH 0/3] builtin/commit-graph.c: new split/merge options
  2020-02-04 23:44 ` Junio C Hamano
@ 2020-02-05  0:30   ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-05  0:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff, dstolee

Hi Junio,

On Tue, Feb 04, 2020 at 03:44:04PM -0800, Junio C Hamano wrote:
> As the topic this depends on has been updated, I tried to rebase
> this on top, but I am seeing segfaults in tests.  We'd probably need
> a fresh round of this one to replace it.

Sure, although in fairness this test failure is unique to this series.
That said, I have sent a version of these three patches rebased on the
latest from the 'tb/commit-graph-use-odb.v2' topic.

> Thanks.

Thanks for taking a look at it.

Thanks,
Taylor


* Re: [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
                     ` (2 preceding siblings ...)
  2020-02-05  0:28   ` [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-02-05 20:07   ` Junio C Hamano
  3 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2020-02-05 20:07 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, martin.agren

Taylor Blau <me@ttaylorr.com> writes:

> Here is an updated 'v2' of my series to introduce new splitting and
> merging options to the 'commit-graph write' builtin.
>
> These patches are updated to be based on the latest changes in the
> series upon which this is based (tb/commit-graph-use-odb), and contain
> some other fixes that I picked up during the last round of review. For
> convenience, I included a range-diff against 'v1' below.
>
> Thanks as always for your review.

Thanks, replaced.


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-04  4:06     ` Taylor Blau
@ 2020-02-06 19:15       ` Martin Ågren
  2020-02-09 23:27         ` Taylor Blau
  0 siblings, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-02-06 19:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Tue, 4 Feb 2020 at 05:06, Taylor Blau <me@ttaylorr.com> wrote:
>
> > Or can you convince me otherwise? From which angle should I look at
> > this?
>
> Heh. This is all very reminiscent of an off-list discussion that I had
> with Peff and Stolee before sending this upstream. Originally, I had
> implemented this as:
>
>   $ git commit-graph write --split --[no-]merge
>
> but we decided that this '--merge' and '--no-merge' requiring '--split'
> seemed to indicate that this was better off as an argument to '--split'.
> Of course, there's no getting around that it is... odd to say
> '--split=no-merge' for exactly the reason you suggest.
>
> Here's another way of looking at it: the presence of '--split' means
> "work with split graph files" and the '[no-]merge' argument means:
> "always/never condense multiple layers".
>
> For me, this not only makes the new option language jibe, but makes it
> clearer to me than the combination of '--split', '--split --no-merge'
> and '--split --merge', where the third one is truly bizarre. At least
> condensing the second '--' and making 'merge' an argument to 'split'
> makes it clear that the two work together somehow.

Yes, "--split --merge" sounds no better. :-)

> If you have a different suggestion, I'd certainly love to hear about it
> and discuss. But, at least as far as our internal discussions have gone,
> this is by far the best option that we have been able to come up with.

I can't come up with anything better, so please feel free to carry on
(as you already have).


> > > -               OPT_BOOL(0, "split", &opts.split,
> > > -                       N_("allow writing an incremental commit-graph file")),
> > > +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> > > +                       N_("allow writing an incremental commit-graph file"),
> >
> > This still sounds very boolean. Cramming in the "strategy" might be hard
> > -- is this an argument in favor of having two separate options? ;-)
>
> Heh. Exactly how we started these patches when I originally wrote
> them...

You left this as-is in v2. I don't have any immediate improvements to
offer. I could see shortening the original to "use the 'split' file
format", in which case maybe one could allude to a strategy here. (I
don't think "allow" is really needed, right? Maybe it tries to cover for
the situation where there's no commit graph yet, so you might say we
wouldn't write an "incremental" one, but the format would still be the
same, AFAIU. Anyway, that's outside the scope of this patch.)

Martin


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
@ 2020-02-06 19:41     ` Martin Ågren
  2020-02-07 15:48       ` Derrick Stolee
  2020-02-09 23:30       ` Taylor Blau
  0 siblings, 2 replies; 58+ messages in thread
From: Martin Ågren @ 2020-02-06 19:41 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index 28d1fee505..b7fe65ef21 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -57,11 +57,17 @@ or `--stdin-packs`.)
>  With the `--append` option, include all commits that are present in the
>  existing commit-graph file.
>  +
> -With the `--split` option, write the commit-graph as a chain of multiple
> -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> -not already in the commit-graph are added in a new "tip" file. This file
> -is merged with the existing file if the following merge conditions are
> -met:
> +With the `--split[=<strategy>]` option, write the commit-graph as a
> +chain of multiple commit-graph files stored in
> +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> +strategy and other splitting options. The new commits not already in the
> +commit-graph are added in a new "tip" file. This file is merged with the
> +existing file if the following merge conditions are met:

Please add a lone "+" here.

> +* If `--split=merge-all` is specified, then a merge is always
> +conducted, and the remaining options are ignored. Conversely, if
> +`--split=no-merge` is specified, a merge is never performed, and the
> +remaining options are ignored. A bare `--split` defers to the remaining
> +options.
>  +

Similar to this existing one here. There's some minor misrendering here
otherwise.

>  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
>  tip file would have `N` commits and the previous tip has `M` commits and

> -               OPT_BOOL(0, "split", &opts.split,
> -                       N_("allow writing an incremental commit-graph file")),
> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> +                       N_("allow writing an incremental commit-graph file"),
> +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> +                       write_option_parse_split),


I keep getting back to this -- sorry! So this actually forbids
"--no-split", which used to work before. Unfortunate?

I have to ask, what is the long-term plan for the two formats (split and
non-split)? As I understand it, and I might well be wrong, the non-split
format came first and the split format was a user-experience
improvement. Should we expect that `--split` becomes the default? In
which case `--no-split` would be needed. Or might the non-split format
go away entirely, leaving `--split` a no-op and `--split=<strategy>` a
pretty funky way of choosing a strategy for the one-and-only file
format?

To try to be concrete, here's a suggestion: `--format=split` and
`--split-strategy=<strategy>`.

Martin


* Re: [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-05  0:28   ` [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-02-06 19:50     ` Martin Ågren
  2020-02-09 23:32       ` Taylor Blau
  0 siblings, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-02-06 19:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> @@ -57,6 +57,12 @@ walking commits starting at all refs. (Cannot be combined with
>  With the `--input=append` option, include all commits that are present
>  in the existing commit-graph file.
>  +
> +With the `--input=none` option, behave as if `--input=append` were
> +given, but do not walk other packs to find additional commits.
> +

Similar to my comment in patch 1/3. Please add a "+" instead of an empty
line. This one actually trips up the rendering of a lot of what follows
quite a bit.

> +If none of the above options are given, then generate the new
> +commit-graph by walking over all pack-indexes.
> ++

This one's good.

Martin


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-06 19:41     ` Martin Ågren
@ 2020-02-07 15:48       ` Derrick Stolee
  2020-02-09 23:32         ` Taylor Blau
  2020-02-12  6:03         ` Martin Ågren
  2020-02-09 23:30       ` Taylor Blau
  1 sibling, 2 replies; 58+ messages in thread
From: Derrick Stolee @ 2020-02-07 15:48 UTC (permalink / raw)
  To: Martin Ågren, Taylor Blau
  Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On 2/6/2020 2:41 PM, Martin Ågren wrote:
> On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
>>  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
>>  tip file would have `N` commits and the previous tip has `M` commits and
> 
>> -               OPT_BOOL(0, "split", &opts.split,
>> -                       N_("allow writing an incremental commit-graph file")),
>> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
>> +                       N_("allow writing an incremental commit-graph file"),
>> +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
>> +                       write_option_parse_split),
> 
> 
> I keep getting back to this -- sorry! So this actually forbids
> "--no-split", which used to work before. Unfortunate?

That certainly is unfortunate. Hopefully no one depends on this;
`--no-split` only means something if a `--split` appeared earlier in
the command-line arguments.

> I have to ask, what is the long-term plan for the two formats (split and
> non-split)? As I understand it, and I might well be wrong, the non-split
> format came first and the split format was a user-experience
> improvement. Should we expect that `--split` becomes the default?

In some ways, the split is now the default because that is how it is
written during 'git fetch' using fetch.writeCommitGraph. However, I
don't think that it will ever become the default for the commit-graph
builtin.

> In
> which case `--no-split` would be needed. Or might the non-split format
> go away entirely, leaving `--split` a no-op and `--split=<strategy>` a
> pretty funky way of choosing a strategy for the one-and-only file
> format?

In some ways, the --split=merge-all is similar, except it writes a one-line
commit-graph-chain file and puts a .graph file in
.git/objects/info/commit-graphs instead of writing to .git/objects/commit-graph.

> To try to be concrete, here's a suggestion: `--format=split` and
> `--split-strategy=<strategy>`.

Why --format=split instead of leaving it as --[no-]split? Is there a reason to
introduce this string-based option when there are only two options right now?

Perhaps using --split-strategy=<strategy> is the most backwards-compatible
option, especially because we won't need --split="" to substitute for
"auto-merge". However, I wonder if this is a case where we should make the
hard choice to sacrifice a narrow backwards-compatibility in favor of a
simplified set of options?

Thanks,
-Stolee


* Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-06 19:15       ` Martin Ågren
@ 2020-02-09 23:27         ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-09 23:27 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Thu, Feb 06, 2020 at 08:15:03PM +0100, Martin Ågren wrote:
> On Tue, 4 Feb 2020 at 05:06, Taylor Blau <me@ttaylorr.com> wrote:
> >
> > > Or can you convince me otherwise? From which angle should I look at
> > > this?
> >
> > Heh. This is all very reminiscent of an off-list discussion that I had
> > with Peff and Stolee before sending this upstream. Originally, I had
> > implemented this as:
> >
> >   $ git commit-graph write --split --[no-]merge
> >
> > but we decided that this '--merge' and '--no-merge' requiring '--split'
> > seemed to indicate that this was better off as an argument to '--split'.
> > Of course, there's no getting around that it is... odd to say
> > '--split=no-merge' for exactly the reason you suggest.
> >
> > Here's another way of looking at it: the presence of '--split' means
> > "work with split graph files" and the '[no-]merge' argument means:
> > "always/never condense multiple layers".
> >
> > For me, this not only makes the new option language jive, but makes it
> > clearer to me than the combination of '--split', '--split --no-merge'
> > and '--split --merge', where the third one is truly bizarre. At least
> > condensing the second '--' and making 'merge' an argument to 'split'
> > makes it clear that the two work together somehow.
>
> Yes, "--split --merge" sounds no better. :-)

:-).

> > If you have a different suggestion, I'd certainly love to hear about it
> > and discuss. But, at least as far as our internal discussions have gone,
> > this is by far the best option that we have been able to come up with.
>
> I can't come up with anything better, so please feel free to carry on
> (as you already have).

Sounds good. It looks like you might have had some further thoughts a
little bit lower down in the thread, so I'll respond to those shortly
just to make sure that I didn't miss anything before readying a 'v3' for
submission.

>
> > > > -               OPT_BOOL(0, "split", &opts.split,
> > > > -                       N_("allow writing an incremental commit-graph file")),
> > > > +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> > > > +                       N_("allow writing an incremental commit-graph file"),
> > >
> > > This still sounds very boolean. Cramming in the "strategy" might be hard
> > > -- is this an argument in favor of having two separate options? ;-)
> >
> > Heh. Exactly how we started these patches when I originally wrote
> > them...
>
> You left this as-is in v2. I don't have any immediate improvements to
> offer. I could see shortening the original to "use the 'split' file
> format", in which case maybe one could allude to a strategy here. (I
> don't think "allow" is really needed, right? Maybe it tries to cover for
> the situation where there's no commit graph yet, so you might say we
> wouldn't write an "incremental" one, but the format would still be the
> same, AFAIU. Anyway, that's outside the scope of this patch.)

Yeah, I agree that the use of "allow" is a little funny for these
reasons. That said, I don't think that it's in dire need of changing,
and so since we agree that it's outside the scope of this series, I'm
happy to ignore it for now.

> Martin

Thanks,
Taylor


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-06 19:41     ` Martin Ågren
  2020-02-07 15:48       ` Derrick Stolee
@ 2020-02-09 23:30       ` Taylor Blau
  1 sibling, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-09 23:30 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Thu, Feb 06, 2020 at 08:41:28PM +0100, Martin Ågren wrote:
> On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> > index 28d1fee505..b7fe65ef21 100644
> > --- a/Documentation/git-commit-graph.txt
> > +++ b/Documentation/git-commit-graph.txt
> > @@ -57,11 +57,17 @@ or `--stdin-packs`.)
> >  With the `--append` option, include all commits that are present in the
> >  existing commit-graph file.
> >  +
> > -With the `--split` option, write the commit-graph as a chain of multiple
> > -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> > -not already in the commit-graph are added in a new "tip" file. This file
> > -is merged with the existing file if the following merge conditions are
> > -met:
> > +With the `--split[=<strategy>]` option, write the commit-graph as a
> > +chain of multiple commit-graph files stored in
> > +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> > +strategy and other splitting options. The new commits not already in the
> > +commit-graph are added in a new "tip" file. This file is merged with the
> > +existing file if the following merge conditions are met:
>
> Please add a lone "+" here.

Sure, thanks for noticing.

> > +* If `--split=merge-always` is specified, then a merge is always
> > +conducted, and the remaining options are ignored. Conversely, if
> > +`--split=no-merge` is specified, a merge is never performed, and the
> > +remaining options are ignored. A bare `--split` defers to the remaining
> > +options.
> >  +
>
> Similar to this existing one here. There's some minor misrendering here
> otherwise.
>
> >  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
> >  tip file would have `N` commits and the previous tip has `M` commits and
>
> > -               OPT_BOOL(0, "split", &opts.split,
> > -                       N_("allow writing an incremental commit-graph file")),
> > +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> > +                       N_("allow writing an incremental commit-graph file"),
> > +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> > +                       write_option_parse_split),
>
>
> I keep getting back to this -- sorry! So this actually forbids
> "--no-split", which used to work before. Unfortunate?

Ah, I see. Yes, this definitely *does* forbid that. My thinking when I
decided to give this 'PARSE_OPT_NONEG' was that '--no-split' is
confusing for users: does it mean "don't split" or "unset any split
options"?

This probably would be ameliorated by your suggestion below, maybe of
'--split-strategy', specifically (I could probably go either way on
'--format=split', but it really depends on what Stolee has planned
long-term). Then, '--[no-]split' remains clear, as does
'--no-split-strategy' (although I suppose that you could make the
argument that '--no-split-strategy' sounds a little bit like letting the
machinery use its defaults, which may or may not be true depending on
how it's implemented.)

But, I'm not sure that it's all worth it to add another option here.
This sub-builtin has a plethora of options already, and I'm skeptical
that there are a lot of real-world uses of '--no-split' in the wild that
we'd be breaking.
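
To make the two readings of '--no-split' concrete, here is a minimal
standalone sketch of a callback that picks the "don't split" reading. It
is not written against the real parse-options API: the enum, its values,
and the function name are all made up for illustration, and only the
'arg'/'unset' convention is borrowed from the callback in the patch.

```c
#include <string.h>

/* Hypothetical result type; none of these names exist in Git. */
enum split_mode {
	SPLIT_OFF,        /* no '--split', or a negated '--no-split' */
	SPLIT_MERGE_AUTO, /* bare '--split' */
	SPLIT_MERGE_ALL,  /* '--split=merge-all' */
	SPLIT_NO_MERGE,   /* '--split=no-merge' */
	SPLIT_INVALID     /* unrecognized argument */
};

/*
 * 'arg' is the text after '=', or NULL when none was given.
 * 'unset' is non-zero when the option was negated.
 */
static enum split_mode parse_split_arg(const char *arg, int unset)
{
	if (unset)
		return SPLIT_OFF; /* "don't split" reading of '--no-split' */
	if (!arg)
		return SPLIT_MERGE_AUTO;
	if (!strcmp(arg, "merge-all"))
		return SPLIT_MERGE_ALL;
	if (!strcmp(arg, "no-merge"))
		return SPLIT_NO_MERGE;
	return SPLIT_INVALID;
}
```

Under the other reading ("unset any split options"), the 'unset' branch
would return SPLIT_MERGE_AUTO instead, which is exactly the ambiguity
that PARSE_OPT_NONEG sidesteps.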

> I have to ask, what is the long-term plan for the two formats (split and
> non-split)? As I understand it, and I might well be wrong, the non-split
> format came first and the split format was a user-experience
> improvement. Should we expect that `--split` becomes the default? In
> which case `--no-split` would be needed. Or might the non-split format
> go away entirely, leaving `--split` a no-op and `--split=<strategy>` a
> pretty funky way of choosing a strategy for the one-and-only file
> format?
>
> To try to be concrete, here's a suggestion: `--format=split` and
> `--split-strategy=<strategy>`.
>
> Martin

Thanks,
Taylor


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-07 15:48       ` Derrick Stolee
@ 2020-02-09 23:32         ` Taylor Blau
  2020-02-12  6:03         ` Martin Ågren
  1 sibling, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-09 23:32 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Martin Ågren, Taylor Blau, Git Mailing List, Jeff King,
	Derrick Stolee, Junio C Hamano

On Fri, Feb 07, 2020 at 10:48:39AM -0500, Derrick Stolee wrote:
> On 2/6/2020 2:41 PM, Martin Ågren wrote:
> > On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> >>  * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
> >>  tip file would have `N` commits and the previous tip has `M` commits and
> >
> >> -               OPT_BOOL(0, "split", &opts.split,
> >> -                       N_("allow writing an incremental commit-graph file")),
> >> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> >> +                       N_("allow writing an incremental commit-graph file"),
> >> +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> >> +                       write_option_parse_split),
> >
> >
> > I keep getting back to this -- sorry! So this actually forbids
> > "--no-split", which used to work before. Unfortunate?
>
> That certainly is unfortunate. Hopefully no one is taking a dependence on
> this, which only means something if they had a `--split` previously in
> the command-line arguments.
>
> > I have to ask, what is the long-term plan for the two formats (split and
> > non-split)? As I understand it, and I might well be wrong, the non-split
> > format came first and the split format was a user-experience
> > improvement. Should we expect that `--split` becomes the default?
>
> In some ways, the split is now the default because that is how it is
> written during 'git fetch' using fetch.writeCommitGraph. However, I
> don't think that it will ever become the default for the commit-graph
> builtin.
>
> > In
> > which case `--no-split` would be needed. Or might the non-split format
> > go away entirely, leaving `--split` a no-op and `--split=<strategy>` a
> > pretty funky way of choosing a strategy for the one-and-only file
> > format?
>
> In some ways, the --split=merge-all is similar, except it writes a one-line
> commit-graph-chain file and puts a .graph file in
> .git/objects/info/commit-graphs instead of writing to .git/objects/commit-graph.
>
> > To try to be concrete, here's a suggestion: `--format=split` and
> > `--split-strategy=<strategy>`.
>
> Why --format=split instead of leaving it as --[no-]split? Is there a reason to
> introduce this string-based option when there are only two options right now?
>
> Perhaps using --split-strategy=<strategy> is the most backwards-compatible
> option, especially because we won't need --split="" to substitute for
> "auto-merge". However, I wonder if this is a case where we should make the
> hard choice to sacrifice a narrow backwards-compatibility in favor of a
> simplified set of options?

My preference would be the latter, which I vaguely indicated in my last
email to Martin. Like I said, I think that the number of hypothetical
cases that we're breaking is pretty small, if not zero, and so I don't
feel too worried about changing the behavior like this.

If others feel strongly that keeping '--no-split' functional in the
classical sense is worthwhile, then I'm certainly happy to introduce
'--split-strategy' as another option, but I think that we agree that the
simplicity is worth the tradeoff here.

> Thanks,
> -Stolee

Thanks,
Taylor


* Re: [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-06 19:50     ` Martin Ågren
@ 2020-02-09 23:32       ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-09 23:32 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Thu, Feb 06, 2020 at 08:50:57PM +0100, Martin Ågren wrote:
> On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> > @@ -57,6 +57,12 @@ walking commits starting at all refs. (Cannot be combined with
> >  With the `--input=append` option, include all commits that are present
> >  in the existing commit-graph file.
> >  +
> > +With the `--input=none` option, behave as if `--input=append` were
> > +given, but do not walk other packs to find additional commits.
> > +
>
> Similar to my comment in patch 1/3. Please add a "+" instead of an empty
> line. This one actually trips up the rendering quite a bit of a lot of
> what follows.

Thanks again :-).

> > +If none of the above options are given, then generate the new
> > +commit-graph by walking over all pack-indexes.
> > ++
>
> This one's good.
>
> Martin

Thanks,
Taylor


* [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options
  2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
                   ` (6 preceding siblings ...)
  2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
@ 2020-02-12  5:47 ` " Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
                     ` (4 more replies)
  7 siblings, 5 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-12  5:47 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

Hi everybody,

Attached is what I anticipate/hope to be the final reroll of my series
to add new arguments to the 'git commit-graph write --split' flag to
allow callers to force the commit-graph machinery to merge multiple
commit-graph layers, or to prohibit it from doing so.

I was keeping an eye out for more discussion about whether or not these
flags were acceptable to reviewers. Martin Ågren and Derrick Stolee have
both chimed in that they seem OK.

Since there hasn't been much more discussion in this thread, I replayed
this series on top of 'tb/commit-graph-use-odb' (which was itself
rebased on 'master'). I picked up a couple of AsciiDoc changes along the
way, and a range-diff is included below.

Thanks again.

Taylor Blau (3):
  builtin/commit-graph.c: support '--split[=<strategy>]'
  builtin/commit-graph.c: introduce '--input=<source>'
  builtin/commit-graph.c: support '--input=none'

 Documentation/git-commit-graph.txt |  51 ++++++++-----
 builtin/commit-graph.c             | 115 +++++++++++++++++++++++------
 commit-graph.c                     |  28 ++++---
 commit-graph.h                     |  10 ++-
 t/t5318-commit-graph.sh            |   4 +-
 t/t5324-split-commit-graph.sh      |  53 ++++++++++++-
 6 files changed, 205 insertions(+), 56 deletions(-)

Range-diff against v2:
1:  6428dac6e5 ! 1:  e1635a0e34 builtin/commit-graph.c: support '--split[=<strategy>]'
    @@ Documentation/git-commit-graph.txt: or `--stdin-packs`.)
     +strategy and other splitting options. The new commits not already in the
     +commit-graph are added in a new "tip" file. This file is merged with the
     +existing file if the following merge conditions are met:
    +++
     +* If `--split=merge-always` is specified, then a merge is always
     +conducted, and the remaining options are ignored. Conversely, if
     +`--split=no-merge` is specified, a merge is never performed, and the
2:  c7ba70e19d = 2:  655fe63076 builtin/commit-graph.c: introduce '--input=<source>'
3:  7d6a608acd ! 3:  4e85c6f7e4 builtin/commit-graph.c: support '--input=none'
    @@ Documentation/git-commit-graph.txt: walking commits starting at all refs. (Canno
      +
     +With the `--input=none` option, behave as if `--input=append` were
     +given, but do not walk other packs to find additional commits.
    -+
    +++
     +If none of the above options are given, then generate the new
     +commit-graph by walking over all pack-indexes.
     ++
--
2.25.0.119.gaa12b7378b


* [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
@ 2020-02-12  5:47   ` Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-12  5:47 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.

On occasion, callers may want to ensure that 'git commit-graph write
--split' always writes an incremental, and never spends effort
condensing the incremental chain [1]. Previously, this was possible by
passing '--size-multiple=0', but this is no longer the case following
63020f175f (commit-graph: prefer default size_mult when given zero,
2020-01-02).

Reintroduce a less-magical variant of the above with a new pair of
arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
'--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain and will always write a new incremental.
Conversely, if '--split=merge-all' is given, any invocation including it
will always condense a chain if one exists.  If '--split' is given with
no arguments, it behaves as before and defers to '--size-multiple', and
so on.
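
As a rough illustration of the three behaviors described above, the
following standalone sketch models the chain as an array of per-layer
commit counts. It is a simplification under stated assumptions: the
type, enum values, and function name are hypothetical, the
object-directory check from the real loop is left out, and only the
merge condition itself is modeled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three strategies. */
enum split_strategy {
	SPLIT_MERGE_AUTO,      /* bare '--split' */
	SPLIT_MERGE_REQUIRED,  /* '--split=merge-all' */
	SPLIT_MERGE_PROHIBITED /* '--split=no-merge' */
};

/*
 * layers[0] is the base and layers[nr - 1] the current tip; each entry
 * is the number of commits in that layer. Returns the number of layers
 * in the chain after writing 'new_commits' commits.
 */
static size_t layers_after_write(const size_t *layers, size_t nr,
				 size_t new_commits, size_t size_mult,
				 size_t max_commits,
				 enum split_strategy strategy)
{
	size_t after = nr + 1; /* assume a new tip layer is added */
	size_t num_commits = new_commits;

	if (strategy == SPLIT_MERGE_PROHIBITED)
		return after;

	/* Fold existing layers into the new one, tip first. */
	while (nr > 0 &&
	       (layers[nr - 1] <= size_mult * num_commits ||
		(max_commits && num_commits > max_commits) ||
		strategy == SPLIT_MERGE_REQUIRED)) {
		num_commits += layers[nr - 1];
		nr--;
		after--;
	}
	return after;
}
```

For a chain of {100, 10} commits and a size multiple of 2, writing 6 new
commits with the automatic strategy folds the 10-commit tip into the new
layer (10 <= 2 * 6) but leaves the 100-commit base alone, so two layers
remain. The required strategy collapses the same chain to one layer, and
the prohibited strategy always yields three.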

[1]: This might occur when, for example, a server administrator who runs
some program after each push wants to ensure that each job runs in time
proportional to the size of the push, and does not "jump" when the
commit-graph machinery decides to trigger a merge.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 17 ++++++++++-----
 builtin/commit-graph.c             | 35 ++++++++++++++++++++++++++----
 commit-graph.c                     | 22 ++++++++++++-------
 commit-graph.h                     |  7 ++++++
 t/t5324-split-commit-graph.sh      | 25 +++++++++++++++++++++
 5 files changed, 89 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 28d1fee505..269c355b0a 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -57,11 +57,18 @@ or `--stdin-packs`.)
 With the `--append` option, include all commits that are present in the
 existing commit-graph file.
 +
-With the `--split` option, write the commit-graph as a chain of multiple
-commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
-not already in the commit-graph are added in a new "tip" file. This file
-is merged with the existing file if the following merge conditions are
-met:
+With the `--split[=<strategy>]` option, write the commit-graph as a
+chain of multiple commit-graph files stored in
+`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
+strategy and other splitting options. The new commits not already in the
+commit-graph are added in a new "tip" file. This file is merged with the
+existing file if the following merge conditions are met:
++
+* If `--split=merge-all` is specified, then a merge is always
+conducted, and the remaining options are ignored. Conversely, if
+`--split=no-merge` is specified, a merge is never performed, and the
+remaining options are ignored. A bare `--split` defers to the remaining
+options.
 +
 * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new
 tip file would have `N` commits and the previous tip has `M` commits and
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 4a70b33fb5..4d3c1c46c2 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -9,7 +9,9 @@
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -19,7 +21,9 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] [--[no-]progress] <split options>"),
+	N_("git commit-graph write [--object-dir <objdir>] [--append] "
+	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
@@ -111,6 +115,27 @@ static int graph_verify(int argc, const char **argv)
 extern int read_replace_refs;
 static struct split_commit_graph_opts split_opts;
 
+static int write_option_parse_split(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum commit_graph_split_flags *flags = opt->value;
+
+	opts.split = 1;
+	if (!arg) {
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
+		return 0;
+	}
+
+	if (!strcmp(arg, "merge-all"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_REQUIRED;
+	else if (!strcmp(arg, "no-merge"))
+		*flags = COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED;
+	else
+		die(_("unrecognized --split argument, %s"), arg);
+
+	return 0;
+}
+
 static int graph_write(int argc, const char **argv)
 {
 	struct string_list *pack_indexes = NULL;
@@ -133,8 +158,10 @@ static int graph_write(int argc, const char **argv)
 		OPT_BOOL(0, "append", &opts.append,
 			N_("include all commits already in the commit-graph file")),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
-		OPT_BOOL(0, "split", &opts.split,
-			N_("allow writing an incremental commit-graph file")),
+		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
+			N_("allow writing an incremental commit-graph file"),
+			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
+			write_option_parse_split),
 		OPT_INTEGER(0, "max-commits", &split_opts.max_commits,
 			N_("maximum number of commits in a non-base split commit-graph")),
 		OPT_INTEGER(0, "size-multiple", &split_opts.size_multiple,
diff --git a/commit-graph.c b/commit-graph.c
index 656dd647d5..3a5cb23cd7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1533,27 +1533,33 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 
 	int max_commits = 0;
 	int size_mult = 2;
+	enum commit_graph_split_flags flags = COMMIT_GRAPH_SPLIT_MERGE_AUTO;
 
 	if (ctx->split_opts) {
 		max_commits = ctx->split_opts->max_commits;
 
 		if (ctx->split_opts->size_multiple)
 			size_mult = ctx->split_opts->size_multiple;
+
+		flags = ctx->split_opts->flags;
 	}
 
 	g = ctx->r->objects->commit_graph;
 	num_commits = ctx->commits.nr;
 	ctx->num_commit_graphs_after = ctx->num_commit_graphs_before + 1;
 
-	while (g && (g->num_commits <= size_mult * num_commits ||
-		    (max_commits && num_commits > max_commits))) {
-		if (g->odb != ctx->odb)
-			break;
+	if (flags != COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED) {
+		while (g && (g->num_commits <= size_mult * num_commits ||
+			    (max_commits && num_commits > max_commits) ||
+			    (flags == COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))) {
+			if (g->odb != ctx->odb)
+				break;
 
-		num_commits += g->num_commits;
-		g = g->base_graph;
+			num_commits += g->num_commits;
+			g = g->base_graph;
 
-		ctx->num_commit_graphs_after--;
+			ctx->num_commit_graphs_after--;
+		}
 	}
 
 	ctx->new_base_graph = g;
@@ -1861,7 +1867,7 @@ int write_commit_graph(struct object_directory *odb,
 		goto cleanup;
 	}
 
-	if (!ctx->commits.nr)
+	if (!ctx->commits.nr && (!ctx->split_opts || ctx->split_opts->flags != COMMIT_GRAPH_SPLIT_MERGE_REQUIRED))
 		goto cleanup;
 
 	if (ctx->split) {
diff --git a/commit-graph.h b/commit-graph.h
index e87a6f6360..65a7d2edae 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -82,10 +82,17 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
 };
 
+enum commit_graph_split_flags {
+	COMMIT_GRAPH_SPLIT_MERGE_AUTO       = 0,
+	COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
+	COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
+};
+
 struct split_commit_graph_opts {
 	int size_multiple;
 	int max_commits;
 	timestamp_t expire_time;
+	enum commit_graph_split_flags flags;
 };
 
 /*
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 53b2e6b455..bb2b724178 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -351,4 +351,29 @@ test_expect_success 'split across alternate where alternate is not split' '
 	test_cmp commit-graph .git/objects/info/commit-graph
 '
 
+test_expect_success '--split=merge-all always merges incrementals' '
+	test_when_finished rm -rf a b c &&
+	rm -rf $graphdir $infodir/commit-graph &&
+	git reset --hard commits/10 &&
+	git rev-list -3 HEAD~4 >a &&
+	git rev-list -2 HEAD~2 >b &&
+	git rev-list -2 HEAD >c &&
+	git commit-graph write --split=no-merge --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --stdin-commits <c &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
+test_expect_success '--split=no-merge always writes an incremental' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list HEAD~1 >a &&
+	git rev-list HEAD >b &&
+	git commit-graph write --split --stdin-commits <a &&
+	git commit-graph write --split=no-merge --stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.119.gaa12b7378b



* [PATCH v3 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
@ 2020-02-12  5:47   ` Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-12  5:47 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

The 'write' mode of the 'commit-graph' supports input from a number of
different sources: pack indexes over stdin, commits over stdin, commits
reachable from all references, and so on. Each of these options are
specified with a unique option: '--stdin-packs', '--stdin-commits', etc.

Similar to our replacement of 'git config [--<type>]' with 'git config
[--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
`--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
deprecate '[--<input>]' in favor of '[--input=<source>]'.

This makes it clearer how to implement new options that are combinations
of other options (for example, "none", a combination of the old
"--append" and a new sentinel that says _not_ to look in other packs,
which we will implement in a future patch).

Unfortunately, the new enumerated type is a bitfield, even though it
makes much more sense as '0, 1, 2, ...'. Even though *almost* all
options are pairwise exclusive, '--stdin-{packs,commits}' *is*
compatible with '--append'. For this reason, use a bitfield.
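
To illustrate the two points above, here is a minimal, self-contained
sketch (not the code in this patch). It assumes a hypothetical
parse_input_arg() helper and lookup table standing in for the
parse-options machinery; only the enum values and the exclusivity test
are taken from the patch itself:

```c
#include <assert.h>
#include <string.h>

/* Same values as the enum added by this patch. */
enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
};

/* Hypothetical lookup table; the patch uses an if/else chain instead. */
static const struct {
	const char *name;
	unsigned bit;
} input_sources[] = {
	{ "reachable",     COMMIT_GRAPH_INPUT_REACHABLE },
	{ "stdin-packs",   COMMIT_GRAPH_INPUT_STDIN_PACKS },
	{ "stdin-commits", COMMIT_GRAPH_INPUT_STDIN_COMMITS },
	{ "append",        COMMIT_GRAPH_INPUT_APPEND }
};

/*
 * Hypothetical stand-in for the parse-options machinery: accept either
 * "--input=<source>" or the legacy "--<source>" and set the same bit.
 * Returns 0 on success, -1 if 'arg' names no known source.
 */
static int parse_input_arg(unsigned *input, const char *arg)
{
	const char *name;
	size_t i;

	if (!strncmp(arg, "--input=", 8))
		name = arg + 8;
	else if (!strncmp(arg, "--", 2))
		name = arg + 2;
	else
		return -1;

	for (i = 0; i < sizeof(input_sources) / sizeof(input_sources[0]); i++) {
		if (!strcmp(name, input_sources[i].name)) {
			*input |= input_sources[i].bit;
			return 0;
		}
	}
	return -1;
}

/* Same test as the patch: at most one of the three exclusive sources. */
static int input_is_valid(unsigned input)
{
	return (!!(input & COMMIT_GRAPH_INPUT_REACHABLE) +
		!!(input & COMMIT_GRAPH_INPUT_STDIN_PACKS) +
		!!(input & COMMIT_GRAPH_INPUT_STDIN_COMMITS)) <= 1;
}

/* Usage sketch for the combinations discussed above. */
static void check_input_examples(void)
{
	unsigned old_style = 0, new_style = 0;
	int err = 0;

	err |= parse_input_arg(&old_style, "--stdin-commits");
	err |= parse_input_arg(&old_style, "--append");
	err |= parse_input_arg(&new_style, "--input=stdin-commits");
	err |= parse_input_arg(&new_style, "--input=append");
	assert(!err);

	/* Both spellings set the same bits. */
	assert(old_style == new_style);
	/* One exclusive source plus append is accepted... */
	assert(input_is_valid(new_style));
	/* ...but two exclusive sources are not. */
	assert(!input_is_valid(COMMIT_GRAPH_INPUT_REACHABLE |
			       COMMIT_GRAPH_INPUT_STDIN_PACKS));
}
```

With a plain '0, 1, 2, ...' enumeration, the "stdin-commits plus append"
case in check_input_examples() could not be represented in one value,
which is the trade-off described above.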

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt | 26 +++++-----
 builtin/commit-graph.c             | 83 +++++++++++++++++++++---------
 t/t5318-commit-graph.sh            |  4 +-
 t/t5324-split-commit-graph.sh      |  2 +-
 4 files changed, 76 insertions(+), 39 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 269c355b0a..0a320cccdd 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -41,21 +41,21 @@ COMMANDS
 
 Write a commit-graph file based on the commits found in packfiles.
 +
-With the `--stdin-packs` option, generate the new commit graph by
+With the `--input=stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
-with `--stdin-commits` or `--reachable`.)
+with `--input=stdin-commits` or `--input=reachable`.)
 +
-With the `--stdin-commits` option, generate the new commit graph by
-walking commits starting at the commits specified in stdin as a list
+With the `--input=stdin-commits` option, generate the new commit graph
+by walking commits starting at the commits specified in stdin as a list
 of OIDs in hex, one OID per line. (Cannot be combined with
-`--stdin-packs` or `--reachable`.)
+`--input=stdin-packs` or `--input=reachable`.)
 +
-With the `--reachable` option, generate the new commit graph by walking
-commits starting at all refs. (Cannot be combined with `--stdin-commits`
-or `--stdin-packs`.)
+With the `--input=reachable` option, generate the new commit graph by
+walking commits starting at all refs. (Cannot be combined with
+`--input=stdin-commits` or `--input=stdin-packs`.)
 +
-With the `--append` option, include all commits that are present in the
-existing commit-graph file.
+With the `--input=append` option, include all commits that are present
+in the existing commit-graph file.
 +
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
@@ -106,20 +106,20 @@ $ git commit-graph write
   using commits in `<pack-index>`.
 +
 ------------------------------------------------
-$ echo <pack-index> | git commit-graph write --stdin-packs
+$ echo <pack-index> | git commit-graph write --input=stdin-packs
 ------------------------------------------------
 
 * Write a commit-graph file containing all reachable commits.
 +
 ------------------------------------------------
-$ git show-ref -s | git commit-graph write --stdin-commits
+$ git show-ref -s | git commit-graph write --input=stdin-commits
 ------------------------------------------------
 
 * Write a commit-graph file containing all commits in the current
   commit-graph file along with those reachable from `HEAD`.
 +
 ------------------------------------------------
-$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
+$ git rev-parse HEAD | git commit-graph write --input=stdin-commits --input=append
 ------------------------------------------------
 
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 4d3c1c46c2..0ff25896d0 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -9,8 +9,9 @@
 
 static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	N_("git commit-graph write [--object-dir <objdir>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -21,18 +22,23 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-	N_("git commit-graph write [--object-dir <objdir>] [--append] "
-	   "[--split[=<strategy>]] [--reachable|--stdin-packs|--stdin-commits] "
+	N_("git commit-graph write [--object-dir <objdir>] "
+	   "[--split[=<strategy>]] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
 
+enum commit_graph_input {
+	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
+	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
+	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+};
+
 static struct opts_commit_graph {
 	const char *obj_dir;
-	int reachable;
-	int stdin_packs;
-	int stdin_commits;
-	int append;
+	enum commit_graph_input input;
 	int split;
 	int shallow;
 	int progress;
@@ -57,6 +63,28 @@ static struct object_directory *find_odb(struct repository *r,
 	return odb;
 }
 
+static int option_parse_input(const struct option *opt, const char *arg,
+			      int unset)
+{
+	enum commit_graph_input *to = opt->value;
+	if (unset || !strcmp(arg, "packs")) {
+		*to = 0;
+		return 0;
+	}
+
+	if (!strcmp(arg, "reachable"))
+		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
+	else if (!strcmp(arg, "stdin-packs"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
+	else if (!strcmp(arg, "stdin-commits"))
+		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
+	else if (!strcmp(arg, "append"))
+		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else
+		die(_("unrecognized --input source, %s"), arg);
+	return 0;
+}
+
 static int graph_verify(int argc, const char **argv)
 {
 	struct commit_graph *graph = NULL;
@@ -149,14 +177,21 @@ static int graph_write(int argc, const char **argv)
 		OPT_STRING(0, "object-dir", &opts.obj_dir,
 			N_("dir"),
 			N_("The object directory to store the graph")),
-		OPT_BOOL(0, "reachable", &opts.reachable,
-			N_("start walk at all refs")),
-		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
-			N_("scan pack-indexes listed by stdin for commits")),
-		OPT_BOOL(0, "stdin-commits", &opts.stdin_commits,
-			N_("start walk at commits listed by stdin")),
-		OPT_BOOL(0, "append", &opts.append,
-			N_("include all commits already in the commit-graph file")),
+		OPT_CALLBACK(0, "input", &opts.input, NULL,
+			N_("include commits from this source in the graph"),
+			option_parse_input),
+		OPT_BIT(0, "reachable", &opts.input,
+			N_("start walk at all refs"),
+			COMMIT_GRAPH_INPUT_REACHABLE),
+		OPT_BIT(0, "stdin-packs", &opts.input,
+			N_("scan pack-indexes listed by stdin for commits"),
+			COMMIT_GRAPH_INPUT_STDIN_PACKS),
+		OPT_BIT(0, "stdin-commits", &opts.input,
+			N_("start walk at commits listed by stdin"),
+			COMMIT_GRAPH_INPUT_STDIN_COMMITS),
+		OPT_BIT(0, "append", &opts.input,
+			N_("include all commits already in the commit-graph file"),
+			COMMIT_GRAPH_INPUT_APPEND),
 		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
 		OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
 			N_("allow writing an incremental commit-graph file"),
@@ -182,11 +217,13 @@ static int graph_write(int argc, const char **argv)
 			     builtin_commit_graph_write_options,
 			     builtin_commit_graph_write_usage, 0);
 
-	if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1)
-		die(_("use at most one of --reachable, --stdin-commits, or --stdin-packs"));
+	if ((!!(opts.input & COMMIT_GRAPH_INPUT_REACHABLE) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS) +
+	     !!(opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS)) > 1)
+		die(_("use at most one of --input=reachable, --input=stdin-commits, or --input=stdin-packs"));
 	if (!opts.obj_dir)
 		opts.obj_dir = get_object_directory();
-	if (opts.append)
+	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
@@ -196,22 +233,22 @@ static int graph_write(int argc, const char **argv)
 	read_replace_refs = 0;
 	odb = find_odb(the_repository, opts.obj_dir);
 
-	if (opts.reachable) {
+	if (opts.input & COMMIT_GRAPH_INPUT_REACHABLE) {
 		if (write_commit_graph_reachable(odb, flags, &split_opts))
 			return 1;
 		return 0;
 	}
 
 	string_list_init(&lines, 0);
-	if (opts.stdin_packs || opts.stdin_commits) {
+	if (opts.input & (COMMIT_GRAPH_INPUT_STDIN_PACKS | COMMIT_GRAPH_INPUT_STDIN_COMMITS)) {
 		struct strbuf buf = STRBUF_INIT;
 
 		while (strbuf_getline(&buf, stdin) != EOF)
 			string_list_append(&lines, strbuf_detach(&buf, NULL));
 
-		if (opts.stdin_packs)
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_PACKS)
 			pack_indexes = &lines;
-		if (opts.stdin_commits) {
+		if (opts.input & COMMIT_GRAPH_INPUT_STDIN_COMMITS) {
 			commit_hex = &lines;
 			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
 		}
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 07b3207595..1aa81d2383 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -227,7 +227,7 @@ graph_git_behavior 'cleared graph, commit 8 vs merge 2' full commits/8 merge/2
 
 test_expect_success 'build graph from latest pack with closure' '
 	cd "$TRASH_DIRECTORY/full" &&
-	cat new-idx | git commit-graph write --stdin-packs &&
+	cat new-idx | git commit-graph write --input=stdin-packs &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "9" "extra_edges"
 '
@@ -240,7 +240,7 @@ test_expect_success 'build graph from commits with closure' '
 	git tag -a -m "merge" tag/merge merge/2 &&
 	git rev-parse tag/merge >commits-in &&
 	git rev-parse merge/1 >>commits-in &&
-	cat commits-in | git commit-graph write --stdin-commits &&
+	cat commits-in | git commit-graph write --input=stdin-commits &&
 	test_path_is_file $objdir/info/commit-graph &&
 	graph_read_expect "6"
 '
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index bb2b724178..6894106727 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -42,7 +42,7 @@ test_expect_success 'create commits and write commit-graph' '
 		test_commit $i &&
 		git branch commits/$i || return 1
 	done &&
-	git commit-graph write --reachable &&
+	git commit-graph write --input=reachable &&
 	test_path_is_file $infodir/commit-graph &&
 	graph_read_expect 3
 '
-- 
2.25.0.119.gaa12b7378b



* [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
  2020-02-12  5:47   ` [PATCH v3 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
@ 2020-02-12  5:47   ` Taylor Blau
  2020-02-13 11:39     ` SZEDER Gábor
  2020-02-13 12:31     ` SZEDER Gábor
  2020-02-12 18:19   ` [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
  2020-02-17 18:24   ` Martin Ågren
  4 siblings, 2 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-12  5:47 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee, gitster, martin.agren

In an earlier commit in this series, we introduced
'--split=<no-merge|merge-all>', and alluded to the fact that
'--split=merge-all' would be useful for callers who wish to always
trigger a merge of an incremental chain.

There is a problem with the above approach, which is that there is no
way to specify to the commit-graph builtin that a caller only wants to
include commits already in the graph. One can specify '--input=append'
to include all commits in the existing graphs, but the absence of
'--input=stdin-{commits,packs}' causes the builtin to call
'fill_oids_from_all_packs()'.

Passing '--input=reachable' (as in 'git commit-graph write
--split=merge-all --input=reachable --input=append') works around this
issue by making '--input=reachable' effectively a no-op, but this can be
prohibitively expensive in large repositories, making it an undesirable
choice for some users.

Teach '--input=none' as an option to behave as if '--input=append' were
given, but to consider no other sources in addition.

This, in conjunction with the '--split=merge-all' option introduced
earlier in this series, offers a convenient way to force the
commit-graph machinery to condense a chain of incrementals without
requiring any new commits:

  $ git commit-graph write --split=merge-all --input=none
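
As a rough sketch of the behavior described above (a simplification,
not the patch itself): parse_input_source() and should_scan_all_packs()
are hypothetical names standing in for option_parse_input() and the
guard in front of fill_oids_from_all_packs(). The sketch assumes, as
the workaround mentioned above implies, that '--input=reachable'
supplies its own commits and therefore skips the pack scan:

```c
#include <assert.h>
#include <string.h>

enum commit_graph_input {
	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
	COMMIT_GRAPH_INPUT_NONE          = (1 << 5)
};

/*
 * Simplified stand-in for option_parse_input(). Returns -1 for an
 * unknown source, where the real callback would die().
 */
static int parse_input_source(unsigned *to, const char *arg)
{
	if (!strcmp(arg, "reachable"))
		*to |= COMMIT_GRAPH_INPUT_REACHABLE;
	else if (!strcmp(arg, "stdin-packs"))
		*to |= COMMIT_GRAPH_INPUT_STDIN_PACKS;
	else if (!strcmp(arg, "stdin-commits"))
		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
	else if (!strcmp(arg, "append"))
		*to |= COMMIT_GRAPH_INPUT_APPEND;
	else if (!strcmp(arg, "none"))
		/* "none" behaves like "append" and suppresses the pack scan */
		*to |= (COMMIT_GRAPH_INPUT_APPEND | COMMIT_GRAPH_INPUT_NONE);
	else
		return -1;
	return 0;
}

/*
 * Hypothetical summary of the fallback: all packs are scanned only
 * when no explicit source was given and "none" was not requested.
 */
static int should_scan_all_packs(unsigned input)
{
	if (input & COMMIT_GRAPH_INPUT_NONE)
		return 0;
	return !(input & (COMMIT_GRAPH_INPUT_REACHABLE |
			  COMMIT_GRAPH_INPUT_STDIN_PACKS |
			  COMMIT_GRAPH_INPUT_STDIN_COMMITS));
}

/* Usage sketch: "append" alone still scans packs, "none" does not. */
static void check_none_examples(void)
{
	unsigned append_only = 0, none = 0;
	int err = 0;

	err |= parse_input_source(&append_only, "append");
	err |= parse_input_source(&none, "none");
	assert(!err);

	assert(should_scan_all_packs(append_only));
	assert(none & COMMIT_GRAPH_INPUT_APPEND);
	assert(!should_scan_all_packs(none));
}
```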

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-commit-graph.txt |  8 +++++++-
 builtin/commit-graph.c             | 11 ++++++++---
 commit-graph.c                     |  6 ++++--
 commit-graph.h                     |  3 ++-
 t/t5324-split-commit-graph.sh      | 26 ++++++++++++++++++++++++++
 5 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index 0a320cccdd..b210cef52f 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -39,7 +39,7 @@ COMMANDS
 --------
 'write'::
 
-Write a commit-graph file based on the commits found in packfiles.
+Write a commit-graph file based on the specified sources of input:
 +
 With the `--input=stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
@@ -57,6 +57,12 @@ walking commits starting at all refs. (Cannot be combined with
 With the `--input=append` option, include all commits that are present
 in the existing commit-graph file.
 +
+With the `--input=none` option, behave as if `--input=append` were
+given, but do not walk other packs to find additional commits.
++
+If none of the above options are given, then generate the new
+commit-graph by walking over all pack-indexes.
++
 With the `--split[=<strategy>]` option, write the commit-graph as a
 chain of multiple commit-graph files stored in
 `<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 0ff25896d0..a71af88815 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -11,7 +11,7 @@ static char const * const builtin_commit_graph_usage[] = {
 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow] [--[no-]progress]"),
 	N_("git commit-graph write [--object-dir <objdir>] "
 	   "[--split[=<strategy>]] "
-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -24,7 +24,7 @@ static const char * const builtin_commit_graph_verify_usage[] = {
 static const char * const builtin_commit_graph_write_usage[] = {
 	N_("git commit-graph write [--object-dir <objdir>] "
 	   "[--split[=<strategy>]] "
-	   "[--input=<reachable|stdin-packs|stdin-commits|append>] "
+	   "[--input=<reachable|stdin-packs|stdin-commits|append|none>] "
 	   "[--[no-]progress] <split options>"),
 	NULL
 };
@@ -33,7 +33,8 @@ enum commit_graph_input {
 	COMMIT_GRAPH_INPUT_REACHABLE     = (1 << 1),
 	COMMIT_GRAPH_INPUT_STDIN_PACKS   = (1 << 2),
 	COMMIT_GRAPH_INPUT_STDIN_COMMITS = (1 << 3),
-	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4)
+	COMMIT_GRAPH_INPUT_APPEND        = (1 << 4),
+	COMMIT_GRAPH_INPUT_NONE          = (1 << 5)
 };
 
 static struct opts_commit_graph {
@@ -80,6 +81,8 @@ static int option_parse_input(const struct option *opt, const char *arg,
 		*to |= COMMIT_GRAPH_INPUT_STDIN_COMMITS;
 	else if (!strcmp(arg, "append"))
 		*to |= COMMIT_GRAPH_INPUT_APPEND;
+	else if (!strcmp(arg, "none"))
+		*to |= (COMMIT_GRAPH_INPUT_APPEND | COMMIT_GRAPH_INPUT_NONE);
 	else
 		die(_("unrecognized --input source, %s"), arg);
 	return 0;
@@ -225,6 +228,8 @@ static int graph_write(int argc, const char **argv)
 		opts.obj_dir = get_object_directory();
 	if (opts.input & COMMIT_GRAPH_INPUT_APPEND)
 		flags |= COMMIT_GRAPH_WRITE_APPEND;
+	if (opts.input & COMMIT_GRAPH_INPUT_NONE)
+		flags |= COMMIT_GRAPH_WRITE_NO_INPUT;
 	if (opts.split)
 		flags |= COMMIT_GRAPH_WRITE_SPLIT;
 	if (opts.progress)
diff --git a/commit-graph.c b/commit-graph.c
index 3a5cb23cd7..417b7eac9c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -788,7 +788,8 @@ struct write_commit_graph_context {
 	unsigned append:1,
 		 report_progress:1,
 		 split:1,
-		 check_oids:1;
+		 check_oids:1,
+		 no_input:1;
 
 	const struct split_commit_graph_opts *split_opts;
 };
@@ -1785,6 +1786,7 @@ int write_commit_graph(struct object_directory *odb,
 	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
 	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
 	ctx->split_opts = split_opts;
+	ctx->no_input = flags & COMMIT_GRAPH_WRITE_NO_INPUT ? 1 : 0;
 
 	if (ctx->split) {
 		struct commit_graph *g;
@@ -1843,7 +1845,7 @@ int write_commit_graph(struct object_directory *odb,
 			goto cleanup;
 	}
 
-	if (!pack_indexes && !commit_hex)
+	if (!ctx->no_input && !pack_indexes && !commit_hex)
 		fill_oids_from_all_packs(ctx);
 
 	close_reachable(ctx);
diff --git a/commit-graph.h b/commit-graph.h
index 65a7d2edae..df7f3f5961 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -79,7 +79,8 @@ enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
 	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
 	/* Make sure that each OID in the input is a valid commit OID. */
-	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
+	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3),
+	COMMIT_GRAPH_WRITE_NO_INPUT   = (1 << 4)
 };
 
 enum commit_graph_split_flags {
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 6894106727..7614f3915b 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -376,4 +376,30 @@ test_expect_success '--split=no-merge always writes an incremental' '
 	test_line_count = 2 $graphdir/commit-graph-chain
 '
 
+test_expect_success '--split=no-merge, --input=none writes nothing' '
+	test_when_finished rm -rf a graphs.before graphs.after &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	ls $graphdir/graph-*.graph >graphs.before &&
+	test_line_count = 1 $graphdir/commit-graph-chain &&
+	git commit-graph write --split --input=none &&
+	ls $graphdir/graph-*.graph >graphs.after &&
+	test_cmp graphs.before graphs.after
+'
+
+test_expect_success '--split=merge-all, --input=none merges the chain' '
+	test_when_finished rm -rf a b &&
+	rm -rf $graphdir &&
+	git reset --hard commits/2 &&
+	git rev-list -1 HEAD~1 >a &&
+	git rev-list -1 HEAD >b &&
+	git commit-graph write --split=no-merge --input=stdin-commits <a &&
+	git commit-graph write --split=no-merge --input=stdin-commits <b &&
+	test_line_count = 2 $graphdir/commit-graph-chain &&
+	git commit-graph write --split=merge-all --input=none &&
+	test_line_count = 1 $graphdir/commit-graph-chain
+'
+
 test_done
-- 
2.25.0.119.gaa12b7378b


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-07 15:48       ` Derrick Stolee
  2020-02-09 23:32         ` Taylor Blau
@ 2020-02-12  6:03         ` Martin Ågren
  2020-02-12 20:50           ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: Martin Ågren @ 2020-02-12  6:03 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Taylor Blau, Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Fri, 7 Feb 2020 at 16:48, Derrick Stolee <stolee@gmail.com> wrote:
>
> On 2/6/2020 2:41 PM, Martin Ågren wrote:
> > On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> >> -               OPT_BOOL(0, "split", &opts.split,
> >> -                       N_("allow writing an incremental commit-graph file")),
> >> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> >> +                       N_("allow writing an incremental commit-graph file"),
> >> +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> >> +                       write_option_parse_split),
> >
> >
> > I keep getting back to this -- sorry! So this actually forbids
> > "--no-split", which used to work before. Unfortunate?
>
> That certainly is unfortunate. Hopefully no one is taking a dependence on
> this, which only means something if they had a `--split` previously in
> the command-line arguments.
>
> > I have to ask, what is the long-term plan for the two formats (split and
> > non-split)? As I understand it, and I might well be wrong, the non-split
> > format came first and the split format was a user-experience
> > improvement. Should we expect that `--split` becomes the default?
>
> In some ways, the split is now the default because that is how it is
> written during 'git fetch' using fetch.writeCommitGraph. However, I
> don't think that it will ever become the default for the commit-graph
> builtin.

Thanks for giving this piece of background.

> > To try to be concrete, here's a suggestion: `--format=split` and
> > `--split-strategy=<strategy>`.
>
> Why --format=split instead of leaving it as --[no-]split? Is there a reason to
> introduce this string-based option when there are only two options right now?

My thinking was, if my concern is "--split" being overloaded, what would
it look like to "unload" it entirely? From "--split" it isn't obvious
whether it's a verb or an adjective (shall we split, or shall we do
things the split way?). Having "--format=split" would help avoid *that*,
possibly leaving a cleaner field for the issue of "do we
allow/force/forbid the 'merging' to happen?". But I'm happy to accept
"--split=<strategy>" and move on. :-)

I see that Taylor juuust posted a v3. I'll try to find time to look it
over, but I won't be raising this point again.

Martin


* Re: [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
                     ` (2 preceding siblings ...)
  2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-02-12 18:19   ` Junio C Hamano
  2020-02-13 17:41     ` Taylor Blau
  2020-02-17 18:24   ` Martin Ågren
  4 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2020-02-12 18:19 UTC (permalink / raw)
  To: Taylor Blau, Garima Singh; +Cc: git, peff, dstolee, martin.agren

Taylor Blau <me@ttaylorr.com> writes:

> Attached is what I anticipate/hope to be the final reroll of my series
> to add new arguments to the 'git commit-graph write --split' flag to
> allow callers to force or prohibit the commit-graph machinery to merge
> multiple commit-graph layers.
>
> I was keeping an eye out for more discussion about whether or not these
> flags were acceptable by reviewers. Martin Ågren and Derrick Stolee have
> both chimed in that they seem OK.
>
> Since there hasn't been much more discussion in this thread, I replayed
> this series on top of 'tb/commit-graph-use-odb' (which was itself
> rebased on 'master'). I picked up a couple of ASCIIDoc changes along the
> way, and a range-diff is included below.

I haven't had a chance to form an opinion on this topic, and will
let others comment on it first.

This topic and the "changed paths bloom filter" topic obviously and
inevitably have trivial textual conflicts.  When today's integration
result is pushed out in 6 hours or so, please see if the resolution
is reasonable in 'pu'.

Thanks.


* Re: [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
  2020-02-12  6:03         ` Martin Ågren
@ 2020-02-12 20:50           ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-12 20:50 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Derrick Stolee, Taylor Blau, Git Mailing List, Jeff King,
	Derrick Stolee, Junio C Hamano

On Wed, Feb 12, 2020 at 07:03:46AM +0100, Martin Ågren wrote:
> On Fri, 7 Feb 2020 at 16:48, Derrick Stolee <stolee@gmail.com> wrote:
> >
> > On 2/6/2020 2:41 PM, Martin Ågren wrote:
> > > On Wed, 5 Feb 2020 at 01:28, Taylor Blau <me@ttaylorr.com> wrote:
> > >> -               OPT_BOOL(0, "split", &opts.split,
> > >> -                       N_("allow writing an incremental commit-graph file")),
> > >> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> > >> +                       N_("allow writing an incremental commit-graph file"),
> > >> +                       PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> > >> +                       write_option_parse_split),
> > >
> > >
> > > I keep getting back to this -- sorry! So this actually forbids
> > > "--no-split", which used to work before. Unfortunate?
> >
> > That certainly is unfortunate. Hopefully no one is taking a dependence on
> > this, which only means something if they had a `--split` previously in
> > the command-line arguments.
> >
> > > I have to ask, what is the long-term plan for the two formats (split and
> > > non-split)? As I understand it, and I might well be wrong, the non-split
> > > format came first and the split format was a user-experience
> > > improvement. Should we expect that `--split` becomes the default?
> >
> > In some ways, the split is now the default because that is how it is
> > written during 'git fetch' using fetch.writeCommitGraph. However, I
> > don't think that it will ever become the default for the commit-graph
> > builtin.
>
> Thanks for giving this piece of background.
>
> > > To try to be concrete, here's a suggestion: `--format=split` and
> > > `--split-strategy=<strategy>`.
> >
> > Why --format=split instead of leaving it as --[no-]split? Is there a reason to
> > introduce this string-based option when there are only two options right now?
>
> My thinking was, if my concern is "--split" being overloaded, what would
> it look like to "unload" it entirely? From "--split" it isn't obvious
> whether it's a verb or an adjective (shall we split, or shall we do
> things the split way?). Having "--format=split" would help avoid *that*,
> possibly leaving a cleaner field for the issue of "do we
> allow/force/forbid the 'merging' to happen?". But I'm happy to accept
> "--split=<strategy>" and move on. :-)
>
> I see that Taylor juuust posted a v3. I'll try to find time to look it
> over, but I won't be raising this point again.

It looks like we raced :-). Sorry about that. I didn't see your email
until after I sent, and I certainly would have waited if I knew that you
were writing an email to the same thread as I was working in at the same
time.

I'm still fairly happy with the '--split[=<strategy>]' approach that is
implemented in all versions of this patch series, although I do
understand your suggestions.

My preference would be to see if anybody else feels like the trade-off
*is* worth it (I explained earlier in the thread some reasons why I feel
that the trade-off is *not* worth it), but I'd be happy to move this
series forward as-is unless others echo this idea.

> Martin

Thanks,
Taylor


* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-02-04  4:51     ` Taylor Blau
@ 2020-02-13 11:33       ` SZEDER Gábor
  2020-02-13 11:48         ` SZEDER Gábor
  0 siblings, 1 reply; 58+ messages in thread
From: SZEDER Gábor @ 2020-02-13 11:33 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Martin Ågren, Git Mailing List, Jeff King, Derrick Stolee,
	Junio C Hamano

On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:
> On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:
> > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> > > The 'write' mode of the 'commit-graph' supports input from a number of

s/mode/subcommand/

> > > different sources:

I note that you use the word "sources" here, in the subject line as
well (as '--input=<source>'), and in the code as well (e.g. in the
error message "unrecognized --input source, %s").  I like this
word, I think the words "input" and "source" go really well together.

> > > pack indexes over stdin, commits over stdin, commits
> > > reachable from all references, and so on.

It's interesting to see that you stopped listing and went for "and so
on" right when it got interesting/controversial with '--append'... :)

> > > Each of these options are
> > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.

It also supports the very inefficient scanning through all objects in
all pack files to find commit objects, which, sadly, ended up being
the default, and thus doesn't have its own --option.  Should there be
a corresponding '--input=<source>' as well?  (Note that I don't mean
this as a suggestion to add one; on the contrary, the less exposure it
gets the better.)

> > > Similar to our replacement of 'git config [--<type>]' with 'git config
> > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> > > deprecate '[--<input>]' in favor of '[--input=<source>]'.
> > >
> > > This makes it more clear to implement new options that are combinations
> > > of other options (such as, for example, "none", a combination of the old
> > > "--append" and a new sentinel to specify to _not_ look in other packs,
> > > which we will implement in a future patch).
> >
> > Makes sense.
> >
> > > Unfortunately, the new enumerated type is a bitfield, even though it
> > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> > > compatible with '--append'. For this reason, use a bitfield.
> >
> > > -With the `--append` option, include all commits that are present in the
> > > -existing commit-graph file.
> > > +With the `--input=append` option, include all commits that are present
> > > +in the existing commit-graph file.
> >
> > Would it be too crazy to call this `--input=existing` instead, and have
> > it be the same as `--append`? I find that `--append` makes a lot of
> > sense (it's a mode we can turn on or off), whereas "input = append"
> > seems more odd.
> 
> Hmm. When I wrote this, I was thinking of introducing equivalent options
> that are identical in name and functionality as '--input=<mode>' instead
> of '--<mode>'. So, I guess that is to say that I didn't spend an awful
> amount of time thinking about whether or not '--input=append' made sense
> given anything else.
> 
> So, I don't think that '--input=existing' is a bad idea at all, but I do
> worry about advertising this deprecation as "'--<mode>' becomes
> '--input=<mode>', except when your mode is 'append', in which case it
> becomes '--input=existing'".

But here you suddenly start using the word "mode" both in
'--input=<mode>' and in '--<mode>'.

On one hand, I don't think that the word "mode" goes as well with
"input" as "source" does.

On the other, is '--append' really a source/mode, like '--reachable'
and '--stdin-commits' are?  Source, no: from a wordsmithing perspective
it doesn't fit with "source", and being orthogonal to the "real"
source options while they are mutually exclusive seems to be a clear
indication that it isn't.  Mode, yes: it's a mode of operation where
no longer reachable/present commits are not discarded from the
commit-graph.

So I don't think that adding '--input=append' is a good idea, even if
we were to call it differently, e.g. '--input=existing' as suggested
above.

However, I do think that '--input=existing' would better express what
'--input=none' in the next patch wants to achieve.
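
To make the source-versus-mode distinction concrete, here is a minimal
sketch in C of a bitfield in which the "real" sources are mutually
exclusive while the append bit may be combined with any of them. The
enum, the macro, and the function name are hypothetical and chosen only
for illustration; they are not taken from the patch.

```c
/* Hypothetical flag names for illustration; not the enum from the patch. */
enum input_source {
	INPUT_STDIN_PACKS   = 1 << 0,
	INPUT_STDIN_COMMITS = 1 << 1,
	INPUT_REACHABLE     = 1 << 2,
	INPUT_APPEND        = 1 << 3, /* orthogonal modifier, not a source */
};

/* The bits that name an actual source of commits. */
#define INPUT_REAL_SOURCES \
	(INPUT_STDIN_PACKS | INPUT_STDIN_COMMITS | INPUT_REACHABLE)

/*
 * Returns 1 if at most one "real" source bit is set, 0 otherwise.
 * The append bit is masked out first, so it never causes a conflict.
 */
int input_flags_valid(unsigned flags)
{
	unsigned real = flags & INPUT_REAL_SOURCES;

	/* real & (real - 1) clears the lowest set bit; 0 means <= 1 bit set. */
	return (real & (real - 1)) == 0;
}
```

Under these assumptions, 'INPUT_STDIN_PACKS | INPUT_APPEND' passes the
check, while 'INPUT_STDIN_PACKS | INPUT_REACHABLE' does not.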


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
@ 2020-02-13 11:39     ` SZEDER Gábor
  2020-02-13 12:31     ` SZEDER Gábor
  1 sibling, 0 replies; 58+ messages in thread
From: SZEDER Gábor @ 2020-02-13 11:39 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, gitster, martin.agren

On Tue, Feb 11, 2020 at 09:47:57PM -0800, Taylor Blau wrote:
> In the previous commit, we introduced '--split=<no-merge|merge-all>',

Nit: the previous commit introduced '--input=<source>'.

> and alluded to the fact that '--split=merge-all' would be useful for
> callers who wish to always trigger a merge of an incremental chain.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-02-13 11:33       ` SZEDER Gábor
@ 2020-02-13 11:48         ` SZEDER Gábor
  2020-02-13 17:56           ` Taylor Blau
  0 siblings, 1 reply; 58+ messages in thread
From: SZEDER Gábor @ 2020-02-13 11:48 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Martin Ågren, Git Mailing List, Jeff King, Derrick Stolee,
	Junio C Hamano

On Thu, Feb 13, 2020 at 12:33:13PM +0100, SZEDER Gábor wrote:
> On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:
> > On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:
> > > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> > > > The 'write' mode of the 'commit-graph' supports input from a number of
> 
> s/mode/subcommand/
> 
> > > > different sources:
> 
> I note that you use the word "sources" here, in the subject line as
> well (as '--input=<source>'), and in the code as well (e.g. in the
> error message "unrecognized --input source, %s").  I like this
> word, I think the words "input" and "source" go really well together.
> 
> > > > pack indexes over stdin, commits over stdin, commits
> > > > reachable from all references, and so on.
> 
> It's interesting to see that you stopped listing and went for "and so
> on" right when it got interesting/controversial with '--append'... :)
> 
> > > > Each of these options are
> > > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
> 
> It also supports the very inefficient scanning through all objects in
> all pack files to find commit objects, which, sadly, ended up being
> the default, and thus doesn't have its own --option.  Should there be
> a corresponding '--input=<source>' as well?  (Note that I don't mean
> this as a suggestion to add one; on the contrary, the less exposure it
> gets the better.)
> 
> > > > Similar to our replacement of 'git config [--<type>]' with 'git config
> > > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> > > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> > > > deprecate '[--<input>]' in favor of '[--input=<source>]'.
> > > >
> > > > This makes it more clear to implement new options that are combinations
> > > > of other options (such as, for example, "none", a combination of the old
> > > > "--append" and a new sentinel to specify to _not_ look in other packs,
> > > > which we will implement in a future patch).
> > >
> > > Makes sense.
> > >
> > > > Unfortunately, the new enumerated type is a bitfield, even though it
> > > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> > > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> > > > compatible with '--append'. For this reason, use a bitfield.
> > >
> > > > -With the `--append` option, include all commits that are present in the
> > > > -existing commit-graph file.
> > > > +With the `--input=append` option, include all commits that are present
> > > > +in the existing commit-graph file.
> > >
> > > Would it be too crazy to call this `--input=existing` instead, and have
> > > it be the same as `--append`? I find that `--append` makes a lot of
> > > sense (it's a mode we can turn on or off), whereas "input = append"
> > > seems more odd.
> > 
> > Hmm. When I wrote this, I was thinking of introducing equivalent options
> > that are identical in name and functionality as '--input=<mode>' instead
> > of '--<mode>'. So, I guess that is to say that I didn't spend an awful
> > amount of time thinking about whether or not '--input=append' made sense
> > given anything else.
> > 
> > So, I don't think that '--input=existing' is a bad idea at all, but I do
> > worry about advertising this deprecation as "'--<mode>' becomes
> > '--input=<mode>', except when your mode is 'append', in which case it
> > becomes '--input=existing'".
> 
> But here you suddenly start using the word "mode" both in
> '--input=<mode>' and in '--<mode>'.
> 
> On one hand, I don't think that the word "mode" goes as well with
> "input" as "source" does.
> 
> On the other, is '--append' really a source/mode, like '--reachable'
> and '--stdin-commits' are?

Well, re-reading this question got me confused right after sending it,
so let me try to rephrase.

Is '--append' really a "source", like '--reachable' and
'--stdin-commits' are?  No:

>  Source, no: from wordsmithing perspective
> it doesn't fit with "source", and being orthogonal to the "real"
> source options while they are mutually exclusive seems to be a clear
> indication that it isn't.

Or is it a "mode" modifying how other options are handled?  Yes:

> Mode, yes: it's a mode of operation where
> no longer reachable/present commits are not discarded from the
> commit-graph.
> 
> So I don't think that adding '--input=append' is a good idea, even if
> we were to call it differently, e.g. '--input=existing' as suggested
> above.
> 
> However, I do think that '--input=existing' would better express what
> '--input=none' in the next patch wants to achieve.
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
  2020-02-13 11:39     ` SZEDER Gábor
@ 2020-02-13 12:31     ` SZEDER Gábor
  2020-02-13 16:08       ` Junio C Hamano
  2020-02-13 17:56       ` Taylor Blau
  1 sibling, 2 replies; 58+ messages in thread
From: SZEDER Gábor @ 2020-02-13 12:31 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee, gitster, martin.agren

On Tue, Feb 11, 2020 at 09:47:57PM -0800, Taylor Blau wrote:
> In the previous commit, we introduced '--split=<no-merge|merge-all>',
> and alluded to the fact that '--split=merge-all' would be useful for
> callers who wish to always trigger a merge of an incremental chain.
> 
> There is a problem with the above approach, which is that there is no
> way to specify to the commit-graph builtin that a caller only wants to
> include commits already in the graph.

I'd like clarification on a detail here.  Is it only about not adding
any new commits, or about keeping all existing commits as well?  IOW,
do you want to:

  - include only commits already existing in the commit-graph, without
    adding any new commits, but remove any commits that do not exist
    in the object database anymore.

or:

  - include _all_ commits already existing in the commit-graph, even
    those that don't exist anymore in the object database, without
    adding any new commits.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-13 12:31     ` SZEDER Gábor
@ 2020-02-13 16:08       ` Junio C Hamano
  2020-02-13 17:58         ` Taylor Blau
  2020-02-13 17:56       ` Taylor Blau
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2020-02-13 16:08 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, martin.agren

SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Tue, Feb 11, 2020 at 09:47:57PM -0800, Taylor Blau wrote:
>> In the previous commit, we introduced '--split=<no-merge|merge-all>',
>> and alluded to the fact that '--split=merge-all' would be useful for
>> callers who wish to always trigger a merge of an incremental chain.
>> 
>> There is a problem with the above approach, which is that there is no
>> way to specify to the commit-graph builtin that a caller only wants to
>> include commits already in the graph.
>
> I'd like clarification on a detail here.  Is it only about not adding
> any new commits, or about keeping all existing commits as well?  IOW,
> do you want to:
>
>   - include only commits already existing in the commit-graph, without
>     adding any new commits, but remove any commits that do not exist
>     in the object database anymore.
>
> or:
>
>   - include _all_ commits already existing in the commit-graph, even
>     those that don't exist anymore in the object database, without
>     adding any new commits.

FWIW, I read it as the former, but now you brought it up, it can be
read either way.

Thanks for good review comments, as always.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options
  2020-02-12 18:19   ` [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
@ 2020-02-13 17:41     ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-13 17:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Garima Singh, git, peff, dstolee, martin.agren

Hi Junio,

On Wed, Feb 12, 2020 at 10:19:14AM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Attached is what I anticipate/hope to be the final reroll of my series
> > to add new arguments to the 'git commit-graph write --split' flag to
> > allow callers to force or prohibit the commit-graph machinery to merge
> > multiple commit-graph layers.
> >
> > I was keeping an eye out for more discussion about whether or not these
> > flags were acceptable by reviewers. Martin Ågren and Derrick Stolee have
> > both chimed in that they seem OK.
> >
> > Since there hasn't been much more discussion in this thread, I replayed
> > this series on top of 'tb/commit-graph-use-odb' (which was itself
> > rebased on 'master'). I picked up a couple of ASCIIDoc changes along the
> > way, and a range-diff is included below.
>
> I haven't had a chance to form an opinion on this topic, and will
> let others comment on it first.
>
> This topic and the "changed paths bloom filter" topic obviously and
> inevitably have trivial textual conflicts.  When today's integration
> result is pushed out in 6 hours or so, please see if the resolution
> is reasonable in 'pu'.

The resolution looks good to me, thanks.

> Thanks.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>'
  2020-02-13 11:48         ` SZEDER Gábor
@ 2020-02-13 17:56           ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-13 17:56 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Taylor Blau, Martin Ågren, Git Mailing List, Jeff King,
	Derrick Stolee, Junio C Hamano

On Thu, Feb 13, 2020 at 12:48:00PM +0100, SZEDER Gábor wrote:
> On Thu, Feb 13, 2020 at 12:33:13PM +0100, SZEDER Gábor wrote:
> > On Mon, Feb 03, 2020 at 08:51:24PM -0800, Taylor Blau wrote:
> > > On Fri, Jan 31, 2020 at 08:34:41PM +0100, Martin Ågren wrote:
> > > > On Fri, 31 Jan 2020 at 01:30, Taylor Blau <me@ttaylorr.com> wrote:
> > > > > The 'write' mode of the 'commit-graph' supports input from a number of
> >
> > s/mode/subcommand/

Sure.

> > > > > different sources:
> >
> > I note that you use the word "sources" here, in the subject line as
> > well (as '--input=<source>'), and in the code as well (e.g. in the
> > error message "unrecognized --input source, %s").  I like this
> > word, I think the words "input" and "source" go really well together.
> >
> > > > > pack indexes over stdin, commits over stdin, commits
> > > > > reachable from all references, and so on.
> >
> > It's interesting to see that you stopped listing and went for "and so
> > on" right when it got interesting/controversial with '--append'... :)

Only because I figured that I had illustrated the point as "here are the
input sources that we currently understand", like "--stdin-packs",
"--stdin-commits", and so on, not because I find this option to be
controversial.

> > > > > Each of these options are
> > > > > specified with a unique option: '--stdin-packs', '--stdin-commits', etc.
> >
> > It also supports the very inefficient scanning through all objects in
> > all pack files to find commit objects, which, sadly, ended up being
> > the default, and thus doesn't have its own --option.  Should there be
> > a corresponding '--input=<source>' as well?  (Note that I don't mean
> > this as a suggestion to add one; on the contrary, the less exposure it
> > gets the better.)

Maybe... although I (like you, I think) have a hard time imagining that
this would ever get used, since it *is* the default source of input, so
you could just as easily *not* write '--input' anything and get the same
effect.

> > > > > Similar to our replacement of 'git config [--<type>]' with 'git config
> > > > > [--type=<type>]' (c.f., fb0dc3bac1 (builtin/config.c: support
> > > > > `--type=<type>` as preferred alias for `--<type>`, 2018-04-18)), softly
> > > > > deprecate '[--<input>]' in favor of '[--input=<source>]'.
> > > > >
> > > > > This makes it more clear to implement new options that are combinations
> > > > > of other options (such as, for example, "none", a combination of the old
> > > > > "--append" and a new sentinel to specify to _not_ look in other packs,
> > > > > which we will implement in a future patch).
> > > >
> > > > Makes sense.
> > > >
> > > > > Unfortunately, the new enumerated type is a bitfield, even though it
> > > > > makes much more sense as '0, 1, 2, ...'. Even though *almost* all
> > > > > options are pairwise exclusive, '--stdin-{packs,commits}' *is*
> > > > > compatible with '--append'. For this reason, use a bitfield.
> > > >
> > > > > -With the `--append` option, include all commits that are present in the
> > > > > -existing commit-graph file.
> > > > > +With the `--input=append` option, include all commits that are present
> > > > > +in the existing commit-graph file.
> > > >
> > > > Would it be too crazy to call this `--input=existing` instead, and have
> > > > it be the same as `--append`? I find that `--append` makes a lot of
> > > > sense (it's a mode we can turn on or off), whereas "input = append"
> > > > seems more odd.

I don't think that I have a strong preference here, since I don't find
'--input=append' to be out of the ordinary, so I'd be happy with either.
If you'd strongly prefer that we call this '--input=existing', then
that's fine with me.

I called it '--input=append' to translate 1-to-1 from '--<source>' to
'--input=<source>'. I would worry a little about saying, "yeah, if you
want to use the new '--input'-style options, just write what you used to
write *unless* that thing is '--append' in which case write 'existing'
instead".

Maybe '--append' was poorly-named to begin with, so I think the
question is really: if we believe that it is poorly named, is now the
right time to correct that naming?

> > > Hmm. When I wrote this, I was thinking of introducing equivalent options
> > > that are identical in name and functionality as '--input=<mode>' instead
> > > of '--<mode>'. So, I guess that is to say that I didn't spend an awful
> > > amount of time thinking about whether or not '--input=append' made sense
> > > given anything else.
> > >
> > > So, I don't think that '--input=existing' is a bad idea at all, but I do
> > > worry about advertising this deprecation as "'--<mode>' becomes
> > > '--input=<mode>', except when your mode is 'append', in which case it
> > > becomes '--input=existing'".
> >
> > But here you suddenly start using the word "mode" both in
> > '--input=<mode>' and in '--<mode>'.
> >
> > On one hand, I don't think that the word "mode" goes as well with
> > "input" as "source" does.
> >
> > On the other, is '--append' really a source/mode, like '--reachable'
> > and '--stdin-commits' are?
>
> Well, re-reading this question got me confused right after sending it,
> so let me try to rephrase.
>
> Is '--append' really a "source", like '--reachable' and
> '--stdin-commits' are?  No:
>
> >  Source, no: from wordsmithing perspective
> > it doesn't fit with "source", and being orthogonal to the "real"
> > source options while they are mutually exclusive seems to be a clear
> > indication that it isn't.
>
> Or is it a "mode" modifying how other options are handled?  Yes:
>
> > Mode, yes: it's a mode of operation where
> > no longer reachable/present commits are not discarded from the
> > commit-graph.
> >
> > So I don't think that adding '--input=append' is a good idea, even if
> > we were to call it differently, e.g. '--input=existing' as suggested
> > above.
> >
> > However, I do think that '--input=existing' would better express what
> > '--input=none' in the next patch wants to achieve.

So, I guess I'm left wondering what we can do to move forward here. For
my part, I see a couple of options:

  (a) we could replace every instance of '--input=<source>' with
      '--input=<mode>', which, it seems, would make "append" a valid
      pattern-match for "mode", and move on.

  (b) we could stick with '--input=<source>' and 's/append/existing',
      and move on.

  (c) or do something else that I didn't think of here and go forward
      with that instead.

Let me know which you'd prefer that I do, and I'd be happy to send an
updated version of the series for you to look at.

Thanks,
Taylor
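
For reference, option (b) could look roughly like the following sketch
in C. It assumes that "existing" becomes the preferred spelling while
"append" is kept as an alias; that aliasing is an assumption of this
sketch, not something decided in the thread. The flag names and the
function name are hypothetical; only the error message text is the one
quoted earlier in this discussion.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag names for illustration only. */
enum {
	SRC_STDIN_PACKS   = 1 << 0,
	SRC_STDIN_COMMITS = 1 << 1,
	SRC_REACHABLE     = 1 << 2,
	SRC_APPEND        = 1 << 3,
};

/*
 * Maps the value given to '--input=<source>' to a flag.
 * Returns 0 after printing an error if the value is not recognized.
 */
unsigned parse_input_source(const char *arg)
{
	if (!strcmp(arg, "stdin-packs"))
		return SRC_STDIN_PACKS;
	if (!strcmp(arg, "stdin-commits"))
		return SRC_STDIN_COMMITS;
	if (!strcmp(arg, "reachable"))
		return SRC_REACHABLE;
	/* Assumption: accept both the old and the proposed new name. */
	if (!strcmp(arg, "append") || !strcmp(arg, "existing"))
		return SRC_APPEND;

	fprintf(stderr, "unrecognized --input source, %s\n", arg);
	return 0;
}
```

Keeping the old name as an alias would sidestep the "write what you
used to write, unless it was 'append'" wrinkle, at the cost of two
spellings for one behavior.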

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-13 12:31     ` SZEDER Gábor
  2020-02-13 16:08       ` Junio C Hamano
@ 2020-02-13 17:56       ` Taylor Blau
  1 sibling, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-13 17:56 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster, martin.agren

On Thu, Feb 13, 2020 at 01:31:29PM +0100, SZEDER Gábor wrote:
> On Tue, Feb 11, 2020 at 09:47:57PM -0800, Taylor Blau wrote:
> > In the previous commit, we introduced '--split=<no-merge|merge-all>',
> > and alluded to the fact that '--split=merge-all' would be useful for
> > callers who wish to always trigger a merge of an incremental chain.
> >
> > There is a problem with the above approach, which is that there is no
> > way to specify to the commit-graph builtin that a caller only wants to
> > include commits already in the graph.
>
> I'd like clarification on a detail here.  Is it only about not adding
> any new commits, or about keeping all existing commits as well?  IOW,
> do you want to:
>
>   - include only commits already existing in the commit-graph, without
>     adding any new commits, but remove any commits that do not exist
>     in the object database anymore.

This one, since the commit-graph machinery will drop any commits (no
matter what input/source/mode you specify) that no longer exist in the
object store.

> or:
>
>   - include _all_ commits already existing in the commit-graph, even
>     those that don't exist anymore in the object database, without
>     adding any new commits.

Thanks,
Taylor
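
The first reading (keep what is already in the graph, add nothing new,
and drop whatever has since disappeared from the object store) can be
sketched as a simple filter. This is a toy model in C under stated
assumptions: commits are plain integers, the object store is an array,
and both function names are hypothetical. It is not how the
commit-graph code is actually structured.

```c
#include <stddef.h>

/* Toy stand-in for an object-store lookup: linear search over known IDs. */
int odb_has(const int *odb, size_t n_odb, int id)
{
	size_t i;

	for (i = 0; i < n_odb; i++)
		if (odb[i] == id)
			return 1;
	return 0;
}

/*
 * Copies into out[] every commit from graph[] that is still present in
 * odb[], preserving order. No commit outside graph[] is ever added.
 * out[] must have room for n_graph entries. Returns the number kept.
 */
size_t keep_existing(const int *graph, size_t n_graph,
		     const int *odb, size_t n_odb, int *out)
{
	size_t i, kept = 0;

	for (i = 0; i < n_graph; i++)
		if (odb_has(odb, n_odb, graph[i]))
			out[kept++] = graph[i];
	return kept;
}
```

With a graph of {1, 2, 3, 4} and an object store of {2, 4, 5}, the
result is {2, 4}: commits 1 and 3 are dropped because they no longer
exist, and commit 5 is not picked up because it was never in the graph.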

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none'
  2020-02-13 16:08       ` Junio C Hamano
@ 2020-02-13 17:58         ` Taylor Blau
  0 siblings, 0 replies; 58+ messages in thread
From: Taylor Blau @ 2020-02-13 17:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Taylor Blau, git, peff, dstolee, martin.agren

On Thu, Feb 13, 2020 at 08:08:15AM -0800, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
> > On Tue, Feb 11, 2020 at 09:47:57PM -0800, Taylor Blau wrote:
> >> In the previous commit, we introduced '--split=<no-merge|merge-all>',
> >> and alluded to the fact that '--split=merge-all' would be useful for
> >> callers who wish to always trigger a merge of an incremental chain.
> >>
> >> There is a problem with the above approach, which is that there is no
> >> way to specify to the commit-graph builtin that a caller only wants to
> >> include commits already in the graph.
> >
> > I'd like clarification on a detail here.  Is it only about not adding
> > any new commits, or about keeping all existing commits as well?  IOW,
> > do you want to:
> >
> >   - include only commits already existing in the commit-graph, without
> >     adding any new commits, but remove any commits that do not exist
> >     in the object database anymore.
> >
> > or:
> >
> >   - include _all_ commits already existing in the commit-graph, even
> >     those that don't exist anymore in the object database, without
> >     adding any new commits.
>
> FWIW, I read it as the former, but now you brought it up, it can be
> read either way.

It was intended as the former, but I share both of your feelings that it
could be read either way. I amended the commit message to clarify by
adding:

  (and haven't since been deleted from the object store)

as a parenthetical after "already in the graph...".

> Thanks for good review comments, as always.

Yes, indeed: thank you very much for your thoughtful feedback.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options
  2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
                     ` (3 preceding siblings ...)
  2020-02-12 18:19   ` [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
@ 2020-02-17 18:24   ` Martin Ågren
  4 siblings, 0 replies; 58+ messages in thread
From: Martin Ågren @ 2020-02-17 18:24 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King, Derrick Stolee, Junio C Hamano

On Wed, 12 Feb 2020 at 06:47, Taylor Blau <me@ttaylorr.com> wrote:
> I picked up a couple of ASCIIDoc changes along the
> way, and a range-diff is included below.

Yup, this fixes the documentation misrendering from the previous round.

Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-01-31 14:19   ` Derrick Stolee
2020-02-04  3:47     ` Taylor Blau
2020-01-31 19:27   ` Martin Ågren
2020-02-04  4:06     ` Taylor Blau
2020-02-06 19:15       ` Martin Ågren
2020-02-09 23:27         ` Taylor Blau
2020-01-31 23:34   ` SZEDER Gábor
2020-02-01 21:25     ` Johannes Schindelin
2020-02-03 10:47       ` SZEDER Gábor
2020-02-03 11:11         ` Jeff King
2020-02-04  3:58           ` Taylor Blau
2020-02-04 14:14             ` Jeff King
2020-02-04  3:59       ` Taylor Blau
2020-02-04  3:59     ` Taylor Blau
2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-01-31 14:40   ` Derrick Stolee
2020-02-04  4:21     ` Taylor Blau
2020-01-31 19:34   ` Martin Ågren
2020-02-04  4:51     ` Taylor Blau
2020-02-13 11:33       ` SZEDER Gábor
2020-02-13 11:48         ` SZEDER Gábor
2020-02-13 17:56           ` Taylor Blau
2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-01-31 14:40   ` Derrick Stolee
2020-01-31 19:45   ` Martin Ågren
2020-02-04  5:01     ` Taylor Blau
2020-01-31  0:32 ` [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
2020-01-31 13:26   ` Derrick Stolee
2020-01-31 14:41 ` Derrick Stolee
2020-02-04 23:44 ` Junio C Hamano
2020-02-05  0:30   ` Taylor Blau
2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-02-06 19:41     ` Martin Ågren
2020-02-07 15:48       ` Derrick Stolee
2020-02-09 23:32         ` Taylor Blau
2020-02-12  6:03         ` Martin Ågren
2020-02-12 20:50           ` Taylor Blau
2020-02-09 23:30       ` Taylor Blau
2020-02-05  0:28   ` [PATCH v2 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-02-05  0:28   ` [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-02-06 19:50     ` Martin Ågren
2020-02-09 23:32       ` Taylor Blau
2020-02-05 20:07   ` [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
2020-02-12  5:47   ` [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-02-12  5:47   ` [PATCH v3 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-02-13 11:39     ` SZEDER Gábor
2020-02-13 12:31     ` SZEDER Gábor
2020-02-13 16:08       ` Junio C Hamano
2020-02-13 17:58         ` Taylor Blau
2020-02-13 17:56       ` Taylor Blau
2020-02-12 18:19   ` [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
2020-02-13 17:41     ` Taylor Blau
2020-02-17 18:24   ` Martin Ågren

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git