git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
@ 2019-08-05  8:02 SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 1/3] t5318-commit-graph: use 'test_expect_code' SZEDER Gábor
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, SZEDER Gábor

While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:

  $ echo not-a-commit-oid | git commit-graph write --stdin-commits
  $ echo $?
  $ ls -l .git/objects/info/commit-graph
  ls: cannot access '.git/objects/info/commit-graph': No such file or directory

The last patch in this series fixes this issue, with a bit of
preparatory refactoring in the second and a while-at-it cleanup in the
first patches.

SZEDER Gábor (3):
  t5318-commit-graph: use 'test_expect_code'
  commit-graph: turn a group of write-related macro flags into an enum
  commit-graph: error out on invalid commit oids in 'write
    --stdin-commits'

 builtin/commit-graph.c  | 10 ++++++----
 builtin/gc.c            |  2 +-
 commit-graph.c          | 40 +++++++++++++++++++++++-----------------
 commit-graph.h          | 15 ++++++++++-----
 t/t5318-commit-graph.sh | 14 +++++++++++---
 5 files changed, 51 insertions(+), 30 deletions(-)

-- 
2.23.0.rc1.309.g896d8c5f5f


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] t5318-commit-graph: use 'test_expect_code'
  2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
@ 2019-08-05  8:02 ` SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 2/3] commit-graph: turn a group of write-related macro flags into an enum SZEDER Gábor
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, SZEDER Gábor

In 't5318-commit-graph.sh' the test 'close with correct error on bad
input' manually verifies the exit code of a 'git commit-graph write'
command.

Use 'test_expect_code' instead.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/t5318-commit-graph.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 22cb9d6643..4391007f4c 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -26,8 +26,7 @@ test_expect_success 'write graph with no packs' '
 test_expect_success 'close with correct error on bad input' '
 	cd "$TRASH_DIRECTORY/full" &&
 	echo doesnotexist >in &&
-	{ git commit-graph write --stdin-packs <in 2>stderr; ret=$?; } &&
-	test "$ret" = 1 &&
+	test_expect_code 1 git commit-graph write --stdin-packs <in 2>stderr &&
 	test_i18ngrep "error adding pack" stderr
 '
 
-- 
2.23.0.rc1.309.g896d8c5f5f


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/3] commit-graph: turn a group of write-related macro flags into an enum
  2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 1/3] t5318-commit-graph: use 'test_expect_code' SZEDER Gábor
@ 2019-08-05  8:02 ` SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
  2019-08-05 10:14 ` [PATCH 0/3] " SZEDER Gábor
  3 siblings, 0 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, SZEDER Gábor

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/commit-graph.c |  6 +++---
 builtin/gc.c           |  2 +-
 commit-graph.c         | 11 ++++++-----
 commit-graph.h         | 13 ++++++++-----
 4 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 38027b83d9..64eccde314 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -154,7 +154,7 @@ static int graph_write(int argc, const char **argv)
 	struct string_list *commit_hex = NULL;
 	struct string_list lines;
 	int result = 0;
-	unsigned int flags = COMMIT_GRAPH_PROGRESS;
+	enum commit_graph_write_flags flags = COMMIT_GRAPH_WRITE_PROGRESS;
 
 	static struct option builtin_commit_graph_write_options[] = {
 		OPT_STRING(0, "object-dir", &opts.obj_dir,
@@ -192,9 +192,9 @@ static int graph_write(int argc, const char **argv)
 	if (!opts.obj_dir)
 		opts.obj_dir = get_object_directory();
 	if (opts.append)
-		flags |= COMMIT_GRAPH_APPEND;
+		flags |= COMMIT_GRAPH_WRITE_APPEND;
 	if (opts.split)
-		flags |= COMMIT_GRAPH_SPLIT;
+		flags |= COMMIT_GRAPH_WRITE_SPLIT;
 
 	read_replace_refs = 0;
 
diff --git a/builtin/gc.c b/builtin/gc.c
index c18efadda5..305fb0f45a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -687,7 +687,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	if (gc_write_commit_graph &&
 	    write_commit_graph_reachable(get_object_directory(),
-					 !quiet && !daemonized ? COMMIT_GRAPH_PROGRESS : 0,
+					 !quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
 					 NULL))
 		return 1;
 
diff --git a/commit-graph.c b/commit-graph.c
index b3c4de79b6..04324a4648 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1133,7 +1133,8 @@ static int add_ref_to_list(const char *refname,
 	return 0;
 }
 
-int write_commit_graph_reachable(const char *obj_dir, unsigned int flags,
+int write_commit_graph_reachable(const char *obj_dir,
+				 enum commit_graph_write_flags flags,
 				 const struct split_commit_graph_opts *split_opts)
 {
 	struct string_list list = STRING_LIST_INIT_DUP;
@@ -1750,7 +1751,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
 int write_commit_graph(const char *obj_dir,
 		       struct string_list *pack_indexes,
 		       struct string_list *commit_hex,
-		       unsigned int flags,
+		       enum commit_graph_write_flags flags,
 		       const struct split_commit_graph_opts *split_opts)
 {
 	struct write_commit_graph_context *ctx;
@@ -1771,9 +1772,9 @@ int write_commit_graph(const char *obj_dir,
 	if (len && ctx->obj_dir[len - 1] == '/')
 		ctx->obj_dir[len - 1] = 0;
 
-	ctx->append = flags & COMMIT_GRAPH_APPEND ? 1 : 0;
-	ctx->report_progress = flags & COMMIT_GRAPH_PROGRESS ? 1 : 0;
-	ctx->split = flags & COMMIT_GRAPH_SPLIT ? 1 : 0;
+	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
+	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
+	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
 	ctx->split_opts = split_opts;
 
 	if (ctx->split) {
diff --git a/commit-graph.h b/commit-graph.h
index df9a3b20e4..ae4db87fb5 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -71,9 +71,11 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
  */
 int generation_numbers_enabled(struct repository *r);
 
-#define COMMIT_GRAPH_APPEND     (1 << 0)
-#define COMMIT_GRAPH_PROGRESS   (1 << 1)
-#define COMMIT_GRAPH_SPLIT      (1 << 2)
+enum commit_graph_write_flags {
+	COMMIT_GRAPH_WRITE_APPEND     = (1 << 0),
+	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
+	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2)
+};
 
 struct split_commit_graph_opts {
 	int size_multiple;
@@ -87,12 +89,13 @@ struct split_commit_graph_opts {
  * is not compatible with the commit-graph feature, then the
  * methods will return 0 without writing a commit-graph.
  */
-int write_commit_graph_reachable(const char *obj_dir, unsigned int flags,
+int write_commit_graph_reachable(const char *obj_dir,
+				 enum commit_graph_write_flags flags,
 				 const struct split_commit_graph_opts *split_opts);
 int write_commit_graph(const char *obj_dir,
 		       struct string_list *pack_indexes,
 		       struct string_list *commit_hex,
-		       unsigned int flags,
+		       enum commit_graph_write_flags flags,
 		       const struct split_commit_graph_opts *split_opts);
 
 #define COMMIT_GRAPH_VERIFY_SHALLOW	(1 << 0)
-- 
2.23.0.rc1.309.g896d8c5f5f


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 1/3] t5318-commit-graph: use 'test_expect_code' SZEDER Gábor
  2019-08-05  8:02 ` [PATCH 2/3] commit-graph: turn a group of write-related macro flags into an enum SZEDER Gábor
@ 2019-08-05  8:02 ` SZEDER Gábor
  2019-08-05 13:57   ` Derrick Stolee
  2020-04-03 18:30   ` Jeff King
  2019-08-05 10:14 ` [PATCH 0/3] " SZEDER Gábor
  3 siblings, 2 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, SZEDER Gábor

While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:

  # nonsense
  $ echo not-a-commit-oid | git commit-graph write --stdin-commits
  $ echo $?
  0
  # sometimes I forgot that refs are not good...
  $ echo HEAD | git commit-graph write --stdin-commits
  $ echo $?
  0
  # valid tree OID, but not a commit OID
  $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
  $ echo $?
  0
  $ ls -l .git/objects/info/commit-graph
  ls: cannot access '.git/objects/info/commit-graph': No such file or directory

Check that all input records are indeed valid commit object ids and
return with error otherwise, the same way '--stdin-packs' handles
invalid input; see e103f7276f (commit-graph: return with errors during
write, 2019-06-12).

Note that it should only return with error when encountering an
invalid commit object id coming from standard input.  However,
'--reachable' uses the same code path to process object ids pointed to
by all refs, and that includes tag object ids as well, which should
still be skipped over.  Therefore add a new flag to 'enum
commit_graph_write_flags' and a corresponding field to 'struct
write_commit_graph_context', so we can differentiate between those two
cases.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/commit-graph.c  |  4 +++-
 commit-graph.c          | 29 +++++++++++++++++------------
 commit-graph.h          |  4 +++-
 t/t5318-commit-graph.sh | 11 ++++++++++-
 4 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 64eccde314..57863619b7 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -213,8 +213,10 @@ static int graph_write(int argc, const char **argv)
 
 		if (opts.stdin_packs)
 			pack_indexes = &lines;
-		if (opts.stdin_commits)
+		if (opts.stdin_commits) {
 			commit_hex = &lines;
+			flags |= COMMIT_GRAPH_WRITE_CHECK_OIDS;
+		}
 
 		UNLEAK(buf);
 	}
diff --git a/commit-graph.c b/commit-graph.c
index 04324a4648..821900675b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -782,7 +782,8 @@ struct write_commit_graph_context {
 
 	unsigned append:1,
 		 report_progress:1,
-		 split:1;
+		 split:1,
+		 check_oids:1;
 
 	const struct split_commit_graph_opts *split_opts;
 };
@@ -1193,8 +1194,8 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
 	return 0;
 }
 
-static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
-				      struct string_list *commit_hex)
+static int fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
+				     struct string_list *commit_hex)
 {
 	uint32_t i;
 	struct strbuf progress_title = STRBUF_INIT;
@@ -1215,20 +1216,21 @@ static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
 		struct commit *result;
 
 		display_progress(ctx->progress, i + 1);
-		if (commit_hex->items[i].string &&
-		    parse_oid_hex(commit_hex->items[i].string, &oid, &end))
-			continue;
-
-		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
-
-		if (result) {
+		if (!parse_oid_hex(commit_hex->items[i].string, &oid, &end) &&
+		    (result = lookup_commit_reference_gently(ctx->r, &oid, 1))) {
 			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
 			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
 			ctx->oids.nr++;
+		} else if (ctx->check_oids) {
+			error(_("invalid commit object id: %s"),
+			    commit_hex->items[i].string);
+			return -1;
 		}
 	}
 	stop_progress(&ctx->progress);
 	strbuf_release(&progress_title);
+
+	return 0;
 }
 
 static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
@@ -1775,6 +1777,7 @@ int write_commit_graph(const char *obj_dir,
 	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
 	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
 	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
+	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
 	ctx->split_opts = split_opts;
 
 	if (ctx->split) {
@@ -1829,8 +1832,10 @@ int write_commit_graph(const char *obj_dir,
 			goto cleanup;
 	}
 
-	if (commit_hex)
-		fill_oids_from_commit_hex(ctx, commit_hex);
+	if (commit_hex) {
+		if ((res = fill_oids_from_commit_hex(ctx, commit_hex)))
+			goto cleanup;
+	}
 
 	if (!pack_indexes && !commit_hex)
 		fill_oids_from_all_packs(ctx);
diff --git a/commit-graph.h b/commit-graph.h
index ae4db87fb5..486e64e591 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -74,7 +74,9 @@ int generation_numbers_enabled(struct repository *r);
 enum commit_graph_write_flags {
 	COMMIT_GRAPH_WRITE_APPEND     = (1 << 0),
 	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
-	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2)
+	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
+	/* Make sure that each OID in the input is a valid commit OID. */
+	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
 };
 
 struct split_commit_graph_opts {
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 4391007f4c..ab3eccf0fa 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -23,7 +23,7 @@ test_expect_success 'write graph with no packs' '
 	test_path_is_missing info/commit-graph
 '
 
-test_expect_success 'close with correct error on bad input' '
+test_expect_success 'exit with correct error on bad input to --stdin-packs' '
 	cd "$TRASH_DIRECTORY/full" &&
 	echo doesnotexist >in &&
 	test_expect_code 1 git commit-graph write --stdin-packs <in 2>stderr &&
@@ -40,6 +40,15 @@ test_expect_success 'create commits and repack' '
 	git repack
 '
 
+test_expect_success 'exit with correct error on bad input to --stdin-commits' '
+	cd "$TRASH_DIRECTORY/full" &&
+	echo HEAD | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	test_i18ngrep "invalid commit object id" stderr &&
+	# valid tree OID, but not a commit OID
+	git rev-parse HEAD^{tree} | test_expect_code 1 git commit-graph write --stdin-commits 2>stderr &&
+	test_i18ngrep "invalid commit object id" stderr
+'
+
 graph_git_two_modes() {
 	git -c core.commitGraph=true $1 >output
 	git -c core.commitGraph=false $1 >expect
-- 
2.23.0.rc1.309.g896d8c5f5f


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
                   ` (2 preceding siblings ...)
  2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
@ 2019-08-05 10:14 ` SZEDER Gábor
  3 siblings, 0 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05 10:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Mon, Aug 05, 2019 at 10:02:37AM +0200, SZEDER Gábor wrote:
> While 'git commit-graph write --stdin-commits' expects commit object
> ids as input, it accepts and silently skips over any invalid commit
> object ids, and still exits with success:
> 
>   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
>   $ echo $?

Oh, messed up the copy-paste; this prints 0.

>   $ ls -l .git/objects/info/commit-graph
>   ls: cannot access '.git/objects/info/commit-graph': No such file or directory

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
@ 2019-08-05 13:57   ` Derrick Stolee
  2019-08-05 17:57     ` SZEDER Gábor
  2020-04-03 18:30   ` Jeff King
  1 sibling, 1 reply; 19+ messages in thread
From: Derrick Stolee @ 2019-08-05 13:57 UTC (permalink / raw)
  To: SZEDER Gábor, Junio C Hamano; +Cc: Derrick Stolee, git

On 8/5/2019 4:02 AM, SZEDER Gábor wrote:
> While 'git commit-graph write --stdin-commits' expects commit object
> ids as input, it accepts and silently skips over any invalid commit
> object ids, and still exits with success:
> 
>   # nonsense
>   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # sometimes I forgot that refs are not good...
>   $ echo HEAD | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # valid tree OID, but not a commit OID
>   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   $ ls -l .git/objects/info/commit-graph
>   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> 
> Check that all input records are indeed valid commit object ids and
> return with error otherwise, the same way '--stdin-packs' handles
> invalid input; see e103f7276f (commit-graph: return with errors during
> write, 2019-06-12).

Consistency is good. We should definitely make these modes match.

> Note that it should only return with error when encountering an
> invalid commit object id coming from standard input.  However,
> '--reachable' uses the same code path to process object ids pointed to
> by all refs, and that includes tag object ids as well, which should
> still be skipped over.  Therefore add a new flag to 'enum
> commit_graph_write_flags' and a corresponding field to 'struct
> write_commit_graph_context', so we can differentiate between those two
> cases.

Thank you for the care here.

[snip]
> @@ -1215,20 +1216,21 @@ static void fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
>  		struct commit *result;
>  
>  		display_progress(ctx->progress, i + 1);
> -		if (commit_hex->items[i].string &&
> -		    parse_oid_hex(commit_hex->items[i].string, &oid, &end))
> -			continue;
> -
> -		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
> -
> -		if (result) {
> +		if (!parse_oid_hex(commit_hex->items[i].string, &oid, &end) &&
> +		    (result = lookup_commit_reference_gently(ctx->r, &oid, 1))) {
>  			ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
>  			oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
>  			ctx->oids.nr++;
> +		} else if (ctx->check_oids) {
> +			error(_("invalid commit object id: %s"),
> +			    commit_hex->items[i].string);
> +			return -1;
>  		}
>  	}
>  	stop_progress(&ctx->progress);
>  	strbuf_release(&progress_title);
> +
> +	return 0;
>  }

This is the critical bit. I notice that you are not checking commit_hex->items[i].string
for NULL, but it should never be NULL here anyway.

> @@ -1775,6 +1777,7 @@ int write_commit_graph(const char *obj_dir,
>  	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
>  	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
>  	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
> +	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
>  	ctx->split_opts = split_opts;

Using the enum for the function and the bitfield for internal logic matches the
existing pattern. Thanks.

> @@ -1829,8 +1832,10 @@ int write_commit_graph(const char *obj_dir,
>  			goto cleanup;
>  	}
>  
> -	if (commit_hex)
> -		fill_oids_from_commit_hex(ctx, commit_hex);
> +	if (commit_hex) {
> +		if ((res = fill_oids_from_commit_hex(ctx, commit_hex)))
> +			goto cleanup;
> +	}

And this links the low-level error to a return code.

Thanks for this! The changes here look good and justify the two cleanup
patches.

-Stolee

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2019-08-05 13:57   ` Derrick Stolee
@ 2019-08-05 17:57     ` SZEDER Gábor
  0 siblings, 0 replies; 19+ messages in thread
From: SZEDER Gábor @ 2019-08-05 17:57 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Junio C Hamano, Derrick Stolee, git

On Mon, Aug 05, 2019 at 09:57:19AM -0400, Derrick Stolee wrote:
> On 8/5/2019 4:02 AM, SZEDER Gábor wrote:
> > While 'git commit-graph write --stdin-commits' expects commit object
> > ids as input, it accepts and silently skips over any invalid commit
> > object ids, and still exits with success:
> > 
> >   # nonsense
> >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # sometimes I forgot that refs are not good...
> >   $ echo HEAD | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # valid tree OID, but not a commit OID
> >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   $ ls -l .git/objects/info/commit-graph
> >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > 
> > Check that all input records are indeed valid commit object ids and
> > return with error otherwise, the same way '--stdin-packs' handles
> > invalid input; see e103f7276f (commit-graph: return with errors during
> > write, 2019-06-12).
> 
> Consistency is good. We should definitely make these modes match.

I was also wondering whether it would be worth accepting refs as well,
either as DWIMery or only when a '--revs' option is given (similar to
'git pack-objects --revs').  Dunno, I'm a bit hesitant about always
accepting refs as a DWIMery, this is plumbing after all.  And I don't
really care whether I correct my bogus command by replacing 'echo'
with 'git rev-parse' or by adding a '--revs' argument; the important
thing is that the command should tell me that I gave it junk.  And
that would be a new feature, while this patch is a bugfix IMO.

> > Note that it should only return with error when encountering an
> > invalid commit object id coming from standard input.  However,
> > '--reachable' uses the same code path to process object ids pointed to
> > by all refs, and that includes tag object ids as well, which should
> > still be skipped over.  Therefore add a new flag to 'enum
> > commit_graph_write_flags' and a corresponding field to 'struct
> > write_commit_graph_context', so we can differentiate between those two
> > cases.
> 
> Thank you for the care here.

Well, to be honest, I wasn't careful... :)  but running the test suite
with GIT_TEST_COMMIT_GRAPH=1 resulted in about a dozen failed test
scripts that traced back to this.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
  2019-08-05 13:57   ` Derrick Stolee
@ 2020-04-03 18:30   ` Jeff King
  2020-04-03 18:49     ` Taylor Blau
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2020-04-03 18:30 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, Junio C Hamano, Derrick Stolee, git

On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:

> While 'git commit-graph write --stdin-commits' expects commit object
> ids as input, it accepts and silently skips over any invalid commit
> object ids, and still exits with success:
> 
>   # nonsense
>   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # sometimes I forgot that refs are not good...
>   $ echo HEAD | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   # valid tree OID, but not a commit OID
>   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
>   $ echo $?
>   0
>   $ ls -l .git/objects/info/commit-graph
>   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> 
> Check that all input records are indeed valid commit object ids and
> return with error otherwise, the same way '--stdin-packs' handles
> invalid input; see e103f7276f (commit-graph: return with errors during
> write, 2019-06-12).

Can you explain more why the old behavior is a problem? For reasons (see
below), we want to do something like:

  git for-each-ref --format='%(objectname)' |
  git commit-graph write --stdin-commits

In v2.23 and earlier, that worked exactly like --reachable, but now it
will blow up if there are any refs that point to a non-commit (e.g., a
tag of a blob).

It can be worked around by asking for %(objecttype) and %(*objecttype)
and grepping the result, but that's awkward and much less efficient
(especially if you have a lot of annotated tags, as we may have to open
and parse each one).

Now obviously you could just use --reachable for the code above. But
here are two plausible cases where you might not want to do that:

 - you're limiting the graph to only a subset of refs (e.g., you want to
   graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).

 - you're generating an incremental graph update. You know somehow that
   a few refs were updated, and you want to feed those tips to generate
   the incremental, but not the rest of the refs (not because it would
   be wrong to do so, but in the name of keeping it O(size of change)
   and not O(number of refs in the repo).

The latter is the actual case that bit us. I suppose one could do
something like:

  git rev-list --no-walk <maybe-commits |
  git commit-graph write --stdin-commits

to use rev-list as a filter, but that feels kind of baroque.

Normally I'm in favor of more error checking instead of less, but in
this case it feels like it's putting scripted use at a disadvantage
versus the internal code (e.g., the auto-write for git-fetch uses the
"--reachable" semantics for its internal invocation).

-Peff

PS As an aside, I think the internal git-fetch write could benefit from
   this same trick: feed the set of newly-stored ref tips to the
   commit-graph machinery, rather than using for_each_ref().

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 18:30   ` Jeff King
@ 2020-04-03 18:49     ` Taylor Blau
  2020-04-03 19:38       ` SZEDER Gábor
  2020-04-03 19:47       ` Junio C Hamano
  0 siblings, 2 replies; 19+ messages in thread
From: Taylor Blau @ 2020-04-03 18:49 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Taylor Blau, Junio C Hamano, Derrick Stolee,
	git

On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
>
> > While 'git commit-graph write --stdin-commits' expects commit object
> > ids as input, it accepts and silently skips over any invalid commit
> > object ids, and still exits with success:
> >
> >   # nonsense
> >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # sometimes I forgot that refs are not good...
> >   $ echo HEAD | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   # valid tree OID, but not a commit OID
> >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> >   $ echo $?
> >   0
> >   $ ls -l .git/objects/info/commit-graph
> >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> >
> > Check that all input records are indeed valid commit object ids and
> > return with error otherwise, the same way '--stdin-packs' handles
> > invalid input; see e103f7276f (commit-graph: return with errors during
> > write, 2019-06-12).
>
> Can you explain more why the old behavior is a problem? For reasons (see
> below), we want to do something like:
>
>   git for-each-ref --format='%(objectname)' |
>   git commit-graph write --stdin-commits
>
> In v2.23 and earlier, that worked exactly like --reachable, but now it
> will blow up if there are any refs that point to a non-commit (e.g., a
> tag of a blob).
>
> It can be worked around by asking for %(objecttype) and %(*objecttype)
> and grepping the result, but that's awkward and much less efficient
> (especially if you have a lot of annotated tags, as we may have to open
> and parse each one).
>
> Now obviously you could just use --reachable for the code above. But
> here are two plausible cases where you might not want to do that:
>
>  - you're limiting the graph to only a subset of refs (e.g., you want to
>    graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).
>
>  - you're generating an incremental graph update. You know somehow that
>    a few refs were updated, and you want to feed those tips to generate
>    the incremental, but not the rest of the refs (not because it would
>    be wrong to do so, but in the name of keeping it O(size of change)
>    and not O(number of refs in the repo).
>
> The latter is the actual case that bit us. I suppose one could do
> something like:
>
>   git rev-list --no-walk <maybe-commits |
>   git commit-graph write --stdin-commits
>
> to use rev-list as a filter, but that feels kind of baroque.
>
> Normally I'm in favor of more error checking instead of less, but in
> this case it feels like it's putting scripted use at a disadvantage
> versus the internal code (e.g., the auto-write for git-fetch uses the
> "--reachable" semantics for its internal invocation).

For what it's worth, (and in case it wasn't obvious) this came about
because we feed '--stdin-commits' at GitHub, and observed exactly this
error case. I wasn't sure what approach would be more palatable, so I
prepared both in my fork at https://github.com/ttaylorr/git:

  - Branch 'tb/commit-graph-dont-check-oids' drops this checking
    entirely.

  - Branch 'tb/commit-graph-check-oids-option' adds a
    '--[no-]check-oids', in case that this is generally desirable
    behavior, by offering an opt-out of this OID checking.

Please let me know if you find either of these to be good options, and
I'll happily send one of them to the list. Thanks.

> -Peff
>
> PS As an aside, I think the internal git-fetch write could benefit from
>    this same trick: feed the set of newly-stored ref tips to the
>    commit-graph machinery, rather than using for_each_ref().

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 18:49     ` Taylor Blau
@ 2020-04-03 19:38       ` SZEDER Gábor
  2020-04-03 19:51         ` Jeff King
  2020-04-03 19:55         ` Taylor Blau
  2020-04-03 19:47       ` Junio C Hamano
  1 sibling, 2 replies; 19+ messages in thread
From: SZEDER Gábor @ 2020-04-03 19:38 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, Junio C Hamano, Derrick Stolee, git

On Fri, Apr 03, 2020 at 12:49:33PM -0600, Taylor Blau wrote:
> On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> > On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
> >
> > > While 'git commit-graph write --stdin-commits' expects commit object
> > > ids as input, it accepts and silently skips over any invalid commit
> > > object ids, and still exits with success:
> > >
> > >   # nonsense
> > >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   # sometimes I forgot that refs are not good...
> > >   $ echo HEAD | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   # valid tree OID, but not a commit OID
> > >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   $ ls -l .git/objects/info/commit-graph
> > >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > >
> > > Check that all input records are indeed valid commit object ids and
> > > return with error otherwise, the same way '--stdin-packs' handles
> > > invalid input; see e103f7276f (commit-graph: return with errors during
> > > write, 2019-06-12).
> >
> > Can you explain more why the old behavior is a problem?

Because when I do:

   # sometimes I forgot that refs are not good...
   $ echo HEAD | git commit-graph write --stdin-commits

then I get _nothing_: neither an error, nor a commit-graph.

> > For reasons (see
> > below), we want to do something like:
> >
> >   git for-each-ref --format='%(objectname)' |
> >   git commit-graph write --stdin-commits
> >
> > In v2.23 and earlier, that worked exactly like --reachable, but now it
> > will blow up if there are any refs that point to a non-commit (e.g., a
> > tag of a blob).
> >
> > It can be worked around by asking for %(objecttype) and %(*objecttype)
> > and grepping the result, but that's awkward and much less efficient
> > (especially if you have a lot of annotated tags, as we may have to open
> > and parse each one).
> >
> > Now obviously you could just use --reachable for the code above. But
> > here are two plausible cases where you might not want to do that:
> >
> >  - you're limiting the graph to only a subset of refs (e.g., you want to
> >    graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).
> >
> >  - you're generating an incremental graph update. You know somehow that
> >    a few refs were updated, and you want to feed those tips to generate
> >    the incremental, but not the rest of the refs (not because it would
> >    be wrong to do so, but in the name of keeping it O(size of change)
> >    and not O(number of refs in the repo).
> >
> > The latter is the actual case that bit us. I suppose one could do
> > something like:
> >
> >   git rev-list --no-walk <maybe-commits |
> >   git commit-graph write --stdin-commits
> >
> > to use rev-list as a filter, but that feels kind of baroque.
> >
> > Normally I'm in favor of more error checking instead of less, but in
> > this case it feels like it's putting scripted use at a disadvantage
> > versus the internal code (e.g., the auto-write for git-fetch uses the
> > "--reachable" semantics for its internal invocation).
> 
> For what it's worth, (and in case it wasn't obvious) this came about
> because we feed '--stdin-commits' at GitHub, and observed exactly this
> error case. I wasn't sure what approach would be more palatable, so I
> prepared both in my fork at https://github.com/ttaylorr/git:
> 
>   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
>     entirely.

Oh, no :)

>   - Branch 'tb/commit-graph-check-oids-option' adds a
>     '--[no-]check-oids', in case that this is generally desirable
>     behavior, by offering an opt-out of this OID checking.
> 
> Please let me know if you find either of these to be good options, and
> I'll happily send one of them to the list. Thanks.

Or introduce 'git commit-graph write --stdin-refs'?  Or teach
'--stdin-commits' to DWIM and accept and parse refs?  Though the
question still remains what to do with refs that can't be peeled back
to commits

> > -Peff
> >
> > PS As an aside, I think the internal git-fetch write could benefit from
> >    this same trick: feed the set of newly-stored ref tips to the
> >    commit-graph machinery, rather than using for_each_ref().
> 
> Thanks,
> Taylor

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 18:49     ` Taylor Blau
  2020-04-03 19:38       ` SZEDER Gábor
@ 2020-04-03 19:47       ` Junio C Hamano
  2020-04-03 19:57         ` Taylor Blau
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2020-04-03 19:47 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, SZEDER Gábor, Derrick Stolee, git

Taylor Blau <me@ttaylorr.com> writes:

> On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
>> On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
>>
>> > Check that all input records are indeed valid commit object ids and
>> > return with error otherwise, the same way '--stdin-packs' handles
>> > invalid input; see e103f7276f (commit-graph: return with errors during
>> > write, 2019-06-12).
>>
>> Can you explain more why the old behavior is a problem? For reasons (see
>> below), we want to do something like:
>>
>>   git for-each-ref --format='%(objectname)' |
>>   git commit-graph write --stdin-commits
>> ...
>>  - you're generating an incremental graph update. You know somehow that
>>    a few refs were updated, and you want to feed those tips to generate
>>    the incremental, but not the rest of the refs (not because it would
>>    be wrong to do so, but in the name of keeping it O(size of change)
>>    and not O(number of refs in the repo).
>> ...
>> Normally I'm in favor of more error checking instead of less, but in
>> this case it feels like it's putting scripted use at a disadvantage
>> versus the internal code (e.g., the auto-write for git-fetch uses the
>> "--reachable" semantics for its internal invocation).

I think the "incremental from the tip of refs" is a valid and useful
use case.  I am not sure if the rationale given in the original to
compare the (stricter) check done here and what e103f7276f did
(which does not seem to get any input, valid or invalid, from the
end users) was a meaningful comparison, and regardless of Gábor's
answer to Peff's question, I think we should have an easy way to let
the machinery itself filter non-commit, so "--[no-]check-oids" that
optionally turns the stricter checking off would be an easy way out.
I do not have a strong opinion on which way the default should be
(at least not yet), but given that this was already in two major
releases ago, I'd assume that stricter-by-default-with-escape-hatch
would be the way to go (at least for now).

> For what it's worth, (and in case it wasn't obvious) this came about
> because we feed '--stdin-commits' at GitHub, and observed exactly this
> error case. I wasn't sure what approach would be more palatable, so I
> prepared both in my fork at https://github.com/ttaylorr/git:
>
>   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
>     entirely.
>
>   - Branch 'tb/commit-graph-check-oids-option' adds a
>     '--[no-]check-oids', in case that this is generally desirable
>     behavior, by offering an opt-out of this OID checking.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 19:38       ` SZEDER Gábor
@ 2020-04-03 19:51         ` Jeff King
  2020-04-03 20:40           ` SZEDER Gábor
  2020-04-03 19:55         ` Taylor Blau
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2020-04-03 19:51 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, Junio C Hamano, Derrick Stolee, git

On Fri, Apr 03, 2020 at 09:38:42PM +0200, SZEDER Gábor wrote:

> > > Can you explain more why the old behavior is a problem?
> 
> Because when I do:
> 
>    # sometimes I forgot that refs are not good...
>    $ echo HEAD | git commit-graph write --stdin-commits
> 
> then I get _nothing_: neither an error, nor a commit-graph.

OK, that makes more sense: it's an input format error, because we only
take hex oids.

Do you care about complaining about:

  git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits

? That's the case that's much more interesting, I think.

> Or introduce 'git commit-graph write --stdin-refs'?  Or teach
> '--stdin-commits' to DWIM and accept and parse refs?  Though the
> question still remains what to do with refs that can't be peeled back
> to commits

Right. I think there are two orthogonal questions:

  - whether to resolve arbitrary names to objects and how to handle such
    input if we don't

  - what to do with an oid (whether given as hex or resolved from a
    name) that isn't a commit-ish

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 19:38       ` SZEDER Gábor
  2020-04-03 19:51         ` Jeff King
@ 2020-04-03 19:55         ` Taylor Blau
  1 sibling, 0 replies; 19+ messages in thread
From: Taylor Blau @ 2020-04-03 19:55 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Taylor Blau, Jeff King, Junio C Hamano, Derrick Stolee, git

On Fri, Apr 03, 2020 at 09:38:42PM +0200, SZEDER Gábor wrote:
> On Fri, Apr 03, 2020 at 12:49:33PM -0600, Taylor Blau wrote:
> > On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> > > On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
> > >
> > > > While 'git commit-graph write --stdin-commits' expects commit object
> > > > ids as input, it accepts and silently skips over any invalid commit
> > > > object ids, and still exits with success:
> > > >
> > > >   # nonsense
> > > >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   # sometimes I forgot that refs are not good...
> > > >   $ echo HEAD | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   # valid tree OID, but not a commit OID
> > > >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > > >   $ echo $?
> > > >   0
> > > >   $ ls -l .git/objects/info/commit-graph
> > > >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > > >
> > > > Check that all input records are indeed valid commit object ids and
> > > > return with error otherwise, the same way '--stdin-packs' handles
> > > > invalid input; see e103f7276f (commit-graph: return with errors during
> > > > write, 2019-06-12).
> > >
> > > Can you explain more why the old behavior is a problem?
>
> Because when I do:
>
>    # sometimes I forgot that refs are not good...
>    $ echo HEAD | git commit-graph write --stdin-commits
>
> then I get _nothing_: neither an error, nor a commit-graph.
>
> > > For reasons (see
> > > below), we want to do something like:
> > >
> > >   git for-each-ref --format='%(objectname)' |
> > >   git commit-graph write --stdin-commits
> > >
> > > In v2.23 and earlier, that worked exactly like --reachable, but now it
> > > will blow up if there are any refs that point to a non-commit (e.g., a
> > > tag of a blob).
> > >
> > > It can be worked around by asking for %(objecttype) and %(*objecttype)
> > > and grepping the result, but that's awkward and much less efficient
> > > (especially if you have a lot of annotated tags, as we may have to open
> > > and parse each one).
> > >
> > > Now obviously you could just use --reachable for the code above. But
> > > here are two plausible cases where you might not want to do that:
> > >
> > >  - you're limiting the graph to only a subset of refs (e.g., you want to
> > >    graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).
> > >
> > >  - you're generating an incremental graph update. You know somehow that
> > >    a few refs were updated, and you want to feed those tips to generate
> > >    the incremental, but not the rest of the refs (not because it would
> > >    be wrong to do so, but in the name of keeping it O(size of change)
> > >    and not O(number of refs in the repo).
> > >
> > > The latter is the actual case that bit us. I suppose one could do
> > > something like:
> > >
> > >   git rev-list --no-walk <maybe-commits |
> > >   git commit-graph write --stdin-commits
> > >
> > > to use rev-list as a filter, but that feels kind of baroque.
> > >
> > > Normally I'm in favor of more error checking instead of less, but in
> > > this case it feels like it's putting scripted use at a disadvantage
> > > versus the internal code (e.g., the auto-write for git-fetch uses the
> > > "--reachable" semantics for its internal invocation).
> >
> > For what it's worth, (and in case it wasn't obvious) this came about
> > because we feed '--stdin-commits' at GitHub, and observed exactly this
> > error case. I wasn't sure what approach would be more palatable, so I
> > prepared both in my fork at https://github.com/ttaylorr/git:
> >
> >   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
> >     entirely.
>
> Oh, no :)

Heh, I figured as much when Peff pointed me towards this thread ;-).

> >   - Branch 'tb/commit-graph-check-oids-option' adds a
> >     '--[no-]check-oids', in case that this is generally desirable
> >     behavior, by offering an opt-out of this OID checking.
> >
> > Please let me know if you find either of these to be good options, and
> > I'll happily send one of them to the list. Thanks.
>
> Or introduce 'git commit-graph write --stdin-refs'?  Or teach
> '--stdin-commits' to DWIM and accept and parse refs?  Though the
> question still remains what to do with refs that can't be peeled back
> to commits

How would '--stdin-refs' behave differently? I assume that it'd handle
going from 'refs/heads/blah' to an OID, but I think that's only kicking
the problem a little further down the road. Of course, you note the same
problem which is: what do we do when we can't resolve down to a commit
object.

I think that it seems the '--[no-]check-oids' may be the best of both
worlds, which allows for (mostly ;)) easy scripting around 'git
for-each-ref'.

> > > -Peff
> > >
> > > PS As an aside, I think the internal git-fetch write could benefit from
> > >    this same trick: feed the set of newly-stored ref tips to the
> > >    commit-graph machinery, rather than using for_each_ref().
> >
> > Thanks,
> > Taylor
Thanks,
Taylor

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 19:47       ` Junio C Hamano
@ 2020-04-03 19:57         ` Taylor Blau
  0 siblings, 0 replies; 19+ messages in thread
From: Taylor Blau @ 2020-04-03 19:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Jeff King, SZEDER Gábor, Derrick Stolee, git

On Fri, Apr 03, 2020 at 12:47:03PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> >> On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
> >>
> >> > Check that all input records are indeed valid commit object ids and
> >> > return with error otherwise, the same way '--stdin-packs' handles
> >> > invalid input; see e103f7276f (commit-graph: return with errors during
> >> > write, 2019-06-12).
> >>
> >> Can you explain more why the old behavior is a problem? For reasons (see
> >> below), we want to do something like:
> >>
> >>   git for-each-ref --format='%(objectname)' |
> >>   git commit-graph write --stdin-commits
> >> ...
> >>  - you're generating an incremental graph update. You know somehow that
> >>    a few refs were updated, and you want to feed those tips to generate
> >>    the incremental, but not the rest of the refs (not because it would
> >>    be wrong to do so, but in the name of keeping it O(size of change)
> >>    and not O(number of refs in the repo).
> >> ...
> >> Normally I'm in favor of more error checking instead of less, but in
> >> this case it feels like it's putting scripted use at a disadvantage
> >> versus the internal code (e.g., the auto-write for git-fetch uses the
> >> "--reachable" semantics for its internal invocation).
>
> I think the "incremental from the tip of refs" is a valid and useful
> use case.  I am not sure if the rationale given in the original to
> compare the (stricter) check done here and what e103f7276f did
> (which does not seem to get any input, valid or invalid, from the
> end users) was a meaningful comparison, and regardless of Gábor's
> answer to Peff's question, I think we should have an easy way to let
> the machinery itself filter non-commit, so "--[no-]check-oids" that
> optionally turns the stricter checking off would be an easy way out.

Thanks. I think that this is probably my preference, too. I'll send this
as a patch to the list shortly...

> I do not have a strong opinion on which way the default should be
> (at least not yet), but given that this was already in two major
> releases ago, I'd assume that stricter-by-default-with-escape-hatch
> would be the way to go (at least for now).

Yeah. I think the nice thing about those patches is that we don't have
to decide that right away.

> > For what it's worth, (and in case it wasn't obvious) this came about
> > because we feed '--stdin-commits' at GitHub, and observed exactly this
> > error case. I wasn't sure what approach would be more palatable, so I
> > prepared both in my fork at https://github.com/ttaylorr/git:
> >
> >   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
> >     entirely.
> >
> >   - Branch 'tb/commit-graph-check-oids-option' adds a
> >     '--[no-]check-oids', in case that this is generally desirable
> >     behavior, by offering an opt-out of this OID checking.
>
> Thanks.

Likewise :).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 19:51         ` Jeff King
@ 2020-04-03 20:40           ` SZEDER Gábor
  2020-04-03 23:10             ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: SZEDER Gábor @ 2020-04-03 20:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Junio C Hamano, Derrick Stolee, git

On Fri, Apr 03, 2020 at 03:51:03PM -0400, Jeff King wrote:
> On Fri, Apr 03, 2020 at 09:38:42PM +0200, SZEDER Gábor wrote:
> 
> > > > Can you explain more why the old behavior is a problem?
> > 
> > Because when I do:
> > 
> >    # sometimes I forgot that refs are not good...
> >    $ echo HEAD | git commit-graph write --stdin-commits
> > 
> > then I get _nothing_: neither an error, nor a commit-graph.
> 
> OK, that makes more sense: it's an input format error, because we only
> take hex oids.
> 
> Do you care about complaining about:
> 
>   git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> 
> ? That's the case that's much more interesting, I think.

Hm, are you trying to go in the direction where '--stdin-commits'
would keep erroring out on any non-full-hex-oid, but would accept and
silently ignore any hex oids that are not commits (perhaps even when
there is no such object, dunno)?  I think that would support the use
cases you mentioned, while it would still save me when I do the 'echo
<ref>' thing (somehow I regularly do that, remember doing it the day
before yesterday!).

I only mentioned the ^{tree} form in the commit message for the sake
of completeness, i.e. to show various cases where the user would get
neither error nor commit-graph.

> > Or introduce 'git commit-graph write --stdin-refs'?  Or teach
> > '--stdin-commits' to DWIM and accept and parse refs?  Though the
> > question still remains what to do with refs that can't be peeled back
> > to commits
> 
> Right. I think there are two orthogonal questions:
> 
>   - whether to resolve arbitrary names to objects and how to handle such
>     input if we don't
> 
>   - what to do with an oid (whether given as hex or resolved from a
>     name) that isn't a commit-ish
> 
> -Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 20:40           ` SZEDER Gábor
@ 2020-04-03 23:10             ` Jeff King
  2020-04-13 19:39               ` Taylor Blau
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff King @ 2020-04-03 23:10 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, Junio C Hamano, Derrick Stolee, git

On Fri, Apr 03, 2020 at 10:40:13PM +0200, SZEDER Gábor wrote:

> > Do you care about complaining about:
> > 
> >   git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > 
> > ? That's the case that's much more interesting, I think.
> 
> Hm, are you trying to go in the direction where '--stdin-commits'
> would keep erroring out on any non-full-hex-oid, but would accept and
> silently ignore any hex oids that are not commits (perhaps even when
> there is no such object, dunno)?  I think that would support the use
> cases you mentioned, while it would still save me when I do the 'echo
> <ref>' thing (somehow I regularly do that, remember doing it the day
> before yesterday!).

Yes, exactly. The case you care about and the case I care about are
different ones, so there's no inherent conflict between them.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-03 23:10             ` Jeff King
@ 2020-04-13 19:39               ` Taylor Blau
  2020-04-13 21:25                 ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: Taylor Blau @ 2020-04-13 19:39 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Taylor Blau, Junio C Hamano, Derrick Stolee,
	git

On Fri, Apr 03, 2020 at 07:10:21PM -0400, Jeff King wrote:
> On Fri, Apr 03, 2020 at 10:40:13PM +0200, SZEDER Gábor wrote:
>
> > > Do you care about complaining about:
> > >
> > >   git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > >
> > > ? That's the case that's much more interesting, I think.
> >
> > Hm, are you trying to go in the direction where '--stdin-commits'
> > would keep erroring out on any non-full-hex-oid, but would accept and
> > silently ignore any hex oids that are not commits (perhaps even when
> > there is no such object, dunno)?  I think that would support the use
> > cases you mentioned, while it would still save me when I do the 'echo
> > <ref>' thing (somehow I regularly do that, remember doing it the day
> > before yesterday!).
>
> Yes, exactly. The case you care about and the case I care about are
> different ones, so there's no inherent conflict between them.

I was looking back again at this today, and I think we need something
more or less like the following on top. I'll send it out later today or
early tomorrow...

diff --git a/commit-graph.c b/commit-graph.c
index 2d436907cd..58e7890590 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1221,17 +1221,24 @@ static int fill_oids_from_commit_hex(struct write_commit_graph_context *ctx,
                                        commit_hex->nr);
        }
        for (i = 0; i < commit_hex->nr; i++) {
+               int ret;
                const char *end;
                struct object_id oid;
                struct commit *result;

                display_progress(ctx->progress, i + 1);
-               if (!parse_oid_hex(commit_hex->items[i].string, &oid, &end) &&
-                   (result = lookup_commit_reference_gently(ctx->r, &oid, 1))) {
+
+               ret = parse_oid_hex(commit_hex->items[i].string, &oid, &end);
+               if (!ret) {
+                   result = lookup_commit_reference_gently(ctx->r, &oid, 1);
+                   if (result) {
                        ALLOC_GROW(ctx->oids.list, ctx->oids.nr + 1, ctx->oids.alloc);
                        oidcpy(&ctx->oids.list[ctx->oids.nr], &(result->object.oid));
                        ctx->oids.nr++;
-               } else if (ctx->check_oids) {
+                   }
+               }
+
+               if (ret || (ctx->check_oids && !result)) {
                        error(_("invalid commit object id: %s"),
                            commit_hex->items[i].string);
                        return -1;

> -Peff

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-13 19:39               ` Taylor Blau
@ 2020-04-13 21:25                 ` Jeff King
  2020-04-14  2:04                   ` Taylor Blau
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff King @ 2020-04-13 21:25 UTC (permalink / raw)
  To: Taylor Blau; +Cc: SZEDER Gábor, Junio C Hamano, Derrick Stolee, git

On Mon, Apr 13, 2020 at 01:39:34PM -0600, Taylor Blau wrote:

> > > Hm, are you trying to go in the direction where '--stdin-commits'
> > > would keep erroring out on any non-full-hex-oid, but would accept and
> > > silently ignore any hex oids that are not commits (perhaps even when
> > > there is no such object, dunno)?  I think that would support the use
> > > cases you mentioned, while it would still save me when I do the 'echo
> > > <ref>' thing (somehow I regularly do that, remember doing it the day
> > > before yesterday!).
> >
> > Yes, exactly. The case you care about and the case I care about are
> > different ones, so there's no inherent conflict between them.
> 
> I was looking back again at this today, and I think we need something
> more or less like the following on top. I'll send it out later today or
> early tomorrow...

Yes, the behavior there looks fine to me. Though you may want to catch
the parse_oid_hex() separately and return its own error message. Telling
the user "I don't understand non-hex object names" instead of just
"invalid commit object" may be useful. I think it would also make the
flow of the function easier to follow.

If we were writing from scratch, I'd actually suggest that
builtin/commit-graph.c do parse_oid_hex() call as we read lines, and
then commit-graph could just be working with an oid_array or oidset,
which would reduce overall memory usage. I don't know if that would
cause other complications, but it could be worth looking into.

> +               if (ret || (ctx->check_oids && !result)) {
>                         error(_("invalid commit object id: %s"),
>                             commit_hex->items[i].string);
>                         return -1;

We could also take this a step further and just ditch check_oids
entirely (under the assumption that nobody really wants it; they just
wanted to catch bad names in the first place).

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
  2020-04-13 21:25                 ` Jeff King
@ 2020-04-14  2:04                   ` Taylor Blau
  0 siblings, 0 replies; 19+ messages in thread
From: Taylor Blau @ 2020-04-14  2:04 UTC (permalink / raw)
  To: Jeff King
  Cc: Taylor Blau, SZEDER Gábor, Junio C Hamano, Derrick Stolee,
	git

On Mon, Apr 13, 2020 at 05:25:06PM -0400, Jeff King wrote:
> On Mon, Apr 13, 2020 at 01:39:34PM -0600, Taylor Blau wrote:
>
> > > > Hm, are you trying to go in the direction where '--stdin-commits'
> > > > would keep erroring out on any non-full-hex-oid, but would accept and
> > > > silently ignore any hex oids that are not commits (perhaps even when
> > > > there is no such object, dunno)?  I think that would support the use
> > > > cases you mentioned, while it would still save me when I do the 'echo
> > > > <ref>' thing (somehow I regularly do that, remember doing it the day
> > > > before yesterday!).
> > >
> > > Yes, exactly. The case you care about and the case I care about are
> > > different ones, so there's no inherent conflict between them.
> >
> > I was looking back again at this today, and I think we need something
> > more or less like the following on top. I'll send it out later today or
> > early tomorrow...
>
> Yes, the behavior there looks fine to me. Though you may want to catch
> the parse_oid_hex() separately and return its own error message. Telling
> the user "I don't understand non-hex object names" instead of just
> "invalid commit object" may be useful. I think it would also make the
> flow of the function easier to follow.
>
> If we were writing from scratch, I'd actually suggest that
> builtin/commit-graph.c do parse_oid_hex() call as we read lines, and
> then commit-graph could just be working with an oid_array or oidset,
> which would reduce overall memory usage. I don't know if that would
> cause other complications, but it could be worth looking into.

It actually improved the situation quite a bit, so thanks for the
suggestion. In addition refactoring it was quite a lot of fun. The
second-order benefit was that it moves these two failure modes into
separate parts of the code.

Converting the user input to an OID (and thus dealing with input like
"HEAD", "refs/tags/foo", and etc.) is a separate part from turning those
OIDs into 'struct commit *'s. So the "non-OID input" and "this OID
doesn't refer to a commit" checks can be moved into the builtin and
library machinery separately, which is quite handy.

The patch is too large to send here inline (there's a lot of noise from
renaming 'commit_hex' to 'commits'), but I'll include it in my
commit-graphs backlog series shortly.

> > +               if (ret || (ctx->check_oids && !result)) {
> >                         error(_("invalid commit object id: %s"),
> >                             commit_hex->items[i].string);
> >                         return -1;
>
> We could also take this a step further and just ditch check_oids
> entirely (under the assumption that nobody really wants it; they just
> wanted to catch bad names in the first place).

I'd rather let this shake out in a discussion of the patch once I send
it, since I'm not sure how people feel in general.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-04-14  2:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
2019-08-05  8:02 ` [PATCH 1/3] t5318-commit-graph: use 'test_expect_code' SZEDER Gábor
2019-08-05  8:02 ` [PATCH 2/3] commit-graph: turn a group of write-related macro flags into an enum SZEDER Gábor
2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
2019-08-05 13:57   ` Derrick Stolee
2019-08-05 17:57     ` SZEDER Gábor
2020-04-03 18:30   ` Jeff King
2020-04-03 18:49     ` Taylor Blau
2020-04-03 19:38       ` SZEDER Gábor
2020-04-03 19:51         ` Jeff King
2020-04-03 20:40           ` SZEDER Gábor
2020-04-03 23:10             ` Jeff King
2020-04-13 19:39               ` Taylor Blau
2020-04-13 21:25                 ` Jeff King
2020-04-14  2:04                   ` Taylor Blau
2020-04-03 19:55         ` Taylor Blau
2020-04-03 19:47       ` Junio C Hamano
2020-04-03 19:57         ` Taylor Blau
2019-08-05 10:14 ` [PATCH 0/3] " SZEDER Gábor

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).