git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] hooks: fix a race in hook execution
@ 2022-02-18 20:43 Ævar Arnfjörð Bjarmason
  2022-02-18 20:43 ` [PATCH 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-18 20:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

Follow-up on the recently landed c70bc338e9a (Merge branch
'ab/config-based-hooks-2', 2022-02-09). This fixes an obscure race
condition in how we execute a few hooks, because we'd run the hook,
and then check if the hook existed.

We could thus skip some post-hook logic if a hook was run, but we
raced with the hook being removed from the repository.

As 2/2 notes being worried about that isn't very realistic, but it
makes sense to have the hook API expose a "did I run it?" as part of
its API, so let's add such a flag, and use it in those cases.

The 2/2 has been on-list before as part of a much bigger hook topic
submission[1]. I fixed up a few things in it, and added 1/2. The
range-diff below is to that previous submission.

1. https://lore.kernel.org/git/patch-v5-36.36-fe056098534-20210902T125111Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (2):
  merge: don't run post-hook logic on --no-verify
  hooks: fix a TOCTOU in "did we run a hook?" heuristic

 builtin/commit.c       | 18 +++++++++++-------
 builtin/merge.c        | 28 +++++++++++++++++-----------
 builtin/receive-pack.c |  8 +++++---
 commit.c               |  2 +-
 commit.h               |  3 ++-
 hook.c                 |  7 +++++++
 hook.h                 |  9 +++++++++
 sequencer.c            |  4 ++--
 8 files changed, 54 insertions(+), 25 deletions(-)

Range-diff:
-:  ----------- > 1:  9b5144daee6 merge: don't run post-hook logic on --no-verify
1:  fe056098534 ! 2:  d01d088073b hooks: fix a TOCTOU in "did we run a hook?" heuristic
    @@ builtin/commit.c: static int prepare_to_commit(const char *index_file, const cha
      	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
      	int old_display_comment_prefix;
      	int merge_contains_scissors = 0;
    -+	int invoked_hook = 0;
    ++	int invoked_hook;
      
      	/* This checks and barfs if author is badly specified */
      	determine_author_info(author_ident);
    @@ builtin/commit.c: int cmd_commit(int argc, const char **argv, const char *prefix
     
      ## builtin/merge.c ##
     @@ builtin/merge.c: static void prepare_to_commit(struct commit_list *remoteheads)
    - {
    - 	struct strbuf msg = STRBUF_INIT;
      	const char *index_file = get_index_file();
    -+	int invoked_hook = 0;
      
    --	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
    -+	if (!no_verify && run_commit_hook(0 < option_edit, index_file,
    -+					  &invoked_hook, "pre-merge-commit",
    -+					  NULL))
    - 		abort_commit(remoteheads, NULL);
    - 	/*
    --	 * Re-read the index as pre-merge-commit hook could have updated it,
    --	 * and write it out as a tree.  We must do this before we invoke
    -+	 * Re-read the index as the pre-merge-commit hook was invoked
    -+	 * and could have updated it. We must do this before we invoke
    - 	 * the editor and after we invoke run_status above.
    - 	 */
    --	if (hook_exists("pre-merge-commit"))
    -+	if (invoked_hook)
    - 		discard_cache();
    + 	if (!no_verify) {
    +-		if (run_commit_hook(0 < option_edit, index_file,
    ++		int invoked_hook;
    ++
    ++		if (run_commit_hook(0 < option_edit, index_file, &invoked_hook,
    + 				    "pre-merge-commit", NULL))
    + 			abort_commit(remoteheads, NULL);
    + 		/*
    +@@ builtin/merge.c: static void prepare_to_commit(struct commit_list *remoteheads)
    + 		 * and write it out as a tree.  We must do this before we invoke
    + 		 * the editor and after we invoke run_status above.
    + 		 */
    +-		if (hook_exists("pre-merge-commit"))
    ++		if (invoked_hook)
    + 			discard_cache();
    + 	}
      	read_cache_from(index_file);
    - 	strbuf_addbuf(&msg, &merge_msg);
     @@ builtin/merge.c: static void prepare_to_commit(struct commit_list *remoteheads)
      		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
      	write_merge_heads(remoteheads);
    @@ builtin/receive-pack.c: static const char *push_to_deploy(unsigned char *sha1,
      	strvec_pushv(&opt.env, env->v);
     @@ builtin/receive-pack.c: static const char *update_worktree(unsigned char *sha1, const struct worktree *w
      {
    - 	const char *retval, *work_tree, *git_dir = NULL;
    + 	const char *retval, *git_dir;
      	struct strvec env = STRVEC_INIT;
    -+	int invoked_hook = 0;
    ++	int invoked_hook;
      
    - 	if (worktree && worktree->path)
    - 		work_tree = worktree->path;
    + 	if (!worktree || !worktree->path)
    + 		BUG("worktree->path must be non-NULL");
     @@ builtin/receive-pack.c: static const char *update_worktree(unsigned char *sha1, const struct worktree *w
      
      	strvec_pushf(&env, "GIT_DIR=%s", absolute_path(git_dir));
      
     -	if (!hook_exists(push_to_checkout_hook))
    -+	retval = push_to_checkout(sha1, &invoked_hook, &env, work_tree);
    ++	retval = push_to_checkout(sha1, &invoked_hook, &env, worktree->path);
     +	if (!invoked_hook)
    - 		retval = push_to_deploy(sha1, &env, work_tree);
    + 		retval = push_to_deploy(sha1, &env, worktree->path);
     -	else
    --		retval = push_to_checkout(sha1, &env, work_tree);
    +-		retval = push_to_checkout(sha1, &env, worktree->path);
      
      	strvec_clear(&env);
      	return retval;
    @@ commit.c: size_t ignore_non_trailer(const char *buf, size_t len)
      }
      
      int run_commit_hook(int editor_is_used, const char *index_file,
    -+		    int *invoked_hook,
    - 		    const char *name, ...)
    +-		    const char *name, ...)
    ++		    int *invoked_hook, const char *name, ...)
      {
      	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    + 	va_list args;
     
      ## commit.h ##
     @@ commit.h: int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
    @@ commit.h: int compare_commits_by_commit_date(const void *a_, const void *b_, voi
     
      ## hook.c ##
     @@ hook.c: static int notify_hook_finished(int result,
    + 				void *pp_task_cb)
    + {
    + 	struct hook_cb_data *hook_cb = pp_cb;
    ++	struct run_hooks_opt *opt = hook_cb->options;
      
      	hook_cb->rc |= result;
      
    -+	if (hook_cb->invoked_hook)
    -+		*hook_cb->invoked_hook = 1;
    ++	if (opt->invoked_hook)
    ++		*opt->invoked_hook = 1;
     +
      	return 0;
      }
      
    -@@ hook.c: int run_hooks(const char *hook_name, const char *hook_path,
    - 		.rc = 0,
    - 		.hook_name = hook_name,
    - 		.options = options,
    -+		.invoked_hook = options->invoked_hook,
    - 	};
    - 	int jobs = 1;
    +@@ hook.c: int run_hooks_opt(const char *hook_name, struct run_hooks_opt *options)
    + 	if (!options)
    + 		BUG("a struct run_hooks_opt must be provided to run_hooks");
    + 
    ++	if (options->invoked_hook)
    ++		*options->invoked_hook = 0;
    ++
    + 	if (!hook_path && !options->error_if_missing)
    + 		goto cleanup;
      
     
      ## hook.h ##
     @@ hook.h: struct run_hooks_opt
    - 	 * for an example.
    + 	 * translates to "struct child_process"'s "dir" member.
      	 */
    - 	consume_sideband_fn consume_sideband;
    + 	const char *dir;
     +
    -+	/*
    ++	/**
     +	 * A pointer which if provided will be set to 1 or 0 depending
     +	 * on if a hook was invoked (i.e. existed), regardless of
     +	 * whether or not that was successful. Used for avoiding
    @@ hook.h: struct run_hooks_opt
      };
      
      #define RUN_HOOKS_OPT_INIT { \
    -@@ hook.h: struct hook_cb_data {
    - 	const char *hook_name;
    - 	struct hook *run_me;
    - 	struct run_hooks_opt *options;
    -+	int *invoked_hook;
    - };
    - 
    - /**
     
      ## sequencer.c ##
     @@ sequencer.c: static int run_prepare_commit_msg_hook(struct repository *r,
-- 
2.35.1.1031.g277d4562d2e


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] merge: don't run post-hook logic on --no-verify
  2022-02-18 20:43 [PATCH 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
@ 2022-02-18 20:43 ` Ævar Arnfjörð Bjarmason
  2022-02-18 23:57   ` Junio C Hamano
  2022-02-18 20:43 ` [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic Ævar Arnfjörð Bjarmason
  2022-03-07 12:33 ` [PATCH v2 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-18 20:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

Fix a minor bug introduced in bc40ce4de61 (merge: --no-verify to
bypass pre-merge-commit hook, 2019-08-07), when that change made the
--no-verify option bypass the "pre-merge-commit" hook it didn't update
the corresponding find_hook() (later hook_exists()) condition.

As can be seen in the preceding commit in 6098817fd7f (git-merge:
honor pre-merge-commit hook, 2019-08-07) the two should go hand in
hand. There's no point in invoking discard_cache() here if the hook
couldn't have possibly updated the index.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 133831d42fd..fab553e3bc4 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -845,15 +845,18 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	struct strbuf msg = STRBUF_INIT;
 	const char *index_file = get_index_file();
 
-	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
-		abort_commit(remoteheads, NULL);
-	/*
-	 * Re-read the index as pre-merge-commit hook could have updated it,
-	 * and write it out as a tree.  We must do this before we invoke
-	 * the editor and after we invoke run_status above.
-	 */
-	if (hook_exists("pre-merge-commit"))
-		discard_cache();
+	if (!no_verify) {
+		if (run_commit_hook(0 < option_edit, index_file,
+				    "pre-merge-commit", NULL))
+			abort_commit(remoteheads, NULL);
+		/*
+		 * Re-read the index as pre-merge-commit hook could have updated it,
+		 * and write it out as a tree.  We must do this before we invoke
+		 * the editor and after we invoke run_status above.
+		 */
+		if (hook_exists("pre-merge-commit"))
+			discard_cache();
+	}
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
 	if (squash)
-- 
2.35.1.1031.g277d4562d2e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic
  2022-02-18 20:43 [PATCH 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
  2022-02-18 20:43 ` [PATCH 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
@ 2022-02-18 20:43 ` Ævar Arnfjörð Bjarmason
  2022-02-19  0:11   ` Junio C Hamano
  2022-02-19  4:08   ` Taylor Blau
  2022-03-07 12:33 ` [PATCH v2 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-18 20:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
680ee550d72 (commit: skip discarding the index if there is no
pre-commit hook, 2017-08-14).

We can fix the race passing around information about whether or not we
ran the hook in question, instead of running hook_exists() after the
fact to check if the hook in question exists. This problem has been
noted on-list when 680ee550d72 was discussed[1], but had not been
fixed.

In addition to fixing this for the pre-commit hook as suggested there
I'm also fixing this for the pre-merge-commit hook. See
6098817fd7f (git-merge: honor pre-merge-commit hook, 2019-08-07) for
the introduction of its previous behavior.

Let's also change this for the push-to-checkout hook. Now instead of
checking if the hook exists and either doing a push to checkout or a
push to deploy we'll always attempt a push to checkout. If the hook
doesn't exist we'll fall back on push to deploy. The same behavior as
before, without the TOCTOU race. See 0855331941b (receive-pack:
support push-to-checkout hook, 2014-12-01) for the introduction of the
previous behavior.

This leaves uses of hook_exists() in two places that matter. The
"reference-transaction" check in refs.c, see 67541597670 (refs:
implement reference transaction hook, 2020-06-19), and the
prepare-commit-msg hook, see 66618a50f9c (sequencer: run
'prepare-commit-msg' hook, 2018-01-24).

In both of those cases we're saving ourselves CPU time by not
preparing data for the hook that we'll then do nothing with if we
don't have the hook. So using this "invoked_hook" pattern doesn't make
sense in those cases.

More importantly, in those cases the worst we'll do is miss that we
"should" run the hook because a new hook appeared, whereas in the
pre-commit and pre-merge-commit cases we'll skip an important
discard_cache() on the bases of our faulty guess.

I do think none of these races really matter in practice. It would be
some one-off issue as a hook was added or removed. I did think it was
stupid that we didn't pass a "did this run?" flag instead of doing
this guessing at a distance though, so now we're not guessing anymore.

1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/commit.c       | 18 +++++++++++-------
 builtin/merge.c        | 11 +++++++----
 builtin/receive-pack.c |  8 +++++---
 commit.c               |  2 +-
 commit.h               |  3 ++-
 hook.c                 |  7 +++++++
 hook.h                 |  9 +++++++++
 sequencer.c            |  4 ++--
 8 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index b9ed0374e30..bc5d34bc31f 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -725,11 +725,13 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
 	int old_display_comment_prefix;
 	int merge_contains_scissors = 0;
+	int invoked_hook;
 
 	/* This checks and barfs if author is badly specified */
 	determine_author_info(author_ident);
 
-	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
+	if (!no_verify && run_commit_hook(use_editor, index_file, &invoked_hook,
+					  "pre-commit", NULL))
 		return 0;
 
 	if (squash_message) {
@@ -1052,10 +1054,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (!no_verify && hook_exists("pre-commit")) {
+	if (!no_verify && invoked_hook) {
 		/*
-		 * Re-read the index as pre-commit hook could have updated it,
-		 * and write it out as a tree.  We must do this before we invoke
+		 * Re-read the index as the pre-commit-commit hook was invoked
+		 * and could have updated it. We must do this before we invoke
 		 * the editor and after we invoke run_status above.
 		 */
 		discard_cache();
@@ -1067,7 +1069,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
+	if (run_commit_hook(use_editor, index_file, NULL, "prepare-commit-msg",
 			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
 		return 0;
 
@@ -1084,7 +1086,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	}
 
 	if (!no_verify &&
-	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
+	    run_commit_hook(use_editor, index_file, NULL, "commit-msg",
+			    git_path_commit_editmsg(), NULL)) {
 		return 0;
 	}
 
@@ -1845,7 +1848,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	repo_rerere(the_repository, 0);
 	run_auto_maintenance(quiet);
-	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
+	run_commit_hook(use_editor, get_index_file(), NULL, "post-commit",
+			NULL);
 	if (amend && !no_post_rewrite) {
 		commit_post_rewrite(the_repository, current_head, &oid);
 	}
diff --git a/builtin/merge.c b/builtin/merge.c
index fab553e3bc4..4b2eab88b0d 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -846,7 +846,9 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	const char *index_file = get_index_file();
 
 	if (!no_verify) {
-		if (run_commit_hook(0 < option_edit, index_file,
+		int invoked_hook;
+
+		if (run_commit_hook(0 < option_edit, index_file, &invoked_hook,
 				    "pre-merge-commit", NULL))
 			abort_commit(remoteheads, NULL);
 		/*
@@ -854,7 +856,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		 * and write it out as a tree.  We must do this before we invoke
 		 * the editor and after we invoke run_status above.
 		 */
-		if (hook_exists("pre-merge-commit"))
+		if (invoked_hook)
 			discard_cache();
 	}
 	read_cache_from(index_file);
@@ -878,7 +880,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
 	write_merge_heads(remoteheads);
 	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
-	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
+	if (run_commit_hook(0 < option_edit, get_index_file(), NULL,
+			    "prepare-commit-msg",
 			    git_path_merge_msg(the_repository), "merge", NULL))
 		abort_commit(remoteheads, NULL);
 	if (0 < option_edit) {
@@ -887,7 +890,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	}
 
 	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
-					  "commit-msg",
+					  NULL, "commit-msg",
 					  git_path_merge_msg(the_repository), NULL))
 		abort_commit(remoteheads, NULL);
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c427ca09aaf..75595c7af38 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1408,10 +1408,12 @@ static const char *push_to_deploy(unsigned char *sha1,
 static const char *push_to_checkout_hook = "push-to-checkout";
 
 static const char *push_to_checkout(unsigned char *hash,
+				    int *invoked_hook,
 				    struct strvec *env,
 				    const char *work_tree)
 {
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	opt.invoked_hook = invoked_hook;
 
 	strvec_pushf(env, "GIT_WORK_TREE=%s", absolute_path(work_tree));
 	strvec_pushv(&opt.env, env->v);
@@ -1426,6 +1428,7 @@ static const char *update_worktree(unsigned char *sha1, const struct worktree *w
 {
 	const char *retval, *git_dir;
 	struct strvec env = STRVEC_INIT;
+	int invoked_hook;
 
 	if (!worktree || !worktree->path)
 		BUG("worktree->path must be non-NULL");
@@ -1436,10 +1439,9 @@ static const char *update_worktree(unsigned char *sha1, const struct worktree *w
 
 	strvec_pushf(&env, "GIT_DIR=%s", absolute_path(git_dir));
 
-	if (!hook_exists(push_to_checkout_hook))
+	retval = push_to_checkout(sha1, &invoked_hook, &env, worktree->path);
+	if (!invoked_hook)
 		retval = push_to_deploy(sha1, &env, worktree->path);
-	else
-		retval = push_to_checkout(sha1, &env, worktree->path);
 
 	strvec_clear(&env);
 	return retval;
diff --git a/commit.c b/commit.c
index d400f5dfa2b..396e14d7b32 100644
--- a/commit.c
+++ b/commit.c
@@ -1713,7 +1713,7 @@ size_t ignore_non_trailer(const char *buf, size_t len)
 }
 
 int run_commit_hook(int editor_is_used, const char *index_file,
-		    const char *name, ...)
+		    int *invoked_hook, const char *name, ...)
 {
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 	va_list args;
diff --git a/commit.h b/commit.h
index 38cc5426615..3b174135bcf 100644
--- a/commit.h
+++ b/commit.h
@@ -369,7 +369,8 @@ int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
 int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
 
 LAST_ARG_MUST_BE_NULL
-int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
+int run_commit_hook(int editor_is_used, const char *index_file,
+		    int *invoked_hook, const char *name, ...);
 
 /* Sign a commit or tag buffer, storing the result in a header. */
 int sign_with_header(struct strbuf *buf, const char *keyid);
diff --git a/hook.c b/hook.c
index 69a215b2c3c..1d51be3b77a 100644
--- a/hook.c
+++ b/hook.c
@@ -96,9 +96,13 @@ static int notify_hook_finished(int result,
 				void *pp_task_cb)
 {
 	struct hook_cb_data *hook_cb = pp_cb;
+	struct run_hooks_opt *opt = hook_cb->options;
 
 	hook_cb->rc |= result;
 
+	if (opt->invoked_hook)
+		*opt->invoked_hook = 1;
+
 	return 0;
 }
 
@@ -123,6 +127,9 @@ int run_hooks_opt(const char *hook_name, struct run_hooks_opt *options)
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if (options->invoked_hook)
+		*options->invoked_hook = 0;
+
 	if (!hook_path && !options->error_if_missing)
 		goto cleanup;
 
diff --git a/hook.h b/hook.h
index 18d90aedf14..735a7d97002 100644
--- a/hook.h
+++ b/hook.h
@@ -18,6 +18,15 @@ struct run_hooks_opt
 	 * translates to "struct child_process"'s "dir" member.
 	 */
 	const char *dir;
+
+	/**
+	 * A pointer which if provided will be set to 1 or 0 depending
+	 * on if a hook was invoked (i.e. existed), regardless of
+	 * whether or not that was successful. Used for avoiding
+	 * TOCTOU races in code that would otherwise call hook_exist()
+	 * after a "maybe hook run" to see if a hook was invoked.
+	 */
+	int *invoked_hook;
 };
 
 #define RUN_HOOKS_OPT_INIT { \
diff --git a/sequencer.c b/sequencer.c
index fb978a53a0f..967bf054709 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1220,7 +1220,7 @@ static int run_prepare_commit_msg_hook(struct repository *r,
 	} else {
 		arg1 = "message";
 	}
-	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
+	if (run_commit_hook(0, r->index_file, NULL, "prepare-commit-msg", name,
 			    arg1, arg2, NULL))
 		ret = error(_("'prepare-commit-msg' hook failed"));
 
@@ -1552,7 +1552,7 @@ static int try_to_commit(struct repository *r,
 		goto out;
 	}
 
-	run_commit_hook(0, r->index_file, "post-commit", NULL);
+	run_commit_hook(0, r->index_file, NULL, "post-commit", NULL);
 	if (flags & AMEND_MSG)
 		commit_post_rewrite(r, current_head, oid);
 
-- 
2.35.1.1031.g277d4562d2e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] merge: don't run post-hook logic on --no-verify
  2022-02-18 20:43 ` [PATCH 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
@ 2022-02-18 23:57   ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2022-02-18 23:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Bagas Sanjaya, Emily Shaffer

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/builtin/merge.c b/builtin/merge.c
> index 133831d42fd..fab553e3bc4 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -845,15 +845,18 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>  	struct strbuf msg = STRBUF_INIT;
>  	const char *index_file = get_index_file();
>  
> -	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
> -		abort_commit(remoteheads, NULL);
> -	/*
> -	 * Re-read the index as pre-merge-commit hook could have updated it,
> -	 * and write it out as a tree.  We must do this before we invoke
> -	 * the editor and after we invoke run_status above.
> -	 */
> -	if (hook_exists("pre-merge-commit"))
> -		discard_cache();
> +	if (!no_verify) {
> +		if (run_commit_hook(0 < option_edit, index_file,
> +				    "pre-merge-commit", NULL))
> +			abort_commit(remoteheads, NULL);
> +		/*
> +		 * Re-read the index as pre-merge-commit hook could have updated it,
> +		 * and write it out as a tree.  We must do this before we invoke
> +		 * the editor and after we invoke run_status above.
> +		 */
> +		if (hook_exists("pre-merge-commit"))
> +			discard_cache();
> +	}

The updated code not just is more correct, but it is much easier to
follow.

I wonder if run_commit_hook() can return "I failed to run the hook",
"I ran the hook and the hook failed", "I successfully run the hook",
and "I didn't find the hook", instead of the current "yes/no".  That
would allow us to express this part in an even cleaner way, I would
think.

Looking good.

Thanks.

>  	read_cache_from(index_file);
>  	strbuf_addbuf(&msg, &merge_msg);
>  	if (squash)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic
  2022-02-18 20:43 ` [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic Ævar Arnfjörð Bjarmason
@ 2022-02-19  0:11   ` Junio C Hamano
  2022-02-19  4:48     ` Ævar Arnfjörð Bjarmason
  2022-02-19  4:08   ` Taylor Blau
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2022-02-19  0:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Bagas Sanjaya, Emily Shaffer

> -int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
> +int run_commit_hook(int editor_is_used, const char *index_file,
> +		    int *invoked_hook, const char *name, ...);
>  

Even though my gut feeling tells me that turning the "yes/no"
integer into an enum that includes "there was no such hook", "I
tried to run it, but it failed to run" [*], "I ran it and it was
happy".  would be a more viable approach for the longer term, I
guess this extra and ad-hoc parameter would be sufficient as a
shorter term improvement.

    Side note: optionally "failed to run" may be split into "failed
    to even start (e.g. ENOEXEC)" and "started successfully but
    exited with non-zero status".  There may or may not be callers
    that wants to see them as distinct cases right now, but an
    interface based on returned enum value would be easier to extend
    than having to add a pointer to return variable every time we
    need to know more details.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic
  2022-02-18 20:43 ` [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic Ævar Arnfjörð Bjarmason
  2022-02-19  0:11   ` Junio C Hamano
@ 2022-02-19  4:08   ` Taylor Blau
  2022-02-19 10:46     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 12+ messages in thread
From: Taylor Blau @ 2022-02-19  4:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Bagas Sanjaya, Emily Shaffer

On Fri, Feb 18, 2022 at 09:43:52PM +0100, Ævar Arnfjörð Bjarmason wrote:
> diff --git a/builtin/commit.c b/builtin/commit.c
> index b9ed0374e30..bc5d34bc31f 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -725,11 +725,13 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>  	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
>  	int old_display_comment_prefix;
>  	int merge_contains_scissors = 0;
> +	int invoked_hook;
>
>  	/* This checks and barfs if author is badly specified */
>  	determine_author_info(author_ident);
>
> -	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
> +	if (!no_verify && run_commit_hook(use_editor, index_file, &invoked_hook,
> +					  "pre-commit", NULL))
>  		return 0;
>
>  	if (squash_message) {
> @@ -1052,10 +1054,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>  		return 0;
>  	}
>
> -	if (!no_verify && hook_exists("pre-commit")) {
> +	if (!no_verify && invoked_hook) {
>  		/*
> -		 * Re-read the index as pre-commit hook could have updated it,
> -		 * and write it out as a tree.  We must do this before we invoke
> +		 * Re-read the index as the pre-commit-commit hook was invoked
> +		 * and could have updated it. We must do this before we invoke
>  		 * the editor and after we invoke run_status above.
>  		 */
>  		discard_cache();

Sanity checking my own understating of this race: if we ran the
pre-commit hook and it modified the index, but hook_exists() returns
false later on (e.g., because the hook itself went away, the directory
became unreadable, etc.), then we won't call discard_cache() when we
should have?

If so, OK. This definitely seems like a pretty niche race, but
independent of that I think the change here is an improvement in
readability, and makes it clearer that calling discard_cache() depends
on whether or not we *ran* the pre-commit hook, not whether we (still)
*have* a pre-commit hook.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic
  2022-02-19  0:11   ` Junio C Hamano
@ 2022-02-19  4:48     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  4:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Bagas Sanjaya, Emily Shaffer


On Fri, Feb 18 2022, Junio C Hamano wrote:

>> -int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
>> +int run_commit_hook(int editor_is_used, const char *index_file,
>> +		    int *invoked_hook, const char *name, ...);
>>  
>
> Even though my gut feeling tells me that turning the "yes/no"
> integer into an enum that includes "there was no such hook", "I
> tried to run it, but it failed to run" [*], "I ran it and it was
> happy".  would be a more viable approach for the longer term, I
> guess this extra and ad-hoc parameter would be sufficient as a
> shorter term improvement.
>
>     Side note: optionally "failed to run" may be split into "failed
>     to even start (e.g. ENOEXEC)" and "started successfully but
>     exited with non-zero status".  There may or may not be callers
>     that wants to see them as distinct cases right now, but an
>     interface based on returned enum value would be easier to extend
>     than having to add a pointer to return variable every time we
>     need to know more details.

Yes, I debated with myself whether I should add some more generic
interface to it, and decided just to do the bare minumum of adding
something the "struct run_hooks_opt".

FWIW the "yes/no" is not that, run_commit_hook() just returns the value
of run_hooks_opt(), which is currently either an <0 error, or the status
code from the hook. I.e. what gets passed to the "task_finished_fn"
callback for run_processes_parallel_tr2(). I.e. the finish_command()
return value.

We do cover the "ENOEXEC" case in ignoring it, since if we fail on
startup we won't say we ran the hook.

I think in practice what'll matter is this "invoked_hook". I.e. if we
failed to parse our config, the hook wasn't executable or whatever
that's just a <0 error, and we didn't run the hook.

Or, if we ran it at all (even if it failed) we'll know that we need to
e.g. discard_index(), since we can't guarantee that the hook didn't get
that far that we'll need to update our own assumptions.

A caller who ares about anything else will also need to deal with a lot
more complexity once we have config-based-hooks / parallel hooks by
default. I.e. was that ENOEXEC one of N hooks, all of them, did all/one
exit non-zero etc?

Whereas "int *invoked_hook" we can just set as long as we invoked any of
them at all.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic
  2022-02-19  4:08   ` Taylor Blau
@ 2022-02-19 10:46     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19 10:46 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Bagas Sanjaya, Emily Shaffer


On Fri, Feb 18 2022, Taylor Blau wrote:

> On Fri, Feb 18, 2022 at 09:43:52PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index b9ed0374e30..bc5d34bc31f 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -725,11 +725,13 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>>  	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
>>  	int old_display_comment_prefix;
>>  	int merge_contains_scissors = 0;
>> +	int invoked_hook;
>>
>>  	/* This checks and barfs if author is badly specified */
>>  	determine_author_info(author_ident);
>>
>> -	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
>> +	if (!no_verify && run_commit_hook(use_editor, index_file, &invoked_hook,
>> +					  "pre-commit", NULL))
>>  		return 0;
>>
>>  	if (squash_message) {
>> @@ -1052,10 +1054,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>>  		return 0;
>>  	}
>>
>> -	if (!no_verify && hook_exists("pre-commit")) {
>> +	if (!no_verify && invoked_hook) {
>>  		/*
>> -		 * Re-read the index as pre-commit hook could have updated it,
>> -		 * and write it out as a tree.  We must do this before we invoke
>> +		 * Re-read the index as the pre-commit-commit hook was invoked
>> +		 * and could have updated it. We must do this before we invoke
>>  		 * the editor and after we invoke run_status above.
>>  		 */
>>  		discard_cache();
>
> Sanity checking my own understating of this race: if we ran the
> pre-commit hook and it modified the index, but hook_exists() returns
> false later on (e.g., because the hook itself went away, the directory
> became unreadable, etc.), then we won't call discard_cache() when we
> should have?

Yes, it's that obscure.

> If so, OK. This definitely seems like a pretty niche race, but
> independent of that I think the change here is an improvement in
> readability, and makes it clearer that calling discard_cache() depends
> on whether or not we *ran* the pre-commit hook, not whether we (still)
> *have* a pre-commit hook.

Yeah, that's the main reason to do it. I found this really hard to
follow before, why didn't we just remember if we have/ran the thing? Now
we do.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/2] hooks: fix a race in hook execution
  2022-02-18 20:43 [PATCH 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
  2022-02-18 20:43 ` [PATCH 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
  2022-02-18 20:43 ` [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic Ævar Arnfjörð Bjarmason
@ 2022-03-07 12:33 ` Ævar Arnfjörð Bjarmason
  2022-03-07 12:33   ` [PATCH v2 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
  2022-03-07 12:33   ` [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-07 12:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

A documentation & commit-message only change to this v1 which fixes an
obscure race condition in hook execution. For v1 see:
https://lore.kernel.org/git/cover-0.2-00000000000-20220218T203834Z-avarab@gmail.com/

Junio: This topic wasn't picked up yet, but hopefully will be with the
below, which should address coments you & Taylor had on the v1.

Ævar Arnfjörð Bjarmason (2):
  merge: don't run post-hook logic on --no-verify
  hooks: fix an obscure TOCTOU "did we just run a hook?" race

 builtin/commit.c       | 18 +++++++++++-------
 builtin/merge.c        | 28 +++++++++++++++++-----------
 builtin/receive-pack.c |  8 +++++---
 commit.c               |  2 +-
 commit.h               |  3 ++-
 hook.c                 |  7 +++++++
 hook.h                 | 12 ++++++++++++
 sequencer.c            |  4 ++--
 8 files changed, 57 insertions(+), 25 deletions(-)

Range-diff against v1:
1:  9b5144daee6 ! 1:  8f7b01ed758 merge: don't run post-hook logic on --no-verify
    @@ Commit message
         hand. There's no point in invoking discard_cache() here if the hook
         couldn't have possibly updated the index.
     
    +    It's buggy that we use "hook_exist()" here, and as discussed in the
    +    subsequent commit it's subject to obscure race conditions that we're
    +    about to fix, but for now this change is a strict improvement that
    +    retains any caveats to do with the use of "hooks_exist()" as-is.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/merge.c ##
2:  d01d088073b ! 2:  9d16984898c hooks: fix a TOCTOU in "did we run a hook?" heuristic
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    hooks: fix a TOCTOU in "did we run a hook?" heuristic
    +    hooks: fix an obscure TOCTOU "did we just run a hook?" race
     
         Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
         680ee550d72 (commit: skip discarding the index if there is no
         pre-commit hook, 2017-08-14).
     
    -    We can fix the race passing around information about whether or not we
    -    ran the hook in question, instead of running hook_exists() after the
    -    fact to check if the hook in question exists. This problem has been
    +    This obscure race condition can occur if we e.g. ran the "pre-commit"
    +    hook and it modified the index, but hook_exists() returns false later
    +    on (e.g., because the hook itself went away, the directory became
    +    unreadable, etc.). Then we won't call discard_cache() when we should
    +    have.
    +
    +    The race condition itself probably doesn't matter, and users would
    +    have been unlikely to run into it in practice. This problem has been
         noted on-list when 680ee550d72 was discussed[1], but had not been
         fixed.
     
    -    In addition to fixing this for the pre-commit hook as suggested there
    -    I'm also fixing this for the pre-merge-commit hook. See
    -    6098817fd7f (git-merge: honor pre-merge-commit hook, 2019-08-07) for
    -    the introduction of its previous behavior.
    +    This change is mainly intended to improve the readability of the code
    +    involved, and to make reasoning about it more straightforward. It
    +    wasn't as obvious what we were trying to do here, but by having an
    +    "invoked_hook" it's clearer that e.g. our discard_cache() is happening
    +    because of the earlier hook execution.
     
         Let's also change this for the push-to-checkout hook. Now instead of
         checking if the hook exists and either doing a push to checkout or a
    @@ Commit message
         This leaves uses of hook_exists() in two places that matter. The
         "reference-transaction" check in refs.c, see 67541597670 (refs:
         implement reference transaction hook, 2020-06-19), and the
    -    prepare-commit-msg hook, see 66618a50f9c (sequencer: run
    +    "prepare-commit-msg" hook, see 66618a50f9c (sequencer: run
         'prepare-commit-msg' hook, 2018-01-24).
     
         In both of those cases we're saving ourselves CPU time by not
    @@ Commit message
         don't have the hook. So using this "invoked_hook" pattern doesn't make
         sense in those cases.
     
    -    More importantly, in those cases the worst we'll do is miss that we
    -    "should" run the hook because a new hook appeared, whereas in the
    -    pre-commit and pre-merge-commit cases we'll skip an important
    -    discard_cache() on the bases of our faulty guess.
    -
    -    I do think none of these races really matter in practice. It would be
    -    some one-off issue as a hook was added or removed. I did think it was
    -    stupid that we didn't pass a "did this run?" flag instead of doing
    -    this guessing at a distance though, so now we're not guessing anymore.
    +    The "reference-transaction" and "prepare-commit-msg" hook also aren't
    +    racy. In those cases we'll skip the hook runs if we race with a new
    +    hook being added, whereas in the TOCTOU races being fixed here we were
    +    incorrectly skipping the required post-hook logic.
     
         1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/
     
    @@ hook.h: struct run_hooks_opt
     +
     +	/**
     +	 * A pointer which if provided will be set to 1 or 0 depending
    -+	 * on if a hook was invoked (i.e. existed), regardless of
    -+	 * whether or not that was successful. Used for avoiding
    -+	 * TOCTOU races in code that would otherwise call hook_exist()
    -+	 * after a "maybe hook run" to see if a hook was invoked.
    ++	 * on if a hook was started, regardless of whether or not that
    ++	 * was successful. I.e. if the underlying start_command() was
    ++	 * successful this will be set to 1.
    ++	 *
    ++	 * Used for avoiding TOCTOU races in code that would otherwise
    ++	 * call hook_exist() after a "maybe hook run" to see if a hook
    ++	 * was invoked.
     +	 */
     +	int *invoked_hook;
      };
-- 
2.35.1.1242.gfeba0eae32b


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/2] merge: don't run post-hook logic on --no-verify
  2022-03-07 12:33 ` [PATCH v2 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
@ 2022-03-07 12:33   ` Ævar Arnfjörð Bjarmason
  2022-03-07 12:33   ` [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-07 12:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

Fix a minor bug introduced in bc40ce4de61 (merge: --no-verify to
bypass pre-merge-commit hook, 2019-08-07), when that change made the
--no-verify option bypass the "pre-merge-commit" hook it didn't update
the corresponding find_hook() (later hook_exists()) condition.

As can be seen in the preceding commit in 6098817fd7f (git-merge:
honor pre-merge-commit hook, 2019-08-07) the two should go hand in
hand. There's no point in invoking discard_cache() here if the hook
couldn't have possibly updated the index.

It's buggy that we use "hook_exist()" here, and as discussed in the
subsequent commit it's subject to obscure race conditions that we're
about to fix, but for now this change is a strict improvement that
retains any caveats to do with the use of "hooks_exist()" as-is.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index a94a03384ae..b26b4c45157 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -845,15 +845,18 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	struct strbuf msg = STRBUF_INIT;
 	const char *index_file = get_index_file();
 
-	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
-		abort_commit(remoteheads, NULL);
-	/*
-	 * Re-read the index as pre-merge-commit hook could have updated it,
-	 * and write it out as a tree.  We must do this before we invoke
-	 * the editor and after we invoke run_status above.
-	 */
-	if (hook_exists("pre-merge-commit"))
-		discard_cache();
+	if (!no_verify) {
+		if (run_commit_hook(0 < option_edit, index_file,
+				    "pre-merge-commit", NULL))
+			abort_commit(remoteheads, NULL);
+		/*
+		 * Re-read the index as pre-merge-commit hook could have updated it,
+		 * and write it out as a tree.  We must do this before we invoke
+		 * the editor and after we invoke run_status above.
+		 */
+		if (hook_exists("pre-merge-commit"))
+			discard_cache();
+	}
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
 	if (squash)
-- 
2.35.1.1242.gfeba0eae32b


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race
  2022-03-07 12:33 ` [PATCH v2 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
  2022-03-07 12:33   ` [PATCH v2 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
@ 2022-03-07 12:33   ` Ævar Arnfjörð Bjarmason
  2022-03-21 20:30     ` Jonathan Tan
  1 sibling, 1 reply; 12+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-07 12:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bagas Sanjaya, Emily Shaffer,
	Ævar Arnfjörð Bjarmason

Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
680ee550d72 (commit: skip discarding the index if there is no
pre-commit hook, 2017-08-14).

This obscure race condition can occur if we e.g. ran the "pre-commit"
hook and it modified the index, but hook_exists() returns false later
on (e.g., because the hook itself went away, the directory became
unreadable, etc.). Then we won't call discard_cache() when we should
have.

The race condition itself probably doesn't matter, and users would
have been unlikely to run into it in practice. This problem has been
noted on-list when 680ee550d72 was discussed[1], but had not been
fixed.

This change is mainly intended to improve the readability of the code
involved, and to make reasoning about it more straightforward. It
wasn't as obvious what we were trying to do here, but by having an
"invoked_hook" it's clearer that e.g. our discard_cache() is happening
because of the earlier hook execution.

Let's also change this for the push-to-checkout hook. Now instead of
checking if the hook exists and either doing a push to checkout or a
push to deploy we'll always attempt a push to checkout. If the hook
doesn't exist we'll fall back on push to deploy. The same behavior as
before, without the TOCTOU race. See 0855331941b (receive-pack:
support push-to-checkout hook, 2014-12-01) for the introduction of the
previous behavior.

This leaves uses of hook_exists() in two places that matter. The
"reference-transaction" check in refs.c, see 67541597670 (refs:
implement reference transaction hook, 2020-06-19), and the
"prepare-commit-msg" hook, see 66618a50f9c (sequencer: run
'prepare-commit-msg' hook, 2018-01-24).

In both of those cases we're saving ourselves CPU time by not
preparing data for the hook that we'll then do nothing with if we
don't have the hook. So using this "invoked_hook" pattern doesn't make
sense in those cases.

The "reference-transaction" and "prepare-commit-msg" hook also aren't
racy. In those cases we'll skip the hook runs if we race with a new
hook being added, whereas in the TOCTOU races being fixed here we were
incorrectly skipping the required post-hook logic.

1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/commit.c       | 18 +++++++++++-------
 builtin/merge.c        | 11 +++++++----
 builtin/receive-pack.c |  8 +++++---
 commit.c               |  2 +-
 commit.h               |  3 ++-
 hook.c                 |  7 +++++++
 hook.h                 | 12 ++++++++++++
 sequencer.c            |  4 ++--
 8 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 8b8bdad3959..009a1de0a3d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -726,11 +726,13 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
 	int old_display_comment_prefix;
 	int merge_contains_scissors = 0;
+	int invoked_hook;
 
 	/* This checks and barfs if author is badly specified */
 	determine_author_info(author_ident);
 
-	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
+	if (!no_verify && run_commit_hook(use_editor, index_file, &invoked_hook,
+					  "pre-commit", NULL))
 		return 0;
 
 	if (squash_message) {
@@ -1053,10 +1055,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (!no_verify && hook_exists("pre-commit")) {
+	if (!no_verify && invoked_hook) {
 		/*
-		 * Re-read the index as pre-commit hook could have updated it,
-		 * and write it out as a tree.  We must do this before we invoke
+		 * Re-read the index as the pre-commit-commit hook was invoked
+		 * and could have updated it. We must do this before we invoke
 		 * the editor and after we invoke run_status above.
 		 */
 		discard_cache();
@@ -1068,7 +1070,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
+	if (run_commit_hook(use_editor, index_file, NULL, "prepare-commit-msg",
 			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
 		return 0;
 
@@ -1085,7 +1087,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	}
 
 	if (!no_verify &&
-	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
+	    run_commit_hook(use_editor, index_file, NULL, "commit-msg",
+			    git_path_commit_editmsg(), NULL)) {
 		return 0;
 	}
 
@@ -1841,7 +1844,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	repo_rerere(the_repository, 0);
 	run_auto_maintenance(quiet);
-	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
+	run_commit_hook(use_editor, get_index_file(), NULL, "post-commit",
+			NULL);
 	if (amend && !no_post_rewrite) {
 		commit_post_rewrite(the_repository, current_head, &oid);
 	}
diff --git a/builtin/merge.c b/builtin/merge.c
index b26b4c45157..f178f5a3ee1 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -846,7 +846,9 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	const char *index_file = get_index_file();
 
 	if (!no_verify) {
-		if (run_commit_hook(0 < option_edit, index_file,
+		int invoked_hook;
+
+		if (run_commit_hook(0 < option_edit, index_file, &invoked_hook,
 				    "pre-merge-commit", NULL))
 			abort_commit(remoteheads, NULL);
 		/*
@@ -854,7 +856,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		 * and write it out as a tree.  We must do this before we invoke
 		 * the editor and after we invoke run_status above.
 		 */
-		if (hook_exists("pre-merge-commit"))
+		if (invoked_hook)
 			discard_cache();
 	}
 	read_cache_from(index_file);
@@ -878,7 +880,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
 	write_merge_heads(remoteheads);
 	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
-	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
+	if (run_commit_hook(0 < option_edit, get_index_file(), NULL,
+			    "prepare-commit-msg",
 			    git_path_merge_msg(the_repository), "merge", NULL))
 		abort_commit(remoteheads, NULL);
 	if (0 < option_edit) {
@@ -887,7 +890,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	}
 
 	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
-					  "commit-msg",
+					  NULL, "commit-msg",
 					  git_path_merge_msg(the_repository), NULL))
 		abort_commit(remoteheads, NULL);
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index d10aeb7e78f..d4db5776694 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1408,10 +1408,12 @@ static const char *push_to_deploy(unsigned char *sha1,
 static const char *push_to_checkout_hook = "push-to-checkout";
 
 static const char *push_to_checkout(unsigned char *hash,
+				    int *invoked_hook,
 				    struct strvec *env,
 				    const char *work_tree)
 {
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	opt.invoked_hook = invoked_hook;
 
 	strvec_pushf(env, "GIT_WORK_TREE=%s", absolute_path(work_tree));
 	strvec_pushv(&opt.env, env->v);
@@ -1426,6 +1428,7 @@ static const char *update_worktree(unsigned char *sha1, const struct worktree *w
 {
 	const char *retval, *git_dir;
 	struct strvec env = STRVEC_INIT;
+	int invoked_hook;
 
 	if (!worktree || !worktree->path)
 		BUG("worktree->path must be non-NULL");
@@ -1436,10 +1439,9 @@ static const char *update_worktree(unsigned char *sha1, const struct worktree *w
 
 	strvec_pushf(&env, "GIT_DIR=%s", absolute_path(git_dir));
 
-	if (!hook_exists(push_to_checkout_hook))
+	retval = push_to_checkout(sha1, &invoked_hook, &env, worktree->path);
+	if (!invoked_hook)
 		retval = push_to_deploy(sha1, &env, worktree->path);
-	else
-		retval = push_to_checkout(sha1, &env, worktree->path);
 
 	strvec_clear(&env);
 	return retval;
diff --git a/commit.c b/commit.c
index d400f5dfa2b..396e14d7b32 100644
--- a/commit.c
+++ b/commit.c
@@ -1713,7 +1713,7 @@ size_t ignore_non_trailer(const char *buf, size_t len)
 }
 
 int run_commit_hook(int editor_is_used, const char *index_file,
-		    const char *name, ...)
+		    int *invoked_hook, const char *name, ...)
 {
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 	va_list args;
diff --git a/commit.h b/commit.h
index 38cc5426615..3b174135bcf 100644
--- a/commit.h
+++ b/commit.h
@@ -369,7 +369,8 @@ int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
 int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
 
 LAST_ARG_MUST_BE_NULL
-int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
+int run_commit_hook(int editor_is_used, const char *index_file,
+		    int *invoked_hook, const char *name, ...);
 
 /* Sign a commit or tag buffer, storing the result in a header. */
 int sign_with_header(struct strbuf *buf, const char *keyid);
diff --git a/hook.c b/hook.c
index 69a215b2c3c..1d51be3b77a 100644
--- a/hook.c
+++ b/hook.c
@@ -96,9 +96,13 @@ static int notify_hook_finished(int result,
 				void *pp_task_cb)
 {
 	struct hook_cb_data *hook_cb = pp_cb;
+	struct run_hooks_opt *opt = hook_cb->options;
 
 	hook_cb->rc |= result;
 
+	if (opt->invoked_hook)
+		*opt->invoked_hook = 1;
+
 	return 0;
 }
 
@@ -123,6 +127,9 @@ int run_hooks_opt(const char *hook_name, struct run_hooks_opt *options)
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if (options->invoked_hook)
+		*options->invoked_hook = 0;
+
 	if (!hook_path && !options->error_if_missing)
 		goto cleanup;
 
diff --git a/hook.h b/hook.h
index 18d90aedf14..4258b13da0d 100644
--- a/hook.h
+++ b/hook.h
@@ -18,6 +18,18 @@ struct run_hooks_opt
 	 * translates to "struct child_process"'s "dir" member.
 	 */
 	const char *dir;
+
+	/**
+	 * A pointer which if provided will be set to 1 or 0 depending
+	 * on if a hook was started, regardless of whether or not that
+	 * was successful. I.e. if the underlying start_command() was
+	 * successful this will be set to 1.
+	 *
+	 * Used for avoiding TOCTOU races in code that would otherwise
+	 * call hook_exist() after a "maybe hook run" to see if a hook
+	 * was invoked.
+	 */
+	int *invoked_hook;
 };
 
 #define RUN_HOOKS_OPT_INIT { \
diff --git a/sequencer.c b/sequencer.c
index 35006c0cea6..84eed9e96bc 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1220,7 +1220,7 @@ static int run_prepare_commit_msg_hook(struct repository *r,
 	} else {
 		arg1 = "message";
 	}
-	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
+	if (run_commit_hook(0, r->index_file, NULL, "prepare-commit-msg", name,
 			    arg1, arg2, NULL))
 		ret = error(_("'prepare-commit-msg' hook failed"));
 
@@ -1552,7 +1552,7 @@ static int try_to_commit(struct repository *r,
 		goto out;
 	}
 
-	run_commit_hook(0, r->index_file, "post-commit", NULL);
+	run_commit_hook(0, r->index_file, NULL, "post-commit", NULL);
 	if (flags & AMEND_MSG)
 		commit_post_rewrite(r, current_head, oid);
 
-- 
2.35.1.1242.gfeba0eae32b


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race
  2022-03-07 12:33   ` [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race Ævar Arnfjörð Bjarmason
@ 2022-03-21 20:30     ` Jonathan Tan
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2022-03-21 20:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jonathan Tan, git, Junio C Hamano, Bagas Sanjaya, Emily Shaffer

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 8b8bdad3959..009a1de0a3d 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -726,11 +726,13 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>  	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
>  	int old_display_comment_prefix;
>  	int merge_contains_scissors = 0;
> +	int invoked_hook;
>  
>  	/* This checks and barfs if author is badly specified */
>  	determine_author_info(author_ident);
>  
> -	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
> +	if (!no_verify && run_commit_hook(use_editor, index_file, &invoked_hook,
> +					  "pre-commit", NULL))
>  		return 0;
>  
>  	if (squash_message) {
> @@ -1053,10 +1055,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>  		return 0;
>  	}
>  
> -	if (!no_verify && hook_exists("pre-commit")) {
> +	if (!no_verify && invoked_hook) {

This commit causes Git to fail Valgrind (tested using "cd t && sh
t5537*.sh -i -v --valgrind-only=10"). You can see here that a caller of
run_commit_hook() expects invoked_hook to be set, but...

> diff --git a/commit.c b/commit.c
> index d400f5dfa2b..396e14d7b32 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1713,7 +1713,7 @@ size_t ignore_non_trailer(const char *buf, size_t len)
>  }
>  
>  int run_commit_hook(int editor_is_used, const char *index_file,
> -		    const char *name, ...)
> +		    int *invoked_hook, const char *name, ...)
>  {
>  	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
>  	va_list args;
> diff --git a/commit.h b/commit.h

The quoted part is the entire diff of commit.c. You can see that we have
a new argument "int *invoked_hook", but don't actually do anything with
it. Could you (Ævar) take a look?

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-03-21 20:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-18 20:43 [PATCH 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
2022-02-18 20:43 ` [PATCH 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
2022-02-18 23:57   ` Junio C Hamano
2022-02-18 20:43 ` [PATCH 2/2] hooks: fix a TOCTOU in "did we run a hook?" heuristic Ævar Arnfjörð Bjarmason
2022-02-19  0:11   ` Junio C Hamano
2022-02-19  4:48     ` Ævar Arnfjörð Bjarmason
2022-02-19  4:08   ` Taylor Blau
2022-02-19 10:46     ` Ævar Arnfjörð Bjarmason
2022-03-07 12:33 ` [PATCH v2 0/2] hooks: fix a race in hook execution Ævar Arnfjörð Bjarmason
2022-03-07 12:33   ` [PATCH v2 1/2] merge: don't run post-hook logic on --no-verify Ævar Arnfjörð Bjarmason
2022-03-07 12:33   ` [PATCH v2 2/2] hooks: fix an obscure TOCTOU "did we just run a hook?" race Ævar Arnfjörð Bjarmason
2022-03-21 20:30     ` Jonathan Tan

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).