git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH 0/5] forking and threading
@ 2017-04-10 23:49 Brandon Williams
  2017-04-10 23:49 ` [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe Brandon Williams
                   ` (7 more replies)
  0 siblings, 8 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Forking and threading is a difficult thing to get right due to potential
deadlocks which can occur if one thread holds a lock while another forks.  The
resulting process will still have the lock in a locked state with no hope of it
ever being released (since forking doesn't replicate the threads as well).

The aim of this series is to push the allocation done between fork/exec to
before the call to 'fork()' so that calls to malloc can't deadlock in a forked
child process.

Most standard implementations of malloc (e.g. glibc) do the appropriate thing
and register fork handlers in order ensure that locks are in a usable state
after forking.  Unfortunately other implementations don't do this so to account
for this lets just avoid calls to functions which may require locking.

As far as I understand the only instance of threading and forking which exists
in the current code base is 'git grep --recurse-submodules', and the standard
builds against glibc shouldn't exhibit any of this deadlocking.

Brandon Williams (5):
  run-command: convert sane_execvp to sane_execvpe
  run-command: prepare argv before forking
  run-command: allocate child_err before forking
  run-command: prepare child environment before forking
  run-command: add note about forking and threading

 cache.h       |   3 +-
 exec_cmd.c    |   2 +-
 run-command.c | 151 +++++++++++++++++++++++++++++++++++++++++++++-------------
 3 files changed, 119 insertions(+), 37 deletions(-)

-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
@ 2017-04-10 23:49 ` Brandon Williams
  2017-04-12 19:22   ` Brandon Williams
  2017-04-10 23:49 ` [PATCH 2/5] run-command: prepare argv before forking Brandon Williams
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Convert 'sane_execvp()' to 'sane_execvpe()' which optionally takes a
pointer to an array of 'char *' which should be used as the environment
for the process being exec'd.  If no environment is provided (by passing
NULL instead) then the already existing environment, as stored in
'environ', will be used.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h       |  3 +--
 exec_cmd.c    |  2 +-
 run-command.c | 15 ++++++++++-----
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/cache.h b/cache.h
index 5c8078291..10d40ecae 100644
--- a/cache.h
+++ b/cache.h
@@ -2185,8 +2185,7 @@ int checkout_fast_forward(const unsigned char *from,
 			  const unsigned char *to,
 			  int overwrite_ignore);
 
-
-int sane_execvp(const char *file, char *const argv[]);
+int sane_execvpe(const char *file, char *const argv[], char *const envp[]);
 
 /*
  * A struct to encapsulate the concept of whether a file has changed
diff --git a/exec_cmd.c b/exec_cmd.c
index fb94aeba9..c375f354d 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -118,7 +118,7 @@ int execv_git_cmd(const char **argv) {
 	trace_argv_printf(nargv.argv, "trace: exec:");
 
 	/* execvp() can only ever return if it fails */
-	sane_execvp("git", (char **)nargv.argv);
+	sane_execvpe("git", (char **)nargv.argv, NULL);
 
 	trace_printf("trace: exec failed: %s\n", strerror(errno));
 
diff --git a/run-command.c b/run-command.c
index 574b81d3e..682bc3ca5 100644
--- a/run-command.c
+++ b/run-command.c
@@ -168,10 +168,15 @@ static int exists_in_PATH(const char *file)
 	return r != NULL;
 }
 
-int sane_execvp(const char *file, char * const argv[])
+int sane_execvpe(const char *file, char * const argv[], char *const envp[])
 {
-	if (!execvp(file, argv))
-		return 0; /* cannot happen ;-) */
+	if (envp) {
+		if (!execvpe(file, argv, envp))
+			return 0; /* cannot happen ;-) */
+	} else {
+		if (!execvp(file, argv))
+			return 0; /* cannot happen ;-) */
+	}
 
 	/*
 	 * When a command can't be found because one of the directories
@@ -226,7 +231,7 @@ static int execv_shell_cmd(const char **argv)
 	struct argv_array nargv = ARGV_ARRAY_INIT;
 	prepare_shell_cmd(&nargv, argv);
 	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
+	sane_execvpe(nargv.argv[0], (char **)nargv.argv, NULL);
 	argv_array_clear(&nargv);
 	return -1;
 }
@@ -442,7 +447,7 @@ int start_command(struct child_process *cmd)
 		else if (cmd->use_shell)
 			execv_shell_cmd(cmd->argv);
 		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+			sane_execvpe(cmd->argv[0], (char *const*) cmd->argv, NULL);
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 2/5] run-command: prepare argv before forking
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
  2017-04-10 23:49 ` [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe Brandon Williams
@ 2017-04-10 23:49 ` Brandon Williams
  2017-04-10 23:49 ` [PATCH 3/5] run-command: allocate child_err " Brandon Williams
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

In order to avoid allocation between 'fork()' and 'exec()' the argv
array used in the exec call is prepared prior to forking the process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/run-command.c b/run-command.c
index 682bc3ca5..2514b54bc 100644
--- a/run-command.c
+++ b/run-command.c
@@ -226,18 +226,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvpe(nargv.argv[0], (char **)nargv.argv, NULL);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -377,9 +365,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->git_cmd) {
+		argv_array_push(&argv, "git");
+		argv_array_pushv(&argv, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(&argv, cmd->argv);
+	} else {
+		argv_array_pushv(&argv, cmd->argv);
+	}
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -442,12 +441,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvpe(cmd->argv[0], (char *const*) cmd->argv, NULL);
+
+		sane_execvpe(argv.argv[0], (char *const*) argv.argv, NULL);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -480,6 +476,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 3/5] run-command: allocate child_err before forking
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
  2017-04-10 23:49 ` [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe Brandon Williams
  2017-04-10 23:49 ` [PATCH 2/5] run-command: prepare argv before forking Brandon Williams
@ 2017-04-10 23:49 ` " Brandon Williams
  2017-04-10 23:49 ` [PATCH 4/5] run-command: prepare child environment " Brandon Williams
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

In order to avoid allocation between 'fork()' and 'exec()' open the
stream used for the child's error handeling prior to forking.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/run-command.c b/run-command.c
index 2514b54bc..029d41463 100644
--- a/run-command.c
+++ b/run-command.c
@@ -365,11 +365,18 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	FILE *child_err = NULL;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stderr || need_err) {
+		int child_err_fd = dup(2);
+		set_cloexec(child_err_fd);
+		child_err = fdopen(child_err_fd, "w");
+	}
+
 	if (cmd->git_cmd) {
 		argv_array_push(&argv, "git");
 		argv_array_pushv(&argv, cmd->argv);
@@ -387,11 +394,8 @@ int start_command(struct child_process *cmd)
 		 * before redirecting the process's stderr so that all die()
 		 * in subsequent call paths use the parent's stderr.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		if (child_err)
+			set_error_handle(child_err);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
@@ -477,6 +481,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (child_err)
+		fclose(child_err);
 	argv_array_clear(&argv);
 }
 #else
-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 4/5] run-command: prepare child environment before forking
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
                   ` (2 preceding siblings ...)
  2017-04-10 23:49 ` [PATCH 3/5] run-command: allocate child_err " Brandon Williams
@ 2017-04-10 23:49 ` " Brandon Williams
  2017-04-11  0:58   ` Jonathan Nieder
  2017-04-10 23:49 ` [PATCH 5/5] run-command: add note about forking and threading Brandon Williams
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 75 insertions(+), 9 deletions(-)

diff --git a/run-command.c b/run-command.c
index 029d41463..84c63b209 100644
--- a/run-command.c
+++ b/run-command.c
@@ -291,6 +291,75 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 	return code;
 }
 
+static int env_isequal(const char *e1, const char *e2)
+{
+	for (;;) {
+		char c1 = *e1++;
+		char c2 = *e2++;
+		c1 = (c1 == '=') ? '\0' : tolower(c1);
+		c2 = (c2 == '=') ? '\0' : tolower(c2);
+
+		if (c1 != c2)
+			return 0;
+		if (c1 == '\0')
+			return 1;
+	}
+}
+
+static int searchenv(char **env, const char *name)
+{
+	int pos = 0;
+
+	for (; env[pos]; pos++)
+		if (env_isequal(env[pos], name))
+			break;
+
+	return pos;
+}
+
+static int do_putenv(char **env, int env_nr, const char *name)
+{
+	int pos = searchenv(env, name);
+
+	if (strchr(name, '=')) {
+		/* ('key=value'), insert of replace entry */
+		if (pos >= env_nr)
+			env_nr++;
+		env[pos] = (char *) name;
+	} else if (pos < env_nr) {
+		/* otherwise ('key') remove existing entry */
+		env_nr--;
+		memmove(&env[pos], &env[pos + 1],
+			(env_nr - pos) * sizeof(char *));
+		env[env_nr] = NULL;
+	}
+
+	return env_nr;
+}
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	char **childenv;
+	int childenv_nr = 0, childenv_alloc = 0;
+	int i;
+
+	for (i = 0; environ[i]; i++)
+		childenv_nr++;
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_alloc++;
+	/* Add one for the NULL termination */
+	childenv_alloc += childenv_nr + 1;
+
+	childenv = xcalloc(childenv_alloc, sizeof(char *));
+	memcpy(childenv, environ, childenv_nr * sizeof(char *));
+
+	/* merge in deltaenv */
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
+
+	return childenv;
+}
+
 int start_command(struct child_process *cmd)
 {
 	int need_in, need_out, need_err;
@@ -365,12 +434,15 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	FILE *child_err = NULL;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	childenv = prep_childenv(cmd->env);
+
 	if (cmd->no_stderr || need_err) {
 		int child_err_fd = dup(2);
 		set_cloexec(child_err_fd);
@@ -437,16 +509,9 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
-		sane_execvpe(argv.argv[0], (char *const*) argv.argv, NULL);
+		sane_execvpe(argv.argv[0], (char *const*) argv.argv,
+			     (char *const*) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -483,6 +548,7 @@ int start_command(struct child_process *cmd)
 
 	if (child_err)
 		fclose(child_err);
+	free(childenv);
 	argv_array_clear(&argv);
 }
 #else
-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 5/5] run-command: add note about forking and threading
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
                   ` (3 preceding siblings ...)
  2017-04-10 23:49 ` [PATCH 4/5] run-command: prepare child environment " Brandon Williams
@ 2017-04-10 23:49 ` Brandon Williams
  2017-04-11  0:26   ` Jonathan Nieder
  2017-04-11  7:05 ` [PATCH 6/5] run-command: avoid potential dangers in forked child Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-10 23:49 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Allocation was pushed before forking in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/run-command.c b/run-command.c
index 84c63b209..2b3249de4 100644
--- a/run-command.c
+++ b/run-command.c
@@ -458,6 +458,14 @@ int start_command(struct child_process *cmd)
 		argv_array_pushv(&argv, cmd->argv);
 	}
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.715.g7642488e1d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 5/5] run-command: add note about forking and threading
  2017-04-10 23:49 ` [PATCH 5/5] run-command: add note about forking and threading Brandon Williams
@ 2017-04-11  0:26   ` Jonathan Nieder
  2017-04-11  0:53     ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-11  0:26 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Hi,

Brandon Williams wrote:

> --- a/run-command.c
> +++ b/run-command.c
> @@ -458,6 +458,14 @@ int start_command(struct child_process *cmd)
>  		argv_array_pushv(&argv, cmd->argv);
>  	}
>  
> +	/*
> +	 * NOTE: In order to prevent deadlocking when using threads special
> +	 * care should be taken with the function calls made in between the
> +	 * fork() and exec() calls.  No calls should be made to functions which
> +	 * require acquiring a lock (e.g. malloc) as the lock could have been
> +	 * held by another thread at the time of forking, causing the lock to
> +	 * never be released in the child process.
> +	 */
>  	cmd->pid = fork();

Why can't git use e.g. posix_spawn to avoid this?

fork()-ing in a threaded context is very painful for maintainability.
Any library function you are using could start taking a lock, and then
you have a deadlock.  So you have to make use of a very small
whitelisted list of library functions for this to work.

The function calls you have to audit are not only between fork() and
exec() in the normal control flow.  You have to worry about signal
handlers, too.

By comparison, posix_spawn() takes care of this all.  I really
strongly do not want to move in the direction of more fork() in a
multi-threaded context.  I'd rather turn off threads in the submodule
case of grep altogether, but that shouldn't be needed:

* it is possible to launch a command without fork+exec
* if that's not suitable, then using fork introduces parallelism that
  interacts much better with fork+exec than threads do

I really don't want to go down this path.  I've said that a few times.
I'm saying it again now.

My two cents,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 5/5] run-command: add note about forking and threading
  2017-04-11  0:26   ` Jonathan Nieder
@ 2017-04-11  0:53     ` Eric Wong
  2017-04-11 17:33       ` Jonathan Nieder
  2017-04-11 17:34       ` Brandon Williams
  0 siblings, 2 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-11  0:53 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -458,6 +458,14 @@ int start_command(struct child_process *cmd)
> >  		argv_array_pushv(&argv, cmd->argv);
> >  	}
> >  
> > +	/*
> > +	 * NOTE: In order to prevent deadlocking when using threads special
> > +	 * care should be taken with the function calls made in between the
> > +	 * fork() and exec() calls.  No calls should be made to functions which
> > +	 * require acquiring a lock (e.g. malloc) as the lock could have been
> > +	 * held by another thread at the time of forking, causing the lock to
> > +	 * never be released in the child process.
> > +	 */
> >  	cmd->pid = fork();
> 
> Why can't git use e.g. posix_spawn to avoid this?

posix_spawn does not support chdir, and it seems we run non-git
commands so no using "git -C" for those.

> fork()-ing in a threaded context is very painful for maintainability.
> Any library function you are using could start taking a lock, and then
> you have a deadlock.  So you have to make use of a very small
> whitelisted list of library functions for this to work.

Completely agreed.

On the other hand, I believe we should make run-command
vfork-compatible (and Brandon's series is a big (but incomplete)
step in the (IMHO) right direction); as anything which is
vfork-safe would also be safe in the presence of threads+(plain) fork.
With vfork; the two processes share heap until execve.

I posted some notes about it last year:

  https://public-inbox.org/git/20160629200142.GA17878@dcvr.yhbt.net/

> The function calls you have to audit are not only between fork() and
> exec() in the normal control flow.  You have to worry about signal
> handlers, too.

Yes, all that auditing is necessary for vfork; too, but totally
doable.  The mainline Ruby implementation has been using vfork
for spawning subprocesses for several years, now; and I think the
ruby-core developers (myself included) have fixed all the
problems with it; even in multi-threaded code which calls malloc.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 4/5] run-command: prepare child environment before forking
  2017-04-10 23:49 ` [PATCH 4/5] run-command: prepare child environment " Brandon Williams
@ 2017-04-11  0:58   ` Jonathan Nieder
  2017-04-11 17:27     ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-11  0:58 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Brandon Williams wrote:

> In order to avoid allocation between 'fork()' and 'exec()' prepare the
> environment to be used in the child process prior to forking.

If using something like posix_spawn(), this would be needed anyway, so
I'll review it.

[...]
> +++ b/run-command.c
[...]
> +static char **prep_childenv(const char *const *deltaenv)
> +{
> +	char **childenv;
> +	int childenv_nr = 0, childenv_alloc = 0;
> +	int i;
> +
> +	for (i = 0; environ[i]; i++)
> +		childenv_nr++;
> +	for (i = 0; deltaenv && deltaenv[i]; i++)
> +		childenv_alloc++;
> +	/* Add one for the NULL termination */
> +	childenv_alloc += childenv_nr + 1;
> +
> +	childenv = xcalloc(childenv_alloc, sizeof(char *));
> +	memcpy(childenv, environ, childenv_nr * sizeof(char *));
> +
> +	/* merge in deltaenv */
> +	for (i = 0; deltaenv && deltaenv[i]; i++)
> +		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
> +
> +	return childenv;
> +}

This potentially copies around most of 'environ' several times as it
adjusts for each deltaenv item. Can it be simplified? E.g.

	struct argv_array result = ARGV_ARRAY_INIT;
	struct string_list mods = STRING_LIST_INIT_DUP;
	struct strbuf key = STRBUF_INIT;
	const char **p;

	for (p = cmd_env; *p; p++) {
		const char *equals = strchr(*p, '=');
		if (equals) {
			strbuf_reset(&key);
			strbuf_add(&key, *p, equals - *p);
			string_list_append(&mods, key.buf)->util = *p;
		} else {
			string_list_append(&mods, *p);
		}
	}
	string_list_sort(&mods);

	for (p = environ; *p; p++) {
		struct string_list_item *item;
		const char *equals = strchr(*p, '=');
		if (!equals)
			continue;
		strbuf_reset(&key);
		strbuf_add(&key, *p, equals - *p);
		item = string_list_lookup(&mods, key.buf);

		if (!item) /* no change */
			argv_array_push(&result, *p);
		else if (!item->util) /* unsetenv */
			; /* skip */
		else /* setenv */
			argv_array_push(&result, item->util);
	}

	strbuf_release(&key);
	string_list_clear(&mods);
	return argv_array_detach(&result);

If the string_list API provided a lookup function taking a buffer and
length as argument, the environ loop could be simplified further to
use *p instead of a copy.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 6/5] run-command: avoid potential dangers in forked child
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
                   ` (4 preceding siblings ...)
  2017-04-10 23:49 ` [PATCH 5/5] run-command: add note about forking and threading Brandon Williams
@ 2017-04-11  7:05 ` Eric Wong
  2017-04-11 16:29   ` Brandon Williams
  2017-04-11 17:37 ` [PATCH 0/5] forking and threading Jonathan Nieder
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
  7 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-11  7:05 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Hi Brandon, this series tickles an old itch of mine, so I
started working off of it.  I'm only somewhat concerned
with the path resolution in execvp(e) pontentially calling
malloc on some libcs; but I suppose that's a separate patch
for another time.

Only lightly-tested at the moment, but things seem to work...

------8<-----
Subject: [PATCH] run-command: avoid potential dangers in forked child

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Finally, we block signals and disable pthreads cancellation to
avoid nasty surprises from other threads and signal handlers
firing inside the child.

This prepares us for eventual use of vfork, where the child and
parent share heap until the child calls execve or _exit.

The only somewhat questionable part I see left is the PATH
searching in execvpe; which could be performed in the parent
(taking into account chdir usage).

Signed-off-by: Eric Wong <e@80x24.org>
---
 run-command.c | 273 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 196 insertions(+), 77 deletions(-)

diff --git a/run-command.c b/run-command.c
index 2b3249de4..3d7a57385 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,55 +117,35 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
-static char *locate_in_PATH(const char *file)
+static int exists_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
-	struct strbuf buf = STRBUF_INIT;
+	char buf[PATH_MAX];
 
 	if (!p || !*p)
-		return NULL;
+		return 0;
 
 	while (1) {
 		const char *end = strchrnul(p, ':');
-
-		strbuf_reset(&buf);
+		char *dst = buf;
 
 		/* POSIX specifies an empty entry as the current directory. */
 		if (end != p) {
-			strbuf_add(&buf, p, end - p);
-			strbuf_addch(&buf, '/');
+			memcpy(dst, p, end - p);
+			dst += end - p;
+			*dst++ = '/';
 		}
-		strbuf_addstr(&buf, file);
+		strcpy(dst, file);
 
-		if (!access(buf.buf, F_OK))
-			return strbuf_detach(&buf, NULL);
+		if (!access(buf, F_OK))
+			return 1;
 
 		if (!*end)
 			break;
 		p = end + 1;
 	}
 
-	strbuf_release(&buf);
-	return NULL;
-}
-
-static int exists_in_PATH(const char *file)
-{
-	char *r = locate_in_PATH(file);
-	free(r);
-	return r != NULL;
+	return 0;
 }
 
 int sane_execvpe(const char *file, char * const argv[], char *const envp[])
@@ -227,16 +207,145 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
+enum child_errcode {
+	CHILD_ERR_NULL_STDIN,
+	CHILD_ERR_NULL_STDOUT,
+	CHILD_ERR_NULL_STDERR,
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_SIGPROCMASK,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_ENOENT_SILENT,
+	CHILD_ERR_ERRNO,
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
 
-static void notify_parent(void)
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
 }
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n"; \
+	xwrite(2, msg, sizeof(msg) - 1); \
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n"; \
+	xwrite(2, msg, sizeof(msg) - 1); \
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n"; \
+	xwrite(2, msg, sizeof(msg) - 1); \
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_NULL_STDIN:
+		error_errno("error redirecting stdin to /dev/null");
+		break;
+	case CHILD_ERR_NULL_STDOUT:
+		error_errno("error redirecting stdout to /dev/null");
+		break;
+	case CHILD_ERR_NULL_STDERR:
+		error_errno("error redirecting stderr to /dev/null");
+		break;
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+				cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+	case CHILD_ERR_ENOENT_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+	}
+	set_error_routine(old_errfn);
+}
+
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
+#endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+	fflush(NULL);
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &af->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+
 #endif
 
 static inline void set_cloexec(int fd)
@@ -274,13 +383,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -435,18 +537,21 @@ int start_command(struct child_process *cmd)
 {
 	int notify_pipe[2];
 	char **childenv;
-	FILE *child_err = NULL;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	int null_fd = -1;
+	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	childenv = prep_childenv(cmd->env);
 
-	if (cmd->no_stderr || need_err) {
-		int child_err_fd = dup(2);
-		set_cloexec(child_err_fd);
-		child_err = fdopen(child_err_fd, "w");
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC | O_NONBLOCK);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
 	}
 
 	if (cmd->git_cmd) {
@@ -466,25 +571,28 @@ int start_command(struct child_process *cmd)
 	 * held by another thread at the time of forking, causing the lock to
 	 * never be released in the child process.
 	 */
+	atfork_prepare(&as);
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
+
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * make sure the default routines do not get called,
+		 * they can take stdio locks and malloc:
 		 */
-		if (child_err)
-			set_error_handle(child_err);
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
-		if (cmd->no_stdin)
-			dup_devnull(0);
-		else if (need_in) {
+		if (cmd->no_stdin) {
+			if (dup2(null_fd, 0) < 0)
+				child_die(CHILD_ERR_NULL_STDIN);
+		} else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
 		} else if (cmd->in) {
@@ -492,9 +600,10 @@ int start_command(struct child_process *cmd)
 			close(cmd->in);
 		}
 
-		if (cmd->no_stderr)
-			dup_devnull(2);
-		else if (need_err) {
+		if (cmd->no_stderr) {
+			if (dup2(null_fd, 2) < 0)
+				child_die(CHILD_ERR_NULL_STDERR);
+		} else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
 		} else if (cmd->err > 1) {
@@ -502,9 +611,10 @@ int start_command(struct child_process *cmd)
 			close(cmd->err);
 		}
 
-		if (cmd->no_stdout)
-			dup_devnull(1);
-		else if (cmd->stdout_to_stderr)
+		if (cmd->no_stdout) {
+			if (dup2(null_fd, 1) < 0)
+				child_die(CHILD_ERR_NULL_STDOUT);
+		} else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
 			dup2(fdout[1], 1);
@@ -515,21 +625,29 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
+
+		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before sane_execvpe below
+		 */
+		for (sig = 1; sig < NSIG; sig++)
+			(void)signal(sig, SIG_DFL);
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
 
 		sane_execvpe(argv.argv[0], (char *const*) argv.argv,
 			     (char *const*) childenv);
-
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_ENOENT_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
@@ -538,24 +656,25 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's execvp. If the execvp succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (null_fd >= 0)
+		close(null_fd);
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but execvp()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
 
-	if (child_err)
-		fclose(child_err);
 	free(childenv);
 	argv_array_clear(&argv);
 }
-- 
EW

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 6/5] run-command: avoid potential dangers in forked child
  2017-04-11  7:05 ` [PATCH 6/5] run-command: avoid potential dangers in forked child Eric Wong
@ 2017-04-11 16:29   ` Brandon Williams
  2017-04-11 16:59     ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-11 16:29 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On 04/11, Eric Wong wrote:
> Hi Brandon, this series tickles an old itch of mine, so I
> started working off of it.  I'm only somewhat concerned
> with the path resolution in execvp(e) pontentially calling
> malloc on some libcs; but I suppose that's a separate patch
> for another time.
> 
> Only lightly-tested at the moment, but things seem to work...

Thanks Eric! I'll spend some time looking at this patch later today.  As
for the path resolution in execvp(e), I guess we could completely avoid
that if we did the path resolution ourselves, prior to forking, and then
just use execv(e) since it shouldn't have any calls to malloc in them
correct?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 6/5] run-command: avoid potential dangers in forked child
  2017-04-11 16:29   ` Brandon Williams
@ 2017-04-11 16:59     ` Eric Wong
  2017-04-11 17:17       ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-11 16:59 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Brandon Williams <bmwill@google.com> wrote:
> On 04/11, Eric Wong wrote:
> > Hi Brandon, this series tickles an old itch of mine, so I
> > started working off of it.  I'm only somewhat concerned
> > with the path resolution in execvp(e) pontentially calling
> > malloc on some libcs; but I suppose that's a separate patch
> > for another time.
> > 
> > Only lightly-tested at the moment, but things seem to work...
> 
> Thanks Eric! I'll spend some time looking at this patch later today.  As
> for the path resolution in execvp(e), I guess we could completely avoid
> that if we did the path resolution ourselves, prior to forking, and then
> just use execv(e) since it shouldn't have any calls to malloc in them
> correct?

Yeah.  I spent some time looking at it last night, but emulating
the existing ENOENT / EACCESS / ENOTDIR mapping made my head
hurt.

And I'm not sure if I introduced any off-by-one errors in
exists_in_PATH when removing strbuf usage; string manipulation
in plain C scares me :x   Since memcpy/strcpy/getenv in there
are not specified as async-signal safe, they could
theoretically take locks and cause breakage inside a child.


I also wonder if there's a way to annotate internal functions as
async-signal safe (and thus vfork-child safe) besides sprinkling
comments in certain functions like xwrite.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 6/5] run-command: avoid potential dangers in forked child
  2017-04-11 16:59     ` Eric Wong
@ 2017-04-11 17:17       ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-11 17:17 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On 04/11, Eric Wong wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > On 04/11, Eric Wong wrote:
> > > Hi Brandon, this series tickles an old itch of mine, so I
> > > started working off of it.  I'm only somewhat concerned
> > > with the path resolution in execvp(e) pontentially calling
> > > malloc on some libcs; but I suppose that's a separate patch
> > > for another time.
> > > 
> > > Only lightly-tested at the moment, but things seem to work...
> > 
> > Thanks Eric! I'll spend some time looking at this patch later today.  As
> > for the path resolution in execvp(e), I guess we could completely avoid
> > that if we did the path resolution ourselves, prior to forking, and then
> > just use execv(e) since it shouldn't have any calls to malloc in them
> > correct?
> 
> Yeah.  I spent some time looking at it last night, but emulating
> the existing ENOENT / EACCESS / ENOTDIR mapping made my head
> hurt.
> 
> And I'm not sure if I introduced any off-by-one errors in
> exists_in_PATH when removing strbuf usage; string manipulation
> in plain C scares me :x   Since memcpy/strcpy/getenv in there
> are not specified as async-signal safe, they could
> theoretically take locks and cause breakage inside a child.

Well if we move away from the (p) variant of exec, and do that
resolution ourselves, then we can use the existing
exists_in_PATH/locate_in_PATH logic (using strbufs!) and avoid needing
to do that string manipulation in plain C.  Of course we would then need
to adjust some of the errno handling (e.g. if it doesn't exist in the
path, we would then know that prior to forking so really no need to
handle that in the child anymore).

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 4/5] run-command: prepare child environment before forking
  2017-04-11  0:58   ` Jonathan Nieder
@ 2017-04-11 17:27     ` Brandon Williams
  2017-04-11 17:30       ` Jonathan Nieder
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-11 17:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On 04/10, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > In order to avoid allocation between 'fork()' and 'exec()' prepare the
> > environment to be used in the child process prior to forking.
> 
> If using something like posix_spawn(), this would be needed anyway, so
> I'll review it.
> 
> [...]
> > +++ b/run-command.c
> [...]
> > +static char **prep_childenv(const char *const *deltaenv)
> > +{
> > +	char **childenv;
> > +	int childenv_nr = 0, childenv_alloc = 0;
> > +	int i;
> > +
> > +	for (i = 0; environ[i]; i++)
> > +		childenv_nr++;
> > +	for (i = 0; deltaenv && deltaenv[i]; i++)
> > +		childenv_alloc++;
> > +	/* Add one for the NULL termination */
> > +	childenv_alloc += childenv_nr + 1;
> > +
> > +	childenv = xcalloc(childenv_alloc, sizeof(char *));
> > +	memcpy(childenv, environ, childenv_nr * sizeof(char *));
> > +
> > +	/* merge in deltaenv */
> > +	for (i = 0; deltaenv && deltaenv[i]; i++)
> > +		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
> > +
> > +	return childenv;
> > +}
> 
> This potentially copies around most of 'environ' several times as it
> adjusts for each deltaenv item. Can it be simplified? E.g.

The only time it copies anything is that first memcpy which makes a
duplicate of the environ, and then when an item is being removed in
do_putenv it will move the back of the array to fill in for the entry
being removed.

> 
> 	struct argv_array result = ARGV_ARRAY_INIT;
> 	struct string_list mods = STRING_LIST_INIT_DUP;
> 	struct strbuf key = STRBUF_INIT;
> 	const char **p;
> 
> 	for (p = cmd_env; *p; p++) {
> 		const char *equals = strchr(*p, '=');
> 		if (equals) {
> 			strbuf_reset(&key);
> 			strbuf_add(&key, *p, equals - *p);
> 			string_list_append(&mods, key.buf)->util = *p;
> 		} else {
> 			string_list_append(&mods, *p);
> 		}
> 	}
> 	string_list_sort(&mods);
> 
> 	for (p = environ; *p; p++) {
> 		struct string_list_item *item;
> 		const char *equals = strchr(*p, '=');
> 		if (!equals)
> 			continue;
> 		strbuf_reset(&key);
> 		strbuf_add(&key, *p, equals - *p);
> 		item = string_list_lookup(&mods, key.buf);
> 
> 		if (!item) /* no change */
> 			argv_array_push(&result, *p);
> 		else if (!item->util) /* unsetenv */
> 			; /* skip */
> 		else /* setenv */
> 			argv_array_push(&result, item->util);
> 	}
> 
> 	strbuf_release(&key);
> 	string_list_clear(&mods);
> 	return argv_array_detach(&result);

This is probably still incomplete as I don't see how this accounts for
entries in 'cmd_env' which are being added to the environment and not
just replacing existing ones.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 4/5] run-command: prepare child environment before forking
  2017-04-11 17:27     ` Brandon Williams
@ 2017-04-11 17:30       ` Jonathan Nieder
  0 siblings, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-11 17:30 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Brandon Williams wrote:
> On 04/10, Jonathan Nieder wrote:

>> 	struct argv_array result = ARGV_ARRAY_INIT;
>> 	struct string_list mods = STRING_LIST_INIT_DUP;
>> 	struct strbuf key = STRBUF_INIT;
>> 	const char **p;
>> 
>> 	for (p = cmd_env; *p; p++) {
>> 		const char *equals = strchr(*p, '=');
>> 		if (equals) {
>> 			strbuf_reset(&key);
>> 			strbuf_add(&key, *p, equals - *p);
>> 			string_list_append(&mods, key.buf)->util = *p;
>> 		} else {
>> 			string_list_append(&mods, *p);
>> 		}
>> 	}
>> 	string_list_sort(&mods);
>> 
>> 	for (p = environ; *p; p++) {
>> 		struct string_list_item *item;
>> 		const char *equals = strchr(*p, '=');
>> 		if (!equals)
>> 			continue;
>> 		strbuf_reset(&key);
>> 		strbuf_add(&key, *p, equals - *p);
>> 		item = string_list_lookup(&mods, key.buf);
>> 
>> 		if (!item) /* no change */
>> 			argv_array_push(&result, *p);
>> 		else if (!item->util) /* unsetenv */
>> 			; /* skip */
>> 		else /* setenv */
>> 			argv_array_push(&result, item->util);
>> 	}
>> 
>> 	strbuf_release(&key);
>> 	string_list_clear(&mods);
>> 	return argv_array_detach(&result);
>
> This is probably still incomplete as I don't see how this accounts for
> entries in 'cmd_env' which are being added to the environment and not
> just replacing existing ones.

Yes, that's true.  This sample code is incomplete since it doesn't
handle those.

Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 5/5] run-command: add note about forking and threading
  2017-04-11  0:53     ` Eric Wong
@ 2017-04-11 17:33       ` Jonathan Nieder
  2017-04-11 17:34       ` Brandon Williams
  1 sibling, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-11 17:33 UTC (permalink / raw)
  To: Eric Wong; +Cc: Brandon Williams, git

Eric Wong wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Why can't git use e.g. posix_spawn to avoid this?
>
> posix_spawn does not support chdir, and it seems we run non-git
> commands so no using "git -C" for those.

On the other hand, a number of the non-git commands we run are in a
shell.  At the cost of a wasted shell process, other commands can
be spawned using posix_spawn by passing the chosen directory and
command to

	sh -c 'cd "$1" && shift && exec "$@"' -

[...]
> I posted some notes about it last year:
>
>   https://public-inbox.org/git/20160629200142.GA17878@dcvr.yhbt.net/

Thanks for these notes.

Sincerely,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 5/5] run-command: add note about forking and threading
  2017-04-11  0:53     ` Eric Wong
  2017-04-11 17:33       ` Jonathan Nieder
@ 2017-04-11 17:34       ` Brandon Williams
  2017-04-11 17:40         ` Eric Wong
  1 sibling, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-11 17:34 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jonathan Nieder, git

On 04/11, Eric Wong wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:
> > Why can't git use e.g. posix_spawn to avoid this?
> 
> posix_spawn does not support chdir, and it seems we run non-git
> commands so no using "git -C" for those.

This is actually the biggest reason why I didn't go down that route from
the start.  I didn't want to dig through each and every user of
run-command and verify that removing chdir wouldn't break them (or add
in some other way to do it).

> 
> > fork()-ing in a threaded context is very painful for maintainability.
> > Any library function you are using could start taking a lock, and then
> > you have a deadlock.  So you have to make use of a very small
> > whitelisted list of library functions for this to work.
> 
> Completely agreed.

Yes it is difficult to get right, but it seems very doable to just do
all of the heavy-lifting prior to fork/exec making it easier to just use
async-safe between the fork/exec in the child.

> 
> On the other hand, I believe we should make run-command
> vfork-compatible (and Brandon's series is a big (but incomplete)
> step in the (IMHO) right direction); as anything which is
> vfork-safe would also be safe in the presence of threads+(plain) fork.
> With vfork; the two processes share heap until execve.

I haven't looked to much into vfork, one of the benefits of vfork is
that it is slightly more preferment than vanilla fork correct?  What are
some of the other benefits of using vfork over fork?

> 
> I posted some notes about it last year:
> 
>   https://public-inbox.org/git/20160629200142.GA17878@dcvr.yhbt.net/
> 
> > The function calls you have to audit are not only between fork() and
> > exec() in the normal control flow.  You have to worry about signal
> > handlers, too.
> 
> Yes, all that auditing is necessary for vfork; too, but totally
> doable.  The mainline Ruby implementation has been using vfork
> for spawning subprocesses for several years, now; and I think the
> ruby-core developers (myself included) have fixed all the
> problems with it; even in multi-threaded code which calls malloc.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 0/5] forking and threading
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
                   ` (5 preceding siblings ...)
  2017-04-11  7:05 ` [PATCH 6/5] run-command: avoid potential dangers in forked child Eric Wong
@ 2017-04-11 17:37 ` Jonathan Nieder
  2017-04-11 17:54   ` Brandon Williams
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
  7 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-11 17:37 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Jonathan Tan

Brandon Williams wrote:

> As far as I understand the only instance of threading and forking which exists
> in the current code base is 'git grep --recurse-submodules', and the standard
> builds against glibc shouldn't exhibit any of this deadlocking.

I don't think we consider builds against glibc to be the only standard
way to build git.  So I do think we need to fix this (and not, e.g.,
to modify our build instructions to require use of a malloc
implementation that registers an atfork handler that unlocks all of
its locks).  Thanks for your work on that.

Jonathan Tan had an idea about how to side-step the issue: what if
"grep" forks an appropriate set of child processes before creating any
threads and then communicates with those children using pipes?
Because no threads have been spawned yet in the children, the child
processes can use ordinary run_command.  When run_command finishes,
the child is still available to launch another command using
run_command.

Thoughts?
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 5/5] run-command: add note about forking and threading
  2017-04-11 17:34       ` Brandon Williams
@ 2017-04-11 17:40         ` Eric Wong
  0 siblings, 0 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-11 17:40 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Jonathan Nieder, git

Brandon Williams <bmwill@google.com> wrote:
> On 04/11, Eric Wong wrote:
> > On the other hand, I believe we should make run-command
> > vfork-compatible (and Brandon's series is a big (but incomplete)
> > step in the (IMHO) right direction); as anything which is
> > vfork-safe would also be safe in the presence of threads+(plain) fork.
> > With vfork; the two processes share heap until execve.
> 
> I haven't looked to much into vfork, one of the benefits of vfork is
> that it is slightly more preferment than vanilla fork correct?  What are
> some of the other benefits of using vfork over fork?

Yes, mainly performance and perhaps portability...  Last I
checked (over a decade ago); uCLinux without MMU could not
fork processes; only vfork.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 0/5] forking and threading
  2017-04-11 17:37 ` [PATCH 0/5] forking and threading Jonathan Nieder
@ 2017-04-11 17:54   ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-11 17:54 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jonathan Tan

On 04/11, Jonathan Nieder wrote:
> Brandon Williams wrote:
> Jonathan Tan had an idea about how to side-step the issue: what if
> "grep" forks an appropriate set of child processes before creating any
> threads and then communicates with those children using pipes?
> Because no threads have been spawned yet in the children, the child
> processes can use ordinary run_command.  When run_command finishes,
> the child is still available to launch another command using
> run_command.

While that would be one way to solve the issue, I think that doing that
would require more work refactoring and make grep's code path more
complex than making the adjustments in run-command itself.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe
  2017-04-10 23:49 ` [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe Brandon Williams
@ 2017-04-12 19:22   ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-12 19:22 UTC (permalink / raw)
  To: git

On 04/10, Brandon Williams wrote:
> Convert 'sane_execvp()' to 'sane_execvpe()' which optionally takes a
> pointer to an array of 'char *' which should be used as the environment
> for the process being exec'd.  If no environment is provided (by passing
> NULL instead) then the already existing environment, as stored in
> 'environ', will be used.

Turns out we probably can't use execvpe since it isn't portable.  From
some of the other discussion it makes more sense to just move to using
execve instead.

> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  cache.h       |  3 +--
>  exec_cmd.c    |  2 +-
>  run-command.c | 15 ++++++++++-----
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index 5c8078291..10d40ecae 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -2185,8 +2185,7 @@ int checkout_fast_forward(const unsigned char *from,
>  			  const unsigned char *to,
>  			  int overwrite_ignore);
>  
> -
> -int sane_execvp(const char *file, char *const argv[]);
> +int sane_execvpe(const char *file, char *const argv[], char *const envp[]);
>  
>  /*
>   * A struct to encapsulate the concept of whether a file has changed
> diff --git a/exec_cmd.c b/exec_cmd.c
> index fb94aeba9..c375f354d 100644
> --- a/exec_cmd.c
> +++ b/exec_cmd.c
> @@ -118,7 +118,7 @@ int execv_git_cmd(const char **argv) {
>  	trace_argv_printf(nargv.argv, "trace: exec:");
>  
>  	/* execvp() can only ever return if it fails */
> -	sane_execvp("git", (char **)nargv.argv);
> +	sane_execvpe("git", (char **)nargv.argv, NULL);
>  
>  	trace_printf("trace: exec failed: %s\n", strerror(errno));
>  
> diff --git a/run-command.c b/run-command.c
> index 574b81d3e..682bc3ca5 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -168,10 +168,15 @@ static int exists_in_PATH(const char *file)
>  	return r != NULL;
>  }
>  
> -int sane_execvp(const char *file, char * const argv[])
> +int sane_execvpe(const char *file, char * const argv[], char *const envp[])
>  {
> -	if (!execvp(file, argv))
> -		return 0; /* cannot happen ;-) */
> +	if (envp) {
> +		if (!execvpe(file, argv, envp))
> +			return 0; /* cannot happen ;-) */
> +	} else {
> +		if (!execvp(file, argv))
> +			return 0; /* cannot happen ;-) */
> +	}
>  
>  	/*
>  	 * When a command can't be found because one of the directories
> @@ -226,7 +231,7 @@ static int execv_shell_cmd(const char **argv)
>  	struct argv_array nargv = ARGV_ARRAY_INIT;
>  	prepare_shell_cmd(&nargv, argv);
>  	trace_argv_printf(nargv.argv, "trace: exec:");
> -	sane_execvp(nargv.argv[0], (char **)nargv.argv);
> +	sane_execvpe(nargv.argv[0], (char **)nargv.argv, NULL);
>  	argv_array_clear(&nargv);
>  	return -1;
>  }
> @@ -442,7 +447,7 @@ int start_command(struct child_process *cmd)
>  		else if (cmd->use_shell)
>  			execv_shell_cmd(cmd->argv);
>  		else
> -			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
> +			sane_execvpe(cmd->argv[0], (char *const*) cmd->argv, NULL);
>  		if (errno == ENOENT) {
>  			if (!cmd->silent_exec_failure)
>  				error("cannot run %s: %s", cmd->argv[0],
> -- 
> 2.12.2.715.g7642488e1d-goog
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 0/6] forking and threading
  2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
                   ` (6 preceding siblings ...)
  2017-04-11 17:37 ` [PATCH 0/5] forking and threading Jonathan Nieder
@ 2017-04-13 18:32 ` " Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 1/6] t5550: use write_script to generate post-update hook Brandon Williams
                     ` (8 more replies)
  7 siblings, 9 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

v2 does a bit of restructuring based on comments from reviewers.  I took the
patch by Eric and broke it up and tweaked it a bit to flow better with v2.  I
left out the part of Eric's patch which did signal manipulation as I wasn't
experienced enough to know what it was doing or why it was necessary.  Though I
believe the code is structured in such a way that Eric could make another patch
on top of this series with just the signal changes.

I switched to using 'execve' instead of 'execvpe' because 'execvpe' isn't a
portable call and doesn't work on systems like macOS.  This means that the path
resolution needs to be done by hand before forking (which there already existed
a function to do just that).

From what I can see, there are now no calls in the child process (after fork
and before exec/_exit) which are not Async-Signal-Safe.  This means that
fork/exec in a threaded context should work without deadlock and we could
potentially move to using vfork instead of fork, though I'll let others more
experienced make that decision.

Brandon Williams (6):
  t5550: use write_script to generate post-update hook
  run-command: prepare command before forking
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: add note about forking and threading

 run-command.c              | 291 ++++++++++++++++++++++++++++++++++-----------
 t/t5550-http-fetch-dumb.sh |   5 +-
 2 files changed, 223 insertions(+), 73 deletions(-)

-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 1/6] t5550: use write_script to generate post-update hook
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
@ 2017-04-13 18:32   ` Brandon Williams
  2017-04-13 20:43     ` Jonathan Nieder
  2017-04-13 18:32   ` [PATCH v2 2/6] run-command: prepare command before forking Brandon Williams
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects'
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 	 git config core.bare true &&
 	 mkdir -p hooks &&
-	 echo "exec git update-server-info" >hooks/post-update &&
-	 chmod +x hooks/post-update &&
+	 write_script "hooks/post-update" <<-\EOF &&
+	 exec git update-server-info
+	EOF
 	 hooks/post-update
 	) &&
 	git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 2/6] run-command: prepare command before forking
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 1/6] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-13 18:32   ` Brandon Williams
  2017-04-13 21:14     ` Jonathan Nieder
  2017-04-13 18:32   ` [PATCH v2 3/6] run-command: prepare child environment " Brandon Williams
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

In order to avoid allocation between 'fork()' and 'exec()' the argv
array used in the exec call is prepared prior to forking the process.

In addition to this, the function used to exec is changed from
'execvp()' to 'execv()' as the (p) variant of exec has the potential to
call malloc during the path resolution it performs.  Instead we simply
do the path resolution ourselves during the preparation stage prior to
forking.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 60 +++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..9ee9fde97 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,35 @@ static void notify_parent(void)
 	 */
 	xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
+{
+	if (!cmd->argv[0])
+		die("BUG: command is empty");
+
+	if (cmd->git_cmd) {
+		argv_array_push(out, "git");
+		argv_array_pushv(out, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(out, cmd->argv);
+	} else {
+		argv_array_pushv(out, cmd->argv);
+	}
+
+	/*
+	 * If there are no '/' characters in the command then perform a path
+	 * lookup and use the resolved path as the command to exec.  If there
+	 * are no '/' characters or if the command wasn't found in the path,
+	 * have exec attempt to invoke the command directly.
+	 */
+	if (!strchr(out->argv[0], '/')) {
+		char *program = locate_in_PATH(out->argv[0]);
+		if (program) {
+			free((char *)out->argv[0]);
+			out->argv[0] = program;
+		}
+	}
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +389,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	prepare_cmd(&argv, cmd);
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -437,12 +458,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+		execv(argv.argv[0], (char *const *) argv.argv);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +476,7 @@ int start_command(struct child_process *cmd)
 		mark_child_for_cleanup(cmd->pid, cmd);
 
 	/*
-	 * Wait for child's execvp. If the execvp succeeds (or if fork()
+	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
 	 * child process sends a single byte.
 	 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +485,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[1]);
 	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
 		/*
-		 * At this point we know that fork() succeeded, but execvp()
+		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +493,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 3/6] run-command: prepare child environment before forking
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 1/6] t5550: use write_script to generate post-update hook Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 2/6] run-command: prepare command before forking Brandon Williams
@ 2017-04-13 18:32   ` " Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 4/6] run-command: don't die in child when duping /dev/null Brandon Williams
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 74 insertions(+), 9 deletions(-)

diff --git a/run-command.c b/run-command.c
index 9ee9fde97..5e2a03145 100644
--- a/run-command.c
+++ b/run-command.c
@@ -261,6 +261,75 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 		}
 	}
 }
+
+static int env_isequal(const char *e1, const char *e2)
+{
+	for (;;) {
+		char c1 = *e1++;
+		char c2 = *e2++;
+		c1 = (c1 == '=') ? '\0' : tolower(c1);
+		c2 = (c2 == '=') ? '\0' : tolower(c2);
+
+		if (c1 != c2)
+			return 0;
+		if (c1 == '\0')
+			return 1;
+	}
+}
+
+static int searchenv(char **env, const char *name)
+{
+	int pos = 0;
+
+	for (; env[pos]; pos++)
+		if (env_isequal(env[pos], name))
+			break;
+
+	return pos;
+}
+
+static int do_putenv(char **env, int env_nr, const char *name)
+{
+	int pos = searchenv(env, name);
+
+	if (strchr(name, '=')) {
+		/* ('key=value'), insert of replace entry */
+		if (pos >= env_nr)
+			env_nr++;
+		env[pos] = (char *) name;
+	} else if (pos < env_nr) {
+		/* otherwise ('key') remove existing entry */
+		env_nr--;
+		memmove(&env[pos], &env[pos + 1],
+			(env_nr - pos) * sizeof(char *));
+		env[env_nr] = NULL;
+	}
+
+	return env_nr;
+}
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	char **childenv;
+	int childenv_nr = 0, childenv_alloc = 0;
+	int i;
+
+	for (i = 0; environ[i]; i++)
+		childenv_nr++;
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_alloc++;
+	/* Add one for the NULL termination */
+	childenv_alloc += childenv_nr + 1;
+
+	childenv = xcalloc(childenv_alloc, sizeof(char *));
+	memcpy(childenv, environ, childenv_nr * sizeof(char *));
+
+	/* merge in deltaenv */
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
+
+	return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -389,12 +458,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	prepare_cmd(&argv, cmd);
+	childenv = prep_childenv(cmd->env);
 
 	cmd->pid = fork();
 	failed_errno = errno;
@@ -450,16 +521,9 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
-		execv(argv.argv[0], (char *const *) argv.argv);
+		execve(argv.argv[0], (char *const *) argv.argv,
+		       (char *const *) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -495,6 +559,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[0]);
 
 	argv_array_clear(&argv);
+	free(childenv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 4/6] run-command: don't die in child when duping /dev/null
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (2 preceding siblings ...)
  2017-04-13 18:32   ` [PATCH v2 3/6] run-command: prepare child environment " Brandon Williams
@ 2017-04-13 18:32   ` Brandon Williams
  2017-04-13 19:29     ` Eric Wong
  2017-04-13 18:32   ` [PATCH v2 5/6] run-command: eliminate calls to error handling functions in child Brandon Williams
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 5e2a03145..6751b8319 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -458,12 +446,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC | O_NONBLOCK);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
+	}
+
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
@@ -487,7 +483,7 @@ int start_command(struct child_process *cmd)
 		atexit(notify_parent);
 
 		if (cmd->no_stdin)
-			dup_devnull(0);
+			dup2(null_fd, 0);
 		else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
@@ -497,7 +493,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stderr)
-			dup_devnull(2);
+			dup2(null_fd, 2);
 		else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
@@ -507,7 +503,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stdout)
-			dup_devnull(1);
+			dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
@@ -558,6 +554,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (null_fd > 0)
+		close(null_fd);
 	argv_array_clear(&argv);
 	free(childenv);
 }
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 5/6] run-command: eliminate calls to error handling functions in child
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (3 preceding siblings ...)
  2017-04-13 18:32   ` [PATCH v2 4/6] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-13 18:32   ` Brandon Williams
  2017-04-13 18:32   ` [PATCH v2 6/6] run-command: add note about forking and threading Brandon Williams
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 121 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index 6751b8319..4230c4933 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_SILENT,
+	CHILD_ERR_ERRNO,
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+			    cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+		break;
+	case CHILD_ERR_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+		break;
+	}
+	set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
@@ -355,13 +423,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -449,6 +510,7 @@ int start_command(struct child_process *cmd)
 	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	struct child_err cerr;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -467,20 +529,16 @@ int start_command(struct child_process *cmd)
 	failed_errno = errno;
 	if (!cmd->pid) {
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * Ensure the default die/error/warn routines do not get
+		 * called, they can take stdio locks and malloc.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
 		if (cmd->no_stdin)
 			dup2(null_fd, 0);
@@ -515,19 +573,17 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
 
 		execve(argv.argv[0], (char *const *) argv.argv,
 		       (char *const *) childenv);
 
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
 	if (cmd->pid < 0)
@@ -538,17 +594,18 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v2 6/6] run-command: add note about forking and threading
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (4 preceding siblings ...)
  2017-04-13 18:32   ` [PATCH v2 5/6] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-13 18:32   ` Brandon Williams
  2017-04-13 20:50   ` [PATCH v2 0/6] " Jonathan Nieder
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 18:32 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index 4230c4933..1c36e692d 100644
--- a/run-command.c
+++ b/run-command.c
@@ -525,6 +525,15 @@ int start_command(struct child_process *cmd)
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.  This means only
+	 * Async-Signal-Safe functions are permitted in the child.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 4/6] run-command: don't die in child when duping /dev/null
  2017-04-13 18:32   ` [PATCH v2 4/6] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-13 19:29     ` Eric Wong
  2017-04-13 19:43       ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-13 19:29 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> @@ -487,7 +483,7 @@ int start_command(struct child_process *cmd)
>  		atexit(notify_parent);
>  
>  		if (cmd->no_stdin)
> -			dup_devnull(0);
> +			dup2(null_fd, 0);

I prefer we keep error checking for dup2 failures,
and also add more error checking for unchecked dup2 calls.
Can be a separate patch, I suppose.

Ditto for other dup2 changes

> @@ -558,6 +554,8 @@ int start_command(struct child_process *cmd)
>  	}
>  	close(notify_pipe[0]);
>  
> +	if (null_fd > 0)
> +		close(null_fd);

I would prefer:

	if (null_fd >= 0)

here, even if we currently do not release stdin.

>  	argv_array_clear(&argv);
>  	free(childenv);

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 4/6] run-command: don't die in child when duping /dev/null
  2017-04-13 19:29     ` Eric Wong
@ 2017-04-13 19:43       ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 19:43 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder

On 04/13, Eric Wong wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > @@ -487,7 +483,7 @@ int start_command(struct child_process *cmd)
> >  		atexit(notify_parent);
> >  
> >  		if (cmd->no_stdin)
> > -			dup_devnull(0);
> > +			dup2(null_fd, 0);
> 
> I prefer we keep error checking for dup2 failures,
> and also add more error checking for unchecked dup2 calls.
> Can be a separate patch, I suppose.
> 
> Ditto for other dup2 changes

I simply figured that since we weren't doing the checks on the other
dup2 calls that I would keep it consistent.  But if we wanted to add in
more error checking then we can can in another patch.

> 
> > @@ -558,6 +554,8 @@ int start_command(struct child_process *cmd)
> >  	}
> >  	close(notify_pipe[0]);
> >  
> > +	if (null_fd > 0)
> > +		close(null_fd);
> 
> I would prefer:
> 
> 	if (null_fd >= 0)
> 
> here, even if we currently do not release stdin.

K will do.

> 
> >  	argv_array_clear(&argv);
> >  	free(childenv);

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 1/6] t5550: use write_script to generate post-update hook
  2017-04-13 18:32   ` [PATCH v2 1/6] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-13 20:43     ` Jonathan Nieder
  2017-04-13 20:59       ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-13 20:43 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e

Hi,

Brandon Williams wrote:

> The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
> "!#/bin/sh" line which can cause issues with portability.  Instead
> create the hook using the 'write_script' function which includes the
> proper "#!" line.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  t/t5550-http-fetch-dumb.sh | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

This would allow later patches to regress a previously supported
behavior.

I agree that it's silly to test that behavior as a side-effect of this
unrelated test, but I don't think we want to lose the test coverage.

Thanks for tracking this down and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 0/6] forking and threading
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (5 preceding siblings ...)
  2017-04-13 18:32   ` [PATCH v2 6/6] run-command: add note about forking and threading Brandon Williams
@ 2017-04-13 20:50   ` " Jonathan Nieder
  2017-04-13 23:44     ` Brandon Williams
  2017-04-13 21:14   ` [PATCH 7/6] run-command: block signals between fork and execve Eric Wong
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
  8 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-13 20:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e

Brandon Williams wrote:

> From what I can see, there are now no calls in the child process (after fork
> and before exec/_exit) which are not Async-Signal-Safe.  This means that
> fork/exec in a threaded context should work without deadlock

I don't see why the former implies the latter.  Can you explain
further?

You already know my opinions about fork+threads by now.  I continue to
think this is heading in a direction of decreased maintainability that
I dread.

That's not to say that this is wasted work.  I would prefer an approach
like the following:

 1. First, make grep work without running fork() from threads.
    Jonathan Tan's approach would be one way to do this.  Another way
    would be to simply disable threads in --recurse-submodules mode.

    This would be the first thing to do because it would make tests
    reliable again, without having to wait for deeper changes.

 2. Then, teaching run_command to prepare the environment and do $PATH
    lookup before forking.  This might make it possible for run_command
    to use vfork or might not.

 3. Teaching run_command to delegate chdir to the child, using -C for
    git commands and 'cd' for shell commands, and using a shell where
    necessary where it didn't before.

 4. Switching run_command to use posix_spawn on platforms where it is
    available, which would make it possible to use in a threaded
    context on those platforms.

Thoughts?
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 1/6] t5550: use write_script to generate post-update hook
  2017-04-13 20:43     ` Jonathan Nieder
@ 2017-04-13 20:59       ` Eric Wong
  2017-04-13 21:35         ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-13 20:59 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Brandon Williams wrote:
> > The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
> > "!#/bin/sh" line which can cause issues with portability.  Instead
> > create the hook using the 'write_script' function which includes the
> > proper "#!" line.

> This would allow later patches to regress a previously supported
> behavior.
> 
> I agree that it's silly to test that behavior as a side-effect of this
> unrelated test, but I don't think we want to lose the test coverage.

I was about to write something similar about this regression.
The new execve-using code should handle ENOEXEC as execvpe does
and probably a new test for it needs to be written.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH 7/6] run-command: block signals between fork and execve
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (6 preceding siblings ...)
  2017-04-13 20:50   ` [PATCH v2 0/6] " Jonathan Nieder
@ 2017-04-13 21:14   ` Eric Wong
  2017-04-13 23:37     ` Brandon Williams
  2017-04-14  2:42     ` Brandon Williams
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
  8 siblings, 2 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-13 21:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> v2 does a bit of restructuring based on comments from reviewers.  I took the
> patch by Eric and broke it up and tweaked it a bit to flow better with v2.  I
> left out the part of Eric's patch which did signal manipulation as I wasn't
> experienced enough to know what it was doing or why it was necessary.  Though I
> believe the code is structured in such a way that Eric could make another patch
> on top of this series with just the signal changes.

Yeah, I think a separate commit message might be necessary to
explain the signal changes.

-------8<-----
Subject: [PATCH] run-command: block signals between fork and execve

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong <e@80x24.org>
---
  Changes from my original in <20170411070534.GA10552@whir>:

  - fixed typo in NO_PTHREADS code

  - dropped fflush(NULL) before fork, consider us screwed anyways
    if the child uses stdio

  - respect SIG_IGN in child; that seems to be the prevailing
    wisdom from reading https://ewontfix.com/7/ and process.c
    in ruby (git clone https://github.com/ruby/ruby.git)

 run-command.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/run-command.c b/run-command.c
index 1c36e692d..59a8b4806 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,7 @@ static int child_notifier = -1;
 
 enum child_errcode {
 	CHILD_ERR_CHDIR,
+	CHILD_ERR_SIGPROCMASK,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO,
@@ -277,6 +278,8 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 		error_errno("exec '%s': cd to '%s' failed",
 			    cmd->argv[0], cmd->dir);
 		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -388,6 +391,53 @@ static char **prep_childenv(const char *const *deltaenv)
 }
 #endif
 
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
+#endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+
 static inline void set_cloexec(int fd)
 {
 	int flags = fcntl(fd, F_GETFD);
@@ -511,6 +561,7 @@ int start_command(struct child_process *cmd)
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -524,6 +575,7 @@ int start_command(struct child_process *cmd)
 
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
+	atfork_prepare(&as);
 
 	/*
 	 * NOTE: In order to prevent deadlocking when using threads special
@@ -537,6 +589,7 @@ int start_command(struct child_process *cmd)
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
 		/*
 		 * Ensure the default die/error/warn routines do not get
 		 * called, they can take stdio locks and malloc.
@@ -584,6 +637,21 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			child_die(CHILD_ERR_CHDIR);
 
+		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before execve below
+		 */
+		for (sig = 1; sig < NSIG; sig++) {
+			sighandler_t old = signal(sig, SIG_DFL);
+
+			/* ignored signals get reset to SIG_DFL on execve */
+			if (old == SIG_IGN)
+				signal(sig, SIG_IGN);
+		}
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
+
 		execve(argv.argv[0], (char *const *) argv.argv,
 		       (char *const *) childenv);
 
@@ -595,6 +663,7 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
-- 
EW

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 2/6] run-command: prepare command before forking
  2017-04-13 18:32   ` [PATCH v2 2/6] run-command: prepare command before forking Brandon Williams
@ 2017-04-13 21:14     ` Jonathan Nieder
  2017-04-13 22:41       ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-13 21:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e

Hi,

Brandon Williams wrote:

> In order to avoid allocation between 'fork()' and 'exec()' the argv
> array used in the exec call is prepared prior to forking the process.

nit: s/(the argv array.*) is prepared/prepare \1/

Git's commit messages are in the imperative mood, as if they are
ordering the code or the computer to do something.

More importantly, the commit message is a good place to explain some
of the motivation behind the patch so that people can understand what
the patch is for by reading it without having to dig into mailing list
archives and get the discussion there.

E.g. this could say

- that grep tests are intermittently failing in configurations using
  some versions of tcmalloc

- that the cause is interaction between fork and threads: malloc holds
  a lock that those versions of tcmalloc doesn't release in a
  pthread_atfork handler

- that according to [1] we need to only call async-signal-safe
  operations between fork and exec.  Using malloc to build the argv
  array isn't async-signal-safe

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

> In addition to this, the function used to exec is changed from
> 'execvp()' to 'execv()' as the (p) variant of exec has the potential to
> call malloc during the path resolution it performs.

*puzzled* is execvp actually allowed to call malloc?

Could this part go in a separate patch?  That would make it easier to
review.

[...]
> +++ b/run-command.c
[...]
> +	/*
> +	 * If there are no '/' characters in the command then perform a path
> +	 * lookup and use the resolved path as the command to exec.  If there
> +	 * are no '/' characters or if the command wasn't found in the path,
> +	 * have exec attempt to invoke the command directly.
> +	 */
> +	if (!strchr(out->argv[0], '/')) {
> +		char *program = locate_in_PATH(out->argv[0]);
> +		if (program) {
> +			free((char *)out->argv[0]);
> +			out->argv[0] = program;
> +		}
> +	}

This does one half of what execvp does but leaves out the other half.
http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html
explains:

  There are two distinct ways in which the contents of the process
  image file may cause the execution to fail, distinguished by the
  setting of errno to either [ENOEXEC] or [EINVAL] (see the ERRORS
  section). In the cases where the other members of the exec family of
  functions would fail and set errno to [ENOEXEC], the execlp() and
  execvp() functions shall execute a command interpreter and the
  environment of the executed command shall be as if the process
  invoked the sh utility using execl() as follows:

  execl(<shell path>, arg0, file, arg1, ..., (char *)0);

I think this is what the first patch in the series was about.  Do we
want to drop that support?

I think we need to keep it, since it is easy for authors of e.g.
credential helpers to accidentally rely on it.

[...]
> @@ -437,12 +458,9 @@ int start_command(struct child_process *cmd)
>  					unsetenv(*cmd->env);
>  			}
>  		}
> -		if (cmd->git_cmd)
> -			execv_git_cmd(cmd->argv);
> -		else if (cmd->use_shell)
> -			execv_shell_cmd(cmd->argv);
> -		else
> -			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
> +
> +		execv(argv.argv[0], (char *const *) argv.argv);

What happens in the case sane_execvp was trying to handle?  Does
prepare_cmd need error handling for when the command isn't found?

Sorry this got so fussy.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 1/6] t5550: use write_script to generate post-update hook
  2017-04-13 20:59       ` Eric Wong
@ 2017-04-13 21:35         ` Brandon Williams
  2017-04-13 21:39           ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 21:35 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jonathan Nieder, git

On 04/13, Eric Wong wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:
> > Brandon Williams wrote:
> > > The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
> > > "!#/bin/sh" line which can cause issues with portability.  Instead
> > > create the hook using the 'write_script' function which includes the
> > > proper "#!" line.
> 
> > This would allow later patches to regress a previously supported
> > behavior.
> > 
> > I agree that it's silly to test that behavior as a side-effect of this
> > unrelated test, but I don't think we want to lose the test coverage.
> 
> I was about to write something similar about this regression.
> The new execve-using code should handle ENOEXEC as execvpe does
> and probably a new test for it needs to be written.

Would it be enough to upon seeing a failed exec call and ENOEXEC to
retry a single time, invoking the shell to attempt to interpret the
command?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 1/6] t5550: use write_script to generate post-update hook
  2017-04-13 21:35         ` Brandon Williams
@ 2017-04-13 21:39           ` Eric Wong
  0 siblings, 0 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-13 21:39 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Jonathan Nieder, git

Brandon Williams <bmwill@google.com> wrote:
> On 04/13, Eric Wong wrote:
> > Jonathan Nieder <jrnieder@gmail.com> wrote:
> > > Brandon Williams wrote:
> > > > The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
> > > > "!#/bin/sh" line which can cause issues with portability.  Instead
> > > > create the hook using the 'write_script' function which includes the
> > > > proper "#!" line.
> > 
> > > This would allow later patches to regress a previously supported
> > > behavior.
> > > 
> > > I agree that it's silly to test that behavior as a side-effect of this
> > > unrelated test, but I don't think we want to lose the test coverage.
> > 
> > I was about to write something similar about this regression.
> > The new execve-using code should handle ENOEXEC as execvpe does
> > and probably a new test for it needs to be written.
> 
> Would it be enough to upon seeing a failed exec call and ENOEXEC to
> retry a single time, invoking the shell to attempt to interpret the
> command?

Yes, that's exactly what glibc does.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 2/6] run-command: prepare command before forking
  2017-04-13 21:14     ` Jonathan Nieder
@ 2017-04-13 22:41       ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 22:41 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, e

On 04/13, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > In order to avoid allocation between 'fork()' and 'exec()' the argv
> > array used in the exec call is prepared prior to forking the process.
> 
> nit: s/(the argv array.*) is prepared/prepare \1/
> 
> Git's commit messages are in the imperative mood, as if they are
> ordering the code or the computer to do something.
> 
> More importantly, the commit message is a good place to explain some
> of the motivation behind the patch so that people can understand what
> the patch is for by reading it without having to dig into mailing list
> archives and get the discussion there.
> 
> E.g. this could say
> 
> - that grep tests are intermittently failing in configurations using
>   some versions of tcmalloc
> 
> - that the cause is interaction between fork and threads: malloc holds
>   a lock that those versions of tcmalloc doesn't release in a
>   pthread_atfork handler
> 
> - that according to [1] we need to only call async-signal-safe
>   operations between fork and exec.  Using malloc to build the argv
>   array isn't async-signal-safe
> 
> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html
> 
> > In addition to this, the function used to exec is changed from
> > 'execvp()' to 'execv()' as the (p) variant of exec has the potential to
> > call malloc during the path resolution it performs.
> 
> *puzzled* is execvp actually allowed to call malloc?

It could possible as it isn't async-signal-safe.

> 
> Could this part go in a separate patch?  That would make it easier to
> review.

I'll break this conversion out to a different patch.

> 
> [...]
> > +++ b/run-command.c
> [...]
> > +	/*
> > +	 * If there are no '/' characters in the command then perform a path
> > +	 * lookup and use the resolved path as the command to exec.  If there
> > +	 * are no '/' characters or if the command wasn't found in the path,
> > +	 * have exec attempt to invoke the command directly.
> > +	 */
> > +	if (!strchr(out->argv[0], '/')) {
> > +		char *program = locate_in_PATH(out->argv[0]);
> > +		if (program) {
> > +			free((char *)out->argv[0]);
> > +			out->argv[0] = program;
> > +		}
> > +	}
> 
> This does one half of what execvp does but leaves out the other half.
> http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html
> explains:
> 
>   There are two distinct ways in which the contents of the process
>   image file may cause the execution to fail, distinguished by the
>   setting of errno to either [ENOEXEC] or [EINVAL] (see the ERRORS
>   section). In the cases where the other members of the exec family of
>   functions would fail and set errno to [ENOEXEC], the execlp() and
>   execvp() functions shall execute a command interpreter and the
>   environment of the executed command shall be as if the process
>   invoked the sh utility using execl() as follows:
> 
>   execl(<shell path>, arg0, file, arg1, ..., (char *)0);
> 
> I think this is what the first patch in the series was about.  Do we
> want to drop that support?
> 
> I think we need to keep it, since it is easy for authors of e.g.
> credential helpers to accidentally rely on it.
> 
> [...]
> > @@ -437,12 +458,9 @@ int start_command(struct child_process *cmd)
> >  					unsetenv(*cmd->env);
> >  			}
> >  		}
> > -		if (cmd->git_cmd)
> > -			execv_git_cmd(cmd->argv);
> > -		else if (cmd->use_shell)
> > -			execv_shell_cmd(cmd->argv);
> > -		else
> > -			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
> > +
> > +		execv(argv.argv[0], (char *const *) argv.argv);
> 
> What happens in the case sane_execvp was trying to handle?  Does
> prepare_cmd need error handling for when the command isn't found?
> 
> Sorry this got so fussy.
> 
> Thanks and hope that helps,
> Jonathan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 7/6] run-command: block signals between fork and execve
  2017-04-13 21:14   ` [PATCH 7/6] run-command: block signals between fork and execve Eric Wong
@ 2017-04-13 23:37     ` Brandon Williams
  2017-04-14  2:42     ` Brandon Williams
  1 sibling, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 23:37 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder

On 04/13, Eric Wong wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > v2 does a bit of restructuring based on comments from reviewers.  I took the
> > patch by Eric and broke it up and tweaked it a bit to flow better with v2.  I
> > left out the part of Eric's patch which did signal manipulation as I wasn't
> > experienced enough to know what it was doing or why it was necessary.  Though I
> > believe the code is structured in such a way that Eric could make another patch
> > on top of this series with just the signal changes.
> 
> Yeah, I think a separate commit message might be necessary to
> explain the signal changes.

Perfect!  I'll carry the changes along in the reroll.

> 
> -------8<-----
> Subject: [PATCH] run-command: block signals between fork and execve
> 
> Signal handlers of the parent firing in the forked child may
> have unintended side effects.  Rather than auditing every signal
> handler we have and will ever have, block signals while forking
> and restore default signal handlers in the child before execve.
> 
> Restoring default signal handlers is required because
> execve does not unblock signals, it only restores default
> signal handlers.  So we must restore them with sigprocmask
> before execve, leaving a window when signal handlers
> we control can fire in the child.  Continue ignoring
> ignored signals, but reset the rest to defaults.
> 
> Similarly, disable pthread cancellation to future-proof our code
> in case we start using cancellation; as cancellation is
> implemented with signals in glibc.
> 
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>   Changes from my original in <20170411070534.GA10552@whir>:
> 
>   - fixed typo in NO_PTHREADS code
> 
>   - dropped fflush(NULL) before fork, consider us screwed anyways
>     if the child uses stdio
> 
>   - respect SIG_IGN in child; that seems to be the prevailing
>     wisdom from reading https://ewontfix.com/7/ and process.c
>     in ruby (git clone https://github.com/ruby/ruby.git)
> 
>  run-command.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 69 insertions(+)
> 
> diff --git a/run-command.c b/run-command.c
> index 1c36e692d..59a8b4806 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -213,6 +213,7 @@ static int child_notifier = -1;
>  
>  enum child_errcode {
>  	CHILD_ERR_CHDIR,
> +	CHILD_ERR_SIGPROCMASK,
>  	CHILD_ERR_ENOENT,
>  	CHILD_ERR_SILENT,
>  	CHILD_ERR_ERRNO,
> @@ -277,6 +278,8 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
>  		error_errno("exec '%s': cd to '%s' failed",
>  			    cmd->argv[0], cmd->dir);
>  		break;
> +	case CHILD_ERR_SIGPROCMASK:
> +		error_errno("sigprocmask failed restoring signals");
>  	case CHILD_ERR_ENOENT:
>  		error_errno("cannot run %s", cmd->argv[0]);
>  		break;
> @@ -388,6 +391,53 @@ static char **prep_childenv(const char *const *deltaenv)
>  }
>  #endif
>  
> +struct atfork_state {
> +#ifndef NO_PTHREADS
> +	int cs;
> +#endif
> +	sigset_t old;
> +};
> +
> +#ifndef NO_PTHREADS
> +static void bug_die(int err, const char *msg)
> +{
> +	if (err) {
> +		errno = err;
> +		die_errno("BUG: %s", msg);
> +	}
> +}
> +#endif
> +
> +static void atfork_prepare(struct atfork_state *as)
> +{
> +	sigset_t all;
> +
> +	if (sigfillset(&all))
> +		die_errno("sigfillset");
> +#ifdef NO_PTHREADS
> +	if (sigprocmask(SIG_SETMASK, &all, &as->old))
> +		die_errno("sigprocmask");
> +#else
> +	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
> +		"blocking all signals");
> +	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
> +		"disabling cancellation");
> +#endif
> +}
> +
> +static void atfork_parent(struct atfork_state *as)
> +{
> +#ifdef NO_PTHREADS
> +	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
> +		die_errno("sigprocmask");
> +#else
> +	bug_die(pthread_setcancelstate(as->cs, NULL),
> +		"re-enabling cancellation");
> +	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
> +		"restoring signal mask");
> +#endif
> +}
> +
>  static inline void set_cloexec(int fd)
>  {
>  	int flags = fcntl(fd, F_GETFD);
> @@ -511,6 +561,7 @@ int start_command(struct child_process *cmd)
>  	char **childenv;
>  	struct argv_array argv = ARGV_ARRAY_INIT;
>  	struct child_err cerr;
> +	struct atfork_state as;
>  
>  	if (pipe(notify_pipe))
>  		notify_pipe[0] = notify_pipe[1] = -1;
> @@ -524,6 +575,7 @@ int start_command(struct child_process *cmd)
>  
>  	prepare_cmd(&argv, cmd);
>  	childenv = prep_childenv(cmd->env);
> +	atfork_prepare(&as);
>  
>  	/*
>  	 * NOTE: In order to prevent deadlocking when using threads special
> @@ -537,6 +589,7 @@ int start_command(struct child_process *cmd)
>  	cmd->pid = fork();
>  	failed_errno = errno;
>  	if (!cmd->pid) {
> +		int sig;
>  		/*
>  		 * Ensure the default die/error/warn routines do not get
>  		 * called, they can take stdio locks and malloc.
> @@ -584,6 +637,21 @@ int start_command(struct child_process *cmd)
>  		if (cmd->dir && chdir(cmd->dir))
>  			child_die(CHILD_ERR_CHDIR);
>  
> +		/*
> +		 * restore default signal handlers here, in case
> +		 * we catch a signal right before execve below
> +		 */
> +		for (sig = 1; sig < NSIG; sig++) {
> +			sighandler_t old = signal(sig, SIG_DFL);
> +
> +			/* ignored signals get reset to SIG_DFL on execve */
> +			if (old == SIG_IGN)
> +				signal(sig, SIG_IGN);
> +		}
> +
> +		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
> +			child_die(CHILD_ERR_SIGPROCMASK);
> +
>  		execve(argv.argv[0], (char *const *) argv.argv,
>  		       (char *const *) childenv);
>  
> @@ -595,6 +663,7 @@ int start_command(struct child_process *cmd)
>  			child_die(CHILD_ERR_ERRNO);
>  		}
>  	}
> +	atfork_parent(&as);
>  	if (cmd->pid < 0)
>  		error_errno("cannot fork() for %s", cmd->argv[0]);
>  	else if (cmd->clean_on_exit)
> -- 
> EW

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v2 0/6] forking and threading
  2017-04-13 20:50   ` [PATCH v2 0/6] " Jonathan Nieder
@ 2017-04-13 23:44     ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-13 23:44 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, e

On 04/13, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > From what I can see, there are now no calls in the child process (after fork
> > and before exec/_exit) which are not Async-Signal-Safe.  This means that
> > fork/exec in a threaded context should work without deadlock
> 
> I don't see why the former implies the latter.  Can you explain
> further?
> 
> You already know my opinions about fork+threads by now.  I continue to
> think this is heading in a direction of decreased maintainability that
> I dread.

I disagree here.  No one thought it was a bad idea back when I was
implementing grep to fork while running with threads.  Id rather fix the
problem in run_command so that we don't ever have to worry about this
again, especially for contributors who are unaware of this issue while
threading.

> 
> That's not to say that this is wasted work.  I would prefer an approach
> like the following:
> 
>  1. First, make grep work without running fork() from threads.
>     Jonathan Tan's approach would be one way to do this.  Another way
>     would be to simply disable threads in --recurse-submodules mode.
> 
>     This would be the first thing to do because it would make tests
>     reliable again, without having to wait for deeper changes.

I'm not much of a fan of Jonathan Tan's suggestion, id rather just fix
the problem at its root instead of adding in an additional hack.  If
this series crashes and burns then yes, lets just shut off threading in
grep with --recurse-submodules is uses, otherwise this series will fix
that case.

> 
>  2. Then, teaching run_command to prepare the environment and do $PATH
>     lookup before forking.  This might make it possible for run_command
>     to use vfork or might not.
> 
>  3. Teaching run_command to delegate chdir to the child, using -C for
>     git commands and 'cd' for shell commands, and using a shell where
>     necessary where it didn't before.
> 
>  4. Switching run_command to use posix_spawn on platforms where it is
>     available, which would make it possible to use in a threaded
>     context on those platforms.

After this series it should be easy enough to convert to posix_spawn if
someone wants to follow up with a patch to do that.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 7/6] run-command: block signals between fork and execve
  2017-04-13 21:14   ` [PATCH 7/6] run-command: block signals between fork and execve Eric Wong
  2017-04-13 23:37     ` Brandon Williams
@ 2017-04-14  2:42     ` Brandon Williams
  2017-04-14  5:26       ` Eric Wong
  1 sibling, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-14  2:42 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder

On 04/13, Eric Wong wrote:
> @@ -277,6 +278,8 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
>  		error_errno("exec '%s': cd to '%s' failed",
>  			    cmd->argv[0], cmd->dir);
>  		break;
> +	case CHILD_ERR_SIGPROCMASK:
> +		error_errno("sigprocmask failed restoring signals");

missing a break statement here I'll add it in, in the re-roll.

>  	case CHILD_ERR_ENOENT:
>  		error_errno("cannot run %s", cmd->argv[0]);
>  		break;

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 7/6] run-command: block signals between fork and execve
  2017-04-14  2:42     ` Brandon Williams
@ 2017-04-14  5:26       ` Eric Wong
  2017-04-14  5:35         ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-14  5:26 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> On 04/13, Eric Wong wrote:
> > @@ -277,6 +278,8 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
> >  		error_errno("exec '%s': cd to '%s' failed",
> >  			    cmd->argv[0], cmd->dir);
> >  		break;
> > +	case CHILD_ERR_SIGPROCMASK:
> > +		error_errno("sigprocmask failed restoring signals");
> 
> missing a break statement here I'll add it in, in the re-roll.

Good catch, thanks!

> >  	case CHILD_ERR_ENOENT:
> >  		error_errno("cannot run %s", cmd->argv[0]);
> >  		break;

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH 7/6] run-command: block signals between fork and execve
  2017-04-14  5:26       ` Eric Wong
@ 2017-04-14  5:35         ` Eric Wong
  0 siblings, 0 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-14  5:35 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Eric Wong <e@80x24.org> wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > On 04/13, Eric Wong wrote:
> > > @@ -277,6 +278,8 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
> > >  		error_errno("exec '%s': cd to '%s' failed",
> > >  			    cmd->argv[0], cmd->dir);
> > >  		break;
> > > +	case CHILD_ERR_SIGPROCMASK:
> > > +		error_errno("sigprocmask failed restoring signals");
> > 
> > missing a break statement here I'll add it in, in the re-roll.
> 
> Good catch, thanks!

Actually, I now wonder if that should be die_errno instead.
sigprocmask failures (EFAULT/EINVAL) would only be due
to programmer error.

In one of my minor projects(*), I do something like this:

# define CHECK(type, expect, expr) do { \
	type checkvar = (expr); \
	assert(checkvar == (expect) && "BUG" && __FILE__ && __LINE__); \
	} while (0)

	CHECK(int, 0, sigfillset(&fullset));
	CHECK(int, 0, sigemptyset(&emptyset));
	CHECK(int, 0, pthread_sigmask(SIG_SETMASK, &fullset, NULL));

Dunno if it's considered good style or not, here.

(*) git clone git://bogomips.org/cmogstored

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 00/10] forking and threading
  2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
                     ` (7 preceding siblings ...)
  2017-04-13 21:14   ` [PATCH 7/6] run-command: block signals between fork and execve Eric Wong
@ 2017-04-14 16:58   ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 01/10] t5550: use write_script to generate post-update hook Brandon Williams
                       ` (10 more replies)
  8 siblings, 11 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Changes in v3:
 * More error handling for dup2/close calls in the child
 * Added a test in t0061 to test for regressions in run_command's
   ability to interpret scripts without a "#!" line
 * In the event execve fails with ENOEXEC, attempt to exec one more
   time by invoking the shell to interpret the command

Brandon Williams (9):
  t5550: use write_script to generate post-update hook
  t0061: run_command executes scripts without a #! line
  run-command: prepare command before forking
  run-command: use the async-signal-safe execv instead of execvp
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: handle dup2 and close errors in child
  run-command: add note about forking and threading

Eric Wong (1):
  run-command: block signals between fork and execve

 run-command.c              | 426 ++++++++++++++++++++++++++++++++++++---------
 t/t0061-run-command.sh     |  11 ++
 t/t5550-http-fetch-dumb.sh |   5 +-
 3 files changed, 357 insertions(+), 85 deletions(-)

-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 01/10] t5550: use write_script to generate post-update hook
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
                       ` (9 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects'
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 	 git config core.bare true &&
 	 mkdir -p hooks &&
-	 echo "exec git update-server-info" >hooks/post-update &&
-	 chmod +x hooks/post-update &&
+	 write_script "hooks/post-update" <<-\EOF &&
+	 exec git update-server-info
+	EOF
 	 hooks/post-update
 	) &&
 	git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 02/10] t0061: run_command executes scripts without a #! line
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 01/10] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 03/10] run-command: prepare command before forking Brandon Williams
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Add a test to 't0061-run-command.sh' to ensure that run_command can
continue to execute scripts which don't include a '#!' line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t0061-run-command.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4aa..1a7490e29 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command can run a script without a #! line' '
+	cat >hello <<-\EOF &&
+	cat hello-script
+	EOF
+	chmod +x hello &&
+	test-run-command run-command ./hello >actual 2>err &&
+
+	test_cmp hello-script actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 03/10] run-command: prepare command before forking
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 01/10] t5550: use write_script to generate post-update hook Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

According to [1] we need to only call async-signal-safe operations between fork
and exec.  Using malloc to build the argv array isn't async-signal-safe.

In order to avoid allocation between 'fork()' and 'exec()' prepare the
argv array used in the exec call prior to forking the process.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 46 ++++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..d8d143795 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,21 @@ static void notify_parent(void)
 	 */
 	xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
+{
+	if (!cmd->argv[0])
+		die("BUG: command is empty");
+
+	if (cmd->git_cmd) {
+		argv_array_push(out, "git");
+		argv_array_pushv(out, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(out, cmd->argv);
+	} else {
+		argv_array_pushv(out, cmd->argv);
+	}
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +375,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	prepare_cmd(&argv, cmd);
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -437,12 +444,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +462,7 @@ int start_command(struct child_process *cmd)
 		mark_child_for_cleanup(cmd->pid, cmd);
 
 	/*
-	 * Wait for child's execvp. If the execvp succeeds (or if fork()
+	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
 	 * child process sends a single byte.
 	 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +471,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[1]);
 	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
 		/*
-		 * At this point we know that fork() succeeded, but execvp()
+		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +479,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 04/10] run-command: use the async-signal-safe execv instead of execvp
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (2 preceding siblings ...)
  2017-04-14 16:58     ` [PATCH v3 03/10] run-command: prepare command before forking Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 05/10] run-command: prepare child environment before forking Brandon Williams
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Convert the function used to exec from 'execvp()' to 'execv()' as the (p)
variant of exec isn't async-signal-safe and has the potential to call malloc
during the path resolution it performs.  Instead we simply do the path
resolution ourselves during the preparation stage prior to forking.  There also
don't exist any portable (p) variants which also take in an environment to use
in the exec'd process.  This allows easy migration to using 'execve()' in a
future patch.

Also, as noted in [1], in the event of an ENOEXEC the (p) variants of
exec will attempt to execute the command by interpreting it with the
'sh' utility.  To maintain this functionality, if 'execv()' fails with
ENOEXEC, start_command will atempt to execute the command by
interpreting it with 'sh'.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index d8d143795..1c7a3b611 100644
--- a/run-command.c
+++ b/run-command.c
@@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	if (!cmd->argv[0])
 		die("BUG: command is empty");
 
+	/*
+	 * Add SHELL_PATH so in the event exec fails with ENOEXEC we can
+	 * attempt to interpret the command with 'sh'.
+	 */
+	argv_array_push(out, SHELL_PATH);
+
 	if (cmd->git_cmd) {
 		argv_array_push(out, "git");
 		argv_array_pushv(out, cmd->argv);
@@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	} else {
 		argv_array_pushv(out, cmd->argv);
 	}
+
+	/*
+	 * If there are no '/' characters in the command then perform a path
+	 * lookup and use the resolved path as the command to exec.  If there
+	 * are no '/' characters or if the command wasn't found in the path,
+	 * have exec attempt to invoke the command directly.
+	 */
+	if (!strchr(out->argv[1], '/')) {
+		char *program = locate_in_PATH(out->argv[1]);
+		if (program) {
+			free((char *)out->argv[1]);
+			out->argv[1] = program;
+		}
+	}
 }
 #endif
 
@@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
 			}
 		}
 
-		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+		/*
+		 * Attempt to exec using the command and arguments starting at
+		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
+		 * be used in the event exec failed with ENOEXEC at which point
+		 * we will try to interpret the command using 'sh'.
+		 */
+		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		if (errno == ENOEXEC)
+			execv(argv.argv[0], (char *const *) argv.argv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 05/10] run-command: prepare child environment before forking
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (3 preceding siblings ...)
  2017-04-14 16:58     ` [PATCH v3 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 16:58     ` [PATCH v3 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 76 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1c7a3b611..5864b5ff3 100644
--- a/run-command.c
+++ b/run-command.c
@@ -267,6 +267,75 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 		}
 	}
 }
+
+static int env_isequal(const char *e1, const char *e2)
+{
+	for (;;) {
+		char c1 = *e1++;
+		char c2 = *e2++;
+		c1 = (c1 == '=') ? '\0' : tolower(c1);
+		c2 = (c2 == '=') ? '\0' : tolower(c2);
+
+		if (c1 != c2)
+			return 0;
+		if (c1 == '\0')
+			return 1;
+	}
+}
+
+static int searchenv(char **env, const char *name)
+{
+	int pos = 0;
+
+	for (; env[pos]; pos++)
+		if (env_isequal(env[pos], name))
+			break;
+
+	return pos;
+}
+
+static int do_putenv(char **env, int env_nr, const char *name)
+{
+	int pos = searchenv(env, name);
+
+	if (strchr(name, '=')) {
+		/* ('key=value'), insert of replace entry */
+		if (pos >= env_nr)
+			env_nr++;
+		env[pos] = (char *) name;
+	} else if (pos < env_nr) {
+		/* otherwise ('key') remove existing entry */
+		env_nr--;
+		memmove(&env[pos], &env[pos + 1],
+			(env_nr - pos) * sizeof(char *));
+		env[env_nr] = NULL;
+	}
+
+	return env_nr;
+}
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	char **childenv;
+	int childenv_nr = 0, childenv_alloc = 0;
+	int i;
+
+	for (i = 0; environ[i]; i++)
+		childenv_nr++;
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_alloc++;
+	/* Add one for the NULL termination */
+	childenv_alloc += childenv_nr + 1;
+
+	childenv = xcalloc(childenv_alloc, sizeof(char *));
+	memcpy(childenv, environ, childenv_nr * sizeof(char *));
+
+	/* merge in deltaenv */
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
+
+	return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -395,12 +464,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	prepare_cmd(&argv, cmd);
+	childenv = prep_childenv(cmd->env);
 
 	cmd->pid = fork();
 	failed_errno = errno;
@@ -456,14 +527,6 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -471,9 +534,11 @@ int start_command(struct child_process *cmd)
 		 * be used in the event exec failed with ENOEXEC at which point
 		 * we will try to interpret the command using 'sh'.
 		 */
-		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		execve(argv.argv[1], (char *const *) argv.argv + 1,
+		       (char *const *) childenv);
 		if (errno == ENOEXEC)
-			execv(argv.argv[0], (char *const *) argv.argv);
+			execve(argv.argv[0], (char *const *) argv.argv,
+			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -509,6 +574,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[0]);
 
 	argv_array_clear(&argv);
+	free(childenv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 06/10] run-command: don't die in child when duping /dev/null
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (4 preceding siblings ...)
  2017-04-14 16:58     ` [PATCH v3 05/10] run-command: prepare child environment before forking Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 19:38       ` Eric Wong
  2017-04-14 16:58     ` [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
                       ` (4 subsequent siblings)
  10 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 5864b5ff3..ee2c680ab 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -464,12 +452,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC | O_NONBLOCK);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
+	}
+
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
@@ -493,7 +489,7 @@ int start_command(struct child_process *cmd)
 		atexit(notify_parent);
 
 		if (cmd->no_stdin)
-			dup_devnull(0);
+			dup2(null_fd, 0);
 		else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
@@ -503,7 +499,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stderr)
-			dup_devnull(2);
+			dup2(null_fd, 2);
 		else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
@@ -513,7 +509,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stdout)
-			dup_devnull(1);
+			dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
@@ -573,6 +569,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (null_fd >= 0)
+		close(null_fd);
 	argv_array_clear(&argv);
 	free(childenv);
 }
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (5 preceding siblings ...)
  2017-04-14 16:58     ` [PATCH v3 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-14 16:58     ` Brandon Williams
  2017-04-14 18:50       ` Eric Wong
  2017-04-14 16:59     ` [PATCH v3 08/10] run-command: handle dup2 and close errors " Brandon Williams
                       ` (3 subsequent siblings)
  10 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:58 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 121 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index ee2c680ab..25b487c35 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_SILENT,
+	CHILD_ERR_ERRNO,
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+			    cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+		break;
+	case CHILD_ERR_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+		break;
+	}
+	set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
@@ -361,13 +429,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -455,6 +516,7 @@ int start_command(struct child_process *cmd)
 	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	struct child_err cerr;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -473,20 +535,16 @@ int start_command(struct child_process *cmd)
 	failed_errno = errno;
 	if (!cmd->pid) {
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * Ensure the default die/error/warn routines do not get
+		 * called, they can take stdio locks and malloc.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
 		if (cmd->no_stdin)
 			dup2(null_fd, 0);
@@ -521,8 +579,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -537,12 +594,11 @@ int start_command(struct child_process *cmd)
 			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
 	if (cmd->pid < 0)
@@ -553,17 +609,18 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 08/10] run-command: handle dup2 and close errors in child
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (6 preceding siblings ...)
  2017-04-14 16:58     ` [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-14 16:59     ` " Brandon Williams
  2017-04-14 16:59     ` [PATCH v3 09/10] run-command: add note about forking and threading Brandon Williams
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:59 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 58 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/run-command.c b/run-command.c
index 25b487c35..f36eafa8d 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,8 @@ static int child_notifier = -1;
 
 enum child_errcode {
 	CHILD_ERR_CHDIR,
+	CHILD_ERR_DUP2,
+	CHILD_ERR_CLOSE,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO,
@@ -235,6 +237,24 @@ static void child_die(enum child_errcode err)
 	_exit(1);
 }
 
+static void child_dup2(int fd, int to)
+{
+	if (dup2(fd, to) < 0)
+		child_die(CHILD_ERR_DUP2);
+}
+
+static void child_close(int fd)
+{
+	if (close(fd))
+		child_die(CHILD_ERR_CLOSE);
+}
+
+static void child_close_pair(int fd[2])
+{
+	child_close(fd[0]);
+	child_close(fd[1]);
+}
+
 /*
  * parent will make it look like the child spewed a fatal error and died
  * this is needed to prevent changes to t0061.
@@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 		error_errno("exec '%s': cd to '%s' failed",
 			    cmd->argv[0], cmd->dir);
 		break;
+	case CHILD_ERR_DUP2:
+		error_errno("dup2() in child failed");
+		break;
+	case CHILD_ERR_CLOSE:
+		error_errno("close() in child failed");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -547,35 +573,35 @@ int start_command(struct child_process *cmd)
 		child_notifier = notify_pipe[1];
 
 		if (cmd->no_stdin)
-			dup2(null_fd, 0);
+			child_dup2(null_fd, 0);
 		else if (need_in) {
-			dup2(fdin[0], 0);
-			close_pair(fdin);
+			child_dup2(fdin[0], 0);
+			child_close_pair(fdin);
 		} else if (cmd->in) {
-			dup2(cmd->in, 0);
-			close(cmd->in);
+			child_dup2(cmd->in, 0);
+			child_close(cmd->in);
 		}
 
 		if (cmd->no_stderr)
-			dup2(null_fd, 2);
+			child_dup2(null_fd, 2);
 		else if (need_err) {
-			dup2(fderr[1], 2);
-			close_pair(fderr);
+			child_dup2(fderr[1], 2);
+			child_close_pair(fderr);
 		} else if (cmd->err > 1) {
-			dup2(cmd->err, 2);
-			close(cmd->err);
+			child_dup2(cmd->err, 2);
+			child_close(cmd->err);
 		}
 
 		if (cmd->no_stdout)
-			dup2(null_fd, 1);
+			child_dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
-			dup2(2, 1);
+			child_dup2(2, 1);
 		else if (need_out) {
-			dup2(fdout[1], 1);
-			close_pair(fdout);
+			child_dup2(fdout[1], 1);
+			child_close_pair(fdout);
 		} else if (cmd->out > 1) {
-			dup2(cmd->out, 1);
-			close(cmd->out);
+			child_dup2(cmd->out, 1);
+			child_close(cmd->out);
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 09/10] run-command: add note about forking and threading
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (7 preceding siblings ...)
  2017-04-14 16:59     ` [PATCH v3 08/10] run-command: handle dup2 and close errors " Brandon Williams
@ 2017-04-14 16:59     ` Brandon Williams
  2017-04-14 16:59     ` [PATCH v3 10/10] run-command: block signals between fork and execve Brandon Williams
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:59 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index f36eafa8d..d3a32eab6 100644
--- a/run-command.c
+++ b/run-command.c
@@ -557,6 +557,15 @@ int start_command(struct child_process *cmd)
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.  This means only
+	 * Async-Signal-Safe functions are permitted in the child.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v3 10/10] run-command: block signals between fork and execve
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (8 preceding siblings ...)
  2017-04-14 16:59     ` [PATCH v3 09/10] run-command: add note about forking and threading Brandon Williams
@ 2017-04-14 16:59     ` Brandon Williams
  2017-04-14 20:24       ` Brandon Williams
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
  10 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 16:59 UTC (permalink / raw)
  To: git; +Cc: Eric Wong, jrnieder, Brandon Williams

From: Eric Wong <e@80x24.org>

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/run-command.c b/run-command.c
index d3a32eab6..cbed3265f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -215,6 +215,7 @@ enum child_errcode {
 	CHILD_ERR_CHDIR,
 	CHILD_ERR_DUP2,
 	CHILD_ERR_CLOSE,
+	CHILD_ERR_SIGPROCMASK,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO,
@@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 	case CHILD_ERR_CLOSE:
 		error_errno("close() in child failed");
 		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -420,6 +424,53 @@ static char **prep_childenv(const char *const *deltaenv)
 }
 #endif
 
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
+#endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+
 static inline void set_cloexec(int fd)
 {
 	int flags = fcntl(fd, F_GETFD);
@@ -543,6 +594,7 @@ int start_command(struct child_process *cmd)
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -556,6 +608,7 @@ int start_command(struct child_process *cmd)
 
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
+	atfork_prepare(&as);
 
 	/*
 	 * NOTE: In order to prevent deadlocking when using threads special
@@ -569,6 +622,7 @@ int start_command(struct child_process *cmd)
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
 		/*
 		 * Ensure the default die/error/warn routines do not get
 		 * called, they can take stdio locks and malloc.
@@ -617,6 +671,21 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_CHDIR);
 
 		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before execve below
+		 */
+		for (sig = 1; sig < NSIG; sig++) {
+			sighandler_t old = signal(sig, SIG_DFL);
+
+			/* ignored signals get reset to SIG_DFL on execve */
+			if (old == SIG_IGN)
+				signal(sig, SIG_IGN);
+		}
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
+
+		/*
 		 * Attempt to exec using the command and arguments starting at
 		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
 		 * be used in the event exec failed with ENOEXEC at which point
@@ -636,6 +705,7 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child
  2017-04-14 16:58     ` [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-14 18:50       ` Eric Wong
  2017-04-14 20:22         ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-14 18:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> +++ b/run-command.c
> @@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
>  #ifndef GIT_WINDOWS_NATIVE
>  static int child_notifier = -1;
>  
> -static void notify_parent(void)
> +enum child_errcode {
> +	CHILD_ERR_CHDIR,
> +	CHILD_ERR_ENOENT,
> +	CHILD_ERR_SILENT,
> +	CHILD_ERR_ERRNO,
> +};

I realize I introduced this in my original, but trailing commas
on the last enum value might not be portable.  Checking other
enum usages in our tree suggests we omit the last comma.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 06/10] run-command: don't die in child when duping /dev/null
  2017-04-14 16:58     ` [PATCH v3 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-14 19:38       ` Eric Wong
  2017-04-14 20:19         ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-14 19:38 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> +	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
> +		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC | O_NONBLOCK);

O_NONBLOCK?  This was in my original patch, too :x
Wow, I wonder what I was smoking that day...

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 06/10] run-command: don't die in child when duping /dev/null
  2017-04-14 19:38       ` Eric Wong
@ 2017-04-14 20:19         ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 20:19 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder

On 04/14, Eric Wong wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > +	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
> > +		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC | O_NONBLOCK);
> 
> O_NONBLOCK?  This was in my original patch, too :x
> Wow, I wonder what I was smoking that day...

And I apparently wasn't thinking enough to catch that!  I'll fix that.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child
  2017-04-14 18:50       ` Eric Wong
@ 2017-04-14 20:22         ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 20:22 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder

On 04/14, Eric Wong wrote:
> Brandon Williams <bmwill@google.com> wrote:
> > +++ b/run-command.c
> > @@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
> >  #ifndef GIT_WINDOWS_NATIVE
> >  static int child_notifier = -1;
> >  
> > -static void notify_parent(void)
> > +enum child_errcode {
> > +	CHILD_ERR_CHDIR,
> > +	CHILD_ERR_ENOENT,
> > +	CHILD_ERR_SILENT,
> > +	CHILD_ERR_ERRNO,
> > +};
> 
> I realize I introduced this in my original, but trailing commas
> on the last enum value might not be portable.  Checking other
> enum usages in our tree suggests we omit the last comma.

While I realize its not portal, I think there are other places that do
this same thing.  I think it means we can move to a newer standard of C!

In all seriousness though I'll drop the trailing comma.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 10/10] run-command: block signals between fork and execve
  2017-04-14 16:59     ` [PATCH v3 10/10] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-14 20:24       ` Brandon Williams
  2017-04-14 21:35         ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-14 20:24 UTC (permalink / raw)
  To: git; +Cc: Eric Wong, jrnieder

On 04/14, Brandon Williams wrote:
>  		/*
> +		 * restore default signal handlers here, in case
> +		 * we catch a signal right before execve below
> +		 */
> +		for (sig = 1; sig < NSIG; sig++) {
> +			sighandler_t old = signal(sig, SIG_DFL);

So sighandler_t doesn't work on macOS.  Is there a more portable lib
that needs to be included for this to work?

> +
> +			/* ignored signals get reset to SIG_DFL on execve */
> +			if (old == SIG_IGN)
> +				signal(sig, SIG_IGN);
> +		}
> +
> +		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
> +			child_die(CHILD_ERR_SIGPROCMASK);
> +

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v3 10/10] run-command: block signals between fork and execve
  2017-04-14 20:24       ` Brandon Williams
@ 2017-04-14 21:35         ` Eric Wong
  0 siblings, 0 replies; 140+ messages in thread
From: Eric Wong @ 2017-04-14 21:35 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder

Brandon Williams <bmwill@google.com> wrote:
> On 04/14, Brandon Williams wrote:
> >  		/*
> > +		 * restore default signal handlers here, in case
> > +		 * we catch a signal right before execve below
> > +		 */
> > +		for (sig = 1; sig < NSIG; sig++) {
> > +			sighandler_t old = signal(sig, SIG_DFL);
> 
> So sighandler_t doesn't work on macOS.  Is there a more portable lib
> that needs to be included for this to work?

Oops, maybe this works (only tested on GNU/Linux):

--- a/run-command.c
+++ b/run-command.c
@@ -675,7 +675,7 @@ int start_command(struct child_process *cmd)
 		 * we catch a signal right before execve below
 		 */
 		for (sig = 1; sig < NSIG; sig++) {
-			sighandler_t old = signal(sig, SIG_DFL);
+			void (*old)(int) = signal(sig, SIG_DFL);
 
 			/* ignored signals get reset to SIG_DFL on execve */
 			if (old == SIG_IGN)

Otherwise, maybe just casting to 'void *' is OK:

			void *old = (void *)signal(sig, SIG_DFL);

			if (old == (void *)SIG_IGN)
				...



void *old = signal(sig, SIG_DFL);

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 00/10] forking and threading
  2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
                       ` (9 preceding siblings ...)
  2017-04-14 16:59     ` [PATCH v3 10/10] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-17 22:08     ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 01/10] t5550: use write_script to generate post-update hook Brandon Williams
                         ` (10 more replies)
  10 siblings, 11 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

v4 fixes a few minor compatability issues:
* add 'extern' reference for 'environ'
* small portability change with the signal handeling
* remove trailing ',' in enum
* null_fd not opened with O_NONBLOCK

Brandon Williams (9):
  t5550: use write_script to generate post-update hook
  t0061: run_command executes scripts without a #! line
  run-command: prepare command before forking
  run-command: use the async-signal-safe execv instead of execvp
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: handle dup2 and close errors in child
  run-command: add note about forking and threading

Eric Wong (1):
  run-command: block signals between fork and execve

 run-command.c              | 425 ++++++++++++++++++++++++++++++++++++---------
 t/t0061-run-command.sh     |  11 ++
 t/t5550-http-fetch-dumb.sh |   5 +-
 3 files changed, 356 insertions(+), 85 deletions(-)

-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 01/10] t5550: use write_script to generate post-update hook
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
                         ` (9 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects'
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 	 git config core.bare true &&
 	 mkdir -p hooks &&
-	 echo "exec git update-server-info" >hooks/post-update &&
-	 chmod +x hooks/post-update &&
+	 write_script "hooks/post-update" <<-\EOF &&
+	 exec git update-server-info
+	EOF
 	 hooks/post-update
 	) &&
 	git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 02/10] t0061: run_command executes scripts without a #! line
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 01/10] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 03/10] run-command: prepare command before forking Brandon Williams
                         ` (8 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Add a test to 't0061-run-command.sh' to ensure that run_command can
continue to execute scripts which don't include a '#!' line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t0061-run-command.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4aa..1a7490e29 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command can run a script without a #! line' '
+	cat >hello <<-\EOF &&
+	cat hello-script
+	EOF
+	chmod +x hello &&
+	test-run-command run-command ./hello >actual 2>err &&
+
+	test_cmp hello-script actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 03/10] run-command: prepare command before forking
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 01/10] t5550: use write_script to generate post-update hook Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
                         ` (7 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

According to [1] we need to only call async-signal-safe operations between fork
and exec.  Using malloc to build the argv array isn't async-signal-safe.

In order to avoid allocation between 'fork()' and 'exec()' prepare the
argv array used in the exec call prior to forking the process.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 46 ++++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..d8d143795 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,21 @@ static void notify_parent(void)
 	 */
 	xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
+{
+	if (!cmd->argv[0])
+		die("BUG: command is empty");
+
+	if (cmd->git_cmd) {
+		argv_array_push(out, "git");
+		argv_array_pushv(out, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(out, cmd->argv);
+	} else {
+		argv_array_pushv(out, cmd->argv);
+	}
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +375,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	prepare_cmd(&argv, cmd);
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -437,12 +444,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +462,7 @@ int start_command(struct child_process *cmd)
 		mark_child_for_cleanup(cmd->pid, cmd);
 
 	/*
-	 * Wait for child's execvp. If the execvp succeeds (or if fork()
+	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
 	 * child process sends a single byte.
 	 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +471,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[1]);
 	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
 		/*
-		 * At this point we know that fork() succeeded, but execvp()
+		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +479,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 04/10] run-command: use the async-signal-safe execv instead of execvp
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (2 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 03/10] run-command: prepare command before forking Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 05/10] run-command: prepare child environment before forking Brandon Williams
                         ` (6 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Convert the function used to exec from 'execvp()' to 'execv()' as the (p)
variant of exec isn't async-signal-safe and has the potential to call malloc
during the path resolution it performs.  Instead we simply do the path
resolution ourselves during the preparation stage prior to forking.  There also
don't exist any portable (p) variants which also take in an environment to use
in the exec'd process.  This allows easy migration to using 'execve()' in a
future patch.

Also, as noted in [1], in the event of an ENOEXEC the (p) variants of
exec will attempt to execute the command by interpreting it with the
'sh' utility.  To maintain this functionality, if 'execv()' fails with
ENOEXEC, start_command will atempt to execute the command by
interpreting it with 'sh'.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index d8d143795..1c7a3b611 100644
--- a/run-command.c
+++ b/run-command.c
@@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	if (!cmd->argv[0])
 		die("BUG: command is empty");
 
+	/*
+	 * Add SHELL_PATH so in the event exec fails with ENOEXEC we can
+	 * attempt to interpret the command with 'sh'.
+	 */
+	argv_array_push(out, SHELL_PATH);
+
 	if (cmd->git_cmd) {
 		argv_array_push(out, "git");
 		argv_array_pushv(out, cmd->argv);
@@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	} else {
 		argv_array_pushv(out, cmd->argv);
 	}
+
+	/*
+	 * If there are no '/' characters in the command then perform a path
+	 * lookup and use the resolved path as the command to exec.  If there
+	 * are no '/' characters or if the command wasn't found in the path,
+	 * have exec attempt to invoke the command directly.
+	 */
+	if (!strchr(out->argv[1], '/')) {
+		char *program = locate_in_PATH(out->argv[1]);
+		if (program) {
+			free((char *)out->argv[1]);
+			out->argv[1] = program;
+		}
+	}
 }
 #endif
 
@@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
 			}
 		}
 
-		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+		/*
+		 * Attempt to exec using the command and arguments starting at
+		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
+		 * be used in the event exec failed with ENOEXEC at which point
+		 * we will try to interpret the command using 'sh'.
+		 */
+		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		if (errno == ENOEXEC)
+			execv(argv.argv[0], (char *const *) argv.argv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 05/10] run-command: prepare child environment before forking
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (3 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-18  0:26         ` Eric Wong
  2017-04-17 22:08       ` [PATCH v4 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
                         ` (5 subsequent siblings)
  10 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 77 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1c7a3b611..2fff60a04 100644
--- a/run-command.c
+++ b/run-command.c
@@ -267,6 +267,76 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 		}
 	}
 }
+
+static int env_isequal(const char *e1, const char *e2)
+{
+	for (;;) {
+		char c1 = *e1++;
+		char c2 = *e2++;
+		c1 = (c1 == '=') ? '\0' : tolower(c1);
+		c2 = (c2 == '=') ? '\0' : tolower(c2);
+
+		if (c1 != c2)
+			return 0;
+		if (c1 == '\0')
+			return 1;
+	}
+}
+
+static int searchenv(char **env, const char *name)
+{
+	int pos = 0;
+
+	for (; env[pos]; pos++)
+		if (env_isequal(env[pos], name))
+			break;
+
+	return pos;
+}
+
+static int do_putenv(char **env, int env_nr, const char *name)
+{
+	int pos = searchenv(env, name);
+
+	if (strchr(name, '=')) {
+		/* ('key=value'), insert of replace entry */
+		if (pos >= env_nr)
+			env_nr++;
+		env[pos] = (char *) name;
+	} else if (pos < env_nr) {
+		/* otherwise ('key') remove existing entry */
+		env_nr--;
+		memmove(&env[pos], &env[pos + 1],
+			(env_nr - pos) * sizeof(char *));
+		env[env_nr] = NULL;
+	}
+
+	return env_nr;
+}
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	extern char **environ;
+	char **childenv;
+	int childenv_nr = 0, childenv_alloc = 0;
+	int i;
+
+	for (i = 0; environ[i]; i++)
+		childenv_nr++;
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_alloc++;
+	/* Add one for the NULL termination */
+	childenv_alloc += childenv_nr + 1;
+
+	childenv = xcalloc(childenv_alloc, sizeof(char *));
+	memcpy(childenv, environ, childenv_nr * sizeof(char *));
+
+	/* merge in deltaenv */
+	for (i = 0; deltaenv && deltaenv[i]; i++)
+		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
+
+	return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -395,12 +465,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	prepare_cmd(&argv, cmd);
+	childenv = prep_childenv(cmd->env);
 
 	cmd->pid = fork();
 	failed_errno = errno;
@@ -456,14 +528,6 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -471,9 +535,11 @@ int start_command(struct child_process *cmd)
 		 * be used in the event exec failed with ENOEXEC at which point
 		 * we will try to interpret the command using 'sh'.
 		 */
-		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		execve(argv.argv[1], (char *const *) argv.argv + 1,
+		       (char *const *) childenv);
 		if (errno == ENOEXEC)
-			execv(argv.argv[0], (char *const *) argv.argv);
+			execve(argv.argv[0], (char *const *) argv.argv,
+			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -509,6 +575,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[0]);
 
 	argv_array_clear(&argv);
+	free(childenv);
 }
 #else
 {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 06/10] run-command: don't die in child when duping /dev/null
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (4 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 05/10] run-command: prepare child environment before forking Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 2fff60a04..3aa8b7112 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -465,12 +453,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
+	}
+
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
@@ -494,7 +490,7 @@ int start_command(struct child_process *cmd)
 		atexit(notify_parent);
 
 		if (cmd->no_stdin)
-			dup_devnull(0);
+			dup2(null_fd, 0);
 		else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
@@ -504,7 +500,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stderr)
-			dup_devnull(2);
+			dup2(null_fd, 2);
 		else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
@@ -514,7 +510,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stdout)
-			dup_devnull(1);
+			dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
@@ -574,6 +570,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (null_fd >= 0)
+		close(null_fd);
 	argv_array_clear(&argv);
 	free(childenv);
 }
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 07/10] run-command: eliminate calls to error handling functions in child
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (5 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 08/10] run-command: handle dup2 and close errors " Brandon Williams
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 121 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index 3aa8b7112..e1e8780ca 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_SILENT,
+	CHILD_ERR_ERRNO
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+			    cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+		break;
+	case CHILD_ERR_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+		break;
+	}
+	set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
@@ -362,13 +430,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -456,6 +517,7 @@ int start_command(struct child_process *cmd)
 	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	struct child_err cerr;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -474,20 +536,16 @@ int start_command(struct child_process *cmd)
 	failed_errno = errno;
 	if (!cmd->pid) {
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * Ensure the default die/error/warn routines do not get
+		 * called, they can take stdio locks and malloc.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
 		if (cmd->no_stdin)
 			dup2(null_fd, 0);
@@ -522,8 +580,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -538,12 +595,11 @@ int start_command(struct child_process *cmd)
 			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
 	if (cmd->pid < 0)
@@ -554,17 +610,18 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 08/10] run-command: handle dup2 and close errors in child
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (6 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-17 22:08       ` " Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 09/10] run-command: add note about forking and threading Brandon Williams
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 58 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/run-command.c b/run-command.c
index e1e8780ca..bd6414283 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,8 @@ static int child_notifier = -1;
 
 enum child_errcode {
 	CHILD_ERR_CHDIR,
+	CHILD_ERR_DUP2,
+	CHILD_ERR_CLOSE,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -235,6 +237,24 @@ static void child_die(enum child_errcode err)
 	_exit(1);
 }
 
+static void child_dup2(int fd, int to)
+{
+	if (dup2(fd, to) < 0)
+		child_die(CHILD_ERR_DUP2);
+}
+
+static void child_close(int fd)
+{
+	if (close(fd))
+		child_die(CHILD_ERR_CLOSE);
+}
+
+static void child_close_pair(int fd[2])
+{
+	child_close(fd[0]);
+	child_close(fd[1]);
+}
+
 /*
  * parent will make it look like the child spewed a fatal error and died
  * this is needed to prevent changes to t0061.
@@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 		error_errno("exec '%s': cd to '%s' failed",
 			    cmd->argv[0], cmd->dir);
 		break;
+	case CHILD_ERR_DUP2:
+		error_errno("dup2() in child failed");
+		break;
+	case CHILD_ERR_CLOSE:
+		error_errno("close() in child failed");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -548,35 +574,35 @@ int start_command(struct child_process *cmd)
 		child_notifier = notify_pipe[1];
 
 		if (cmd->no_stdin)
-			dup2(null_fd, 0);
+			child_dup2(null_fd, 0);
 		else if (need_in) {
-			dup2(fdin[0], 0);
-			close_pair(fdin);
+			child_dup2(fdin[0], 0);
+			child_close_pair(fdin);
 		} else if (cmd->in) {
-			dup2(cmd->in, 0);
-			close(cmd->in);
+			child_dup2(cmd->in, 0);
+			child_close(cmd->in);
 		}
 
 		if (cmd->no_stderr)
-			dup2(null_fd, 2);
+			child_dup2(null_fd, 2);
 		else if (need_err) {
-			dup2(fderr[1], 2);
-			close_pair(fderr);
+			child_dup2(fderr[1], 2);
+			child_close_pair(fderr);
 		} else if (cmd->err > 1) {
-			dup2(cmd->err, 2);
-			close(cmd->err);
+			child_dup2(cmd->err, 2);
+			child_close(cmd->err);
 		}
 
 		if (cmd->no_stdout)
-			dup2(null_fd, 1);
+			child_dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
-			dup2(2, 1);
+			child_dup2(2, 1);
 		else if (need_out) {
-			dup2(fdout[1], 1);
-			close_pair(fdout);
+			child_dup2(fdout[1], 1);
+			child_close_pair(fdout);
 		} else if (cmd->out > 1) {
-			dup2(cmd->out, 1);
-			close(cmd->out);
+			child_dup2(cmd->out, 1);
+			child_close(cmd->out);
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 09/10] run-command: add note about forking and threading
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (7 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 08/10] run-command: handle dup2 and close errors " Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-17 22:08       ` [PATCH v4 10/10] run-command: block signals between fork and execve Brandon Williams
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, jrnieder, e

All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index bd6414283..c27c53bc5 100644
--- a/run-command.c
+++ b/run-command.c
@@ -558,6 +558,15 @@ int start_command(struct child_process *cmd)
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.  This means only
+	 * Async-Signal-Safe functions are permitted in the child.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v4 10/10] run-command: block signals between fork and execve
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (8 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 09/10] run-command: add note about forking and threading Brandon Williams
@ 2017-04-17 22:08       ` Brandon Williams
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
  10 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-17 22:08 UTC (permalink / raw)
  To: git; +Cc: Eric Wong, jrnieder, Brandon Williams

From: Eric Wong <e@80x24.org>

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/run-command.c b/run-command.c
index c27c53bc5..c2d15310b 100644
--- a/run-command.c
+++ b/run-command.c
@@ -215,6 +215,7 @@ enum child_errcode {
 	CHILD_ERR_CHDIR,
 	CHILD_ERR_DUP2,
 	CHILD_ERR_CLOSE,
+	CHILD_ERR_SIGPROCMASK,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 	case CHILD_ERR_CLOSE:
 		error_errno("close() in child failed");
 		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -421,6 +425,53 @@ static char **prep_childenv(const char *const *deltaenv)
 }
 #endif
 
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
+#endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+
 static inline void set_cloexec(int fd)
 {
 	int flags = fcntl(fd, F_GETFD);
@@ -544,6 +595,7 @@ int start_command(struct child_process *cmd)
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -557,6 +609,7 @@ int start_command(struct child_process *cmd)
 
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
+	atfork_prepare(&as);
 
 	/*
 	 * NOTE: In order to prevent deadlocking when using threads special
@@ -570,6 +623,7 @@ int start_command(struct child_process *cmd)
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
 		/*
 		 * Ensure the default die/error/warn routines do not get
 		 * called, they can take stdio locks and malloc.
@@ -618,6 +672,19 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_CHDIR);
 
 		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before execve below
+		 */
+		for (sig = 1; sig < NSIG; sig++) {
+			/* ignored signals get reset to SIG_DFL on execve */
+			if (signal(sig, SIG_DFL) == SIG_IGN)
+				signal(sig, SIG_IGN);
+		}
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
+
+		/*
 		 * Attempt to exec using the command and arguments starting at
 		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
 		 * be used in the event exec failed with ENOEXEC at which point
@@ -637,6 +704,7 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
-- 
2.12.2.762.g0e3151a226-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v4 05/10] run-command: prepare child environment before forking
  2017-04-17 22:08       ` [PATCH v4 05/10] run-command: prepare child environment before forking Brandon Williams
@ 2017-04-18  0:26         ` Eric Wong
  2017-04-18 21:02           ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-18  0:26 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder, Karsten Blees

+Cc Karsten for comments below...

Brandon Williams <bmwill@google.com> wrote:
> In order to avoid allocation between 'fork()' and 'exec()' prepare the
> environment to be used in the child process prior to forking.
> 
> Switch to using 'execve()' so that the construct child environment can
> used in the exec'd process.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  run-command.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 77 insertions(+), 10 deletions(-)
> 
> diff --git a/run-command.c b/run-command.c
> index 1c7a3b611..2fff60a04 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -267,6 +267,76 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
>  		}
>  	}
>  }
> +
> +static int env_isequal(const char *e1, const char *e2)
> +{
> +	for (;;) {
> +		char c1 = *e1++;
> +		char c2 = *e2++;
> +		c1 = (c1 == '=') ? '\0' : tolower(c1);
> +		c2 = (c2 == '=') ? '\0' : tolower(c2);

Dealing with C strings scares me so maybe I'm misreading;
but: why is this comparison case-insensitive?

Reading on...

> +
> +		if (c1 != c2)
> +			return 0;
> +		if (c1 == '\0')
> +			return 1;
> +	}
> +}
> +
> +static int searchenv(char **env, const char *name)
> +{
> +	int pos = 0;
> +
> +	for (; env[pos]; pos++)
> +		if (env_isequal(env[pos], name))
> +			break;
> +
> +	return pos;
> +}

So this scans through every string in env looking for something
that matches 'name' followed by a '=', and return the position
of that in env so do_putenv below can use it:

> +
> +static int do_putenv(char **env, int env_nr, const char *name)
> +{
> +	int pos = searchenv(env, name);
> +
> +	if (strchr(name, '=')) {
> +		/* ('key=value'), insert of replace entry */

		"insert or replace"

> +		if (pos >= env_nr)
> +			env_nr++;
> +		env[pos] = (char *) name;

OK, this sets an entry...

> +	} else if (pos < env_nr) {
> +		/* otherwise ('key') remove existing entry */
> +		env_nr--;
> +		memmove(&env[pos], &env[pos + 1],
> +			(env_nr - pos) * sizeof(char *));
> +		env[env_nr] = NULL;

And this clobbers it, freeing up a slot for future sets and
tells the caller by returning env_nr below:

> +	}
> +
> +	return env_nr;
> +}

So now the caller below will know where there's a free slot
to place the next environment variable:

> +static char **prep_childenv(const char *const *deltaenv)
> +{
> +	extern char **environ;
> +	char **childenv;
> +	int childenv_nr = 0, childenv_alloc = 0;
> +	int i;
> +
> +	for (i = 0; environ[i]; i++)
> +		childenv_nr++;
> +	for (i = 0; deltaenv && deltaenv[i]; i++)
> +		childenv_alloc++;
> +	/* Add one for the NULL termination */
> +	childenv_alloc += childenv_nr + 1;
> +
> +	childenv = xcalloc(childenv_alloc, sizeof(char *));
> +	memcpy(childenv, environ, childenv_nr * sizeof(char *));
> +
> +	/* merge in deltaenv */
> +	for (i = 0; deltaenv && deltaenv[i]; i++)
> +		childenv_nr = do_putenv(childenv, childenv_nr, deltaenv[i]);
> +
> +	return childenv;
> +}

OK, the above seems to make sense; copy parent environment and
then make changes from deltaenv on top of it...

>  #endif
>  
>  static inline void set_cloexec(int fd)
> @@ -395,12 +465,14 @@ int start_command(struct child_process *cmd)
>  #ifndef GIT_WINDOWS_NATIVE
>  {
>  	int notify_pipe[2];
> +	char **childenv;
>  	struct argv_array argv = ARGV_ARRAY_INIT;
>  
>  	if (pipe(notify_pipe))
>  		notify_pipe[0] = notify_pipe[1] = -1;
>  
>  	prepare_cmd(&argv, cmd);
> +	childenv = prep_childenv(cmd->env);
>  
>  	cmd->pid = fork();
>  	failed_errno = errno;
> @@ -456,14 +528,6 @@ int start_command(struct child_process *cmd)
>  		if (cmd->dir && chdir(cmd->dir))
>  			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
>  			    cmd->dir);
> -		if (cmd->env) {
> -			for (; *cmd->env; cmd->env++) {
> -				if (strchr(*cmd->env, '='))
> -					putenv((char *)*cmd->env);
> -				else
> -					unsetenv(*cmd->env);
> -			}
> -		}

... which was what the original code did inside the forked child.

So, everything above made sense to me except the use of tolower.

So it looks like Brandon is reusing some of Karsten's
compat/mingw.c changes in
commit 343ff06da7d83f40892b10a3b653c7d0e6cb526c,
("Win32: keep the environment sorted")

But, since these changes to run-command are *nix only,
the sorting the env makes no sense, and neither
does case-insensitivity.

However, reading Karsten's commit; it only seems the use
of case-insensitive qsort is correct.  I'm not sure
if mingw is case-insensitive for actual env modifications,
but I know *nix env names are case-sensitive.

So, there _may_ be a bug in compat/mingw.c with the use
of bsearchenv and compareenv.  It does not seem bsearch
is correct to use there, nor can a case-insensitive
compareenv be used for searching, only sorting...

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v4 05/10] run-command: prepare child environment before forking
  2017-04-18  0:26         ` Eric Wong
@ 2017-04-18 21:02           ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 21:02 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, jrnieder, Karsten Blees

On 04/18, Eric Wong wrote:
> > +static int env_isequal(const char *e1, const char *e2)
> > +{
> > +	for (;;) {
> > +		char c1 = *e1++;
> > +		char c2 = *e2++;
> > +		c1 = (c1 == '=') ? '\0' : tolower(c1);
> > +		c2 = (c2 == '=') ? '\0' : tolower(c2);
> 
> Dealing with C strings scares me so maybe I'm misreading;
> but: why is this comparison case-insensitive?

Well i was pulling inspiration from the stuff in mingw.c...looks like i
probably shouldn't have done so as you're correct, they should be
case-sensitive.  Jonathan pointed out that doing this env stuff in
vanilla C may not be a good idea...and I kinda forgot about that cause
it worked (it passed tests) Let me re-write this section of code to make
it correct, and saner.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 00/11] forking and threading
  2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
                         ` (9 preceding siblings ...)
  2017-04-17 22:08       ` [PATCH v4 10/10] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-18 23:17       ` Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 01/11] t5550: use write_script to generate post-update hook Brandon Williams
                           ` (11 more replies)
  10 siblings, 12 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

v5 addresses the issues with creating the child's environment.  Instead of
doing vanilla C string manipulation, I used a struct string_list and struct
strbuf.  The code should be much easier to read and understand now.  I also
needed to add a function to remove a string_list_item from a string_list to the
string_list API.

Brandon Williams (10):
  t5550: use write_script to generate post-update hook
  t0061: run_command executes scripts without a #! line
  run-command: prepare command before forking
  run-command: use the async-signal-safe execv instead of execvp
  string-list: add string_list_remove function
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: handle dup2 and close errors in child
  run-command: add note about forking and threading

Eric Wong (1):
  run-command: block signals between fork and execve

 run-command.c              | 404 +++++++++++++++++++++++++++++++++++----------
 string-list.c              |  18 ++
 string-list.h              |   5 +
 t/t0061-run-command.sh     |  11 ++
 t/t5550-http-fetch-dumb.sh |   5 +-
 5 files changed, 358 insertions(+), 85 deletions(-)

-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 01/11] t5550: use write_script to generate post-update hook
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
@ 2017-04-18 23:17         ` Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
                           ` (10 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects'
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 	 git config core.bare true &&
 	 mkdir -p hooks &&
-	 echo "exec git update-server-info" >hooks/post-update &&
-	 chmod +x hooks/post-update &&
+	 write_script "hooks/post-update" <<-\EOF &&
+	 exec git update-server-info
+	EOF
 	 hooks/post-update
 	) &&
 	git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 01/11] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-18 23:17         ` Brandon Williams
  2017-04-19  5:43           ` Johannes Sixt
  2017-04-18 23:17         ` [PATCH v5 03/11] run-command: prepare command before forking Brandon Williams
                           ` (9 subsequent siblings)
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

Add a test to 't0061-run-command.sh' to ensure that run_command can
continue to execute scripts which don't include a '#!' line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t0061-run-command.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4aa..1a7490e29 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command can run a script without a #! line' '
+	cat >hello <<-\EOF &&
+	cat hello-script
+	EOF
+	chmod +x hello &&
+	test-run-command run-command ./hello >actual 2>err &&
+
+	test_cmp hello-script actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 03/11] run-command: prepare command before forking
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 01/11] t5550: use write_script to generate post-update hook Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-18 23:17         ` Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
                           ` (8 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

According to [1] we need to only call async-signal-safe operations between fork
and exec.  Using malloc to build the argv array isn't async-signal-safe.

In order to avoid allocation between 'fork()' and 'exec()' prepare the
argv array used in the exec call prior to forking the process.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 46 ++++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..d8d143795 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,21 @@ static void notify_parent(void)
 	 */
 	xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
+{
+	if (!cmd->argv[0])
+		die("BUG: command is empty");
+
+	if (cmd->git_cmd) {
+		argv_array_push(out, "git");
+		argv_array_pushv(out, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(out, cmd->argv);
+	} else {
+		argv_array_pushv(out, cmd->argv);
+	}
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +375,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	prepare_cmd(&argv, cmd);
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -437,12 +444,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +462,7 @@ int start_command(struct child_process *cmd)
 		mark_child_for_cleanup(cmd->pid, cmd);
 
 	/*
-	 * Wait for child's execvp. If the execvp succeeds (or if fork()
+	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
 	 * child process sends a single byte.
 	 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +471,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[1]);
 	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
 		/*
-		 * At this point we know that fork() succeeded, but execvp()
+		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +479,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (2 preceding siblings ...)
  2017-04-18 23:17         ` [PATCH v5 03/11] run-command: prepare command before forking Brandon Williams
@ 2017-04-18 23:17         ` Brandon Williams
  2017-04-18 23:17         ` [PATCH v5 05/11] string-list: add string_list_remove function Brandon Williams
                           ` (7 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

Convert the function used to exec from 'execvp()' to 'execv()' as the (p)
variant of exec isn't async-signal-safe and has the potential to call malloc
during the path resolution it performs.  Instead we simply do the path
resolution ourselves during the preparation stage prior to forking.  There also
don't exist any portable (p) variants which also take in an environment to use
in the exec'd process.  This allows easy migration to using 'execve()' in a
future patch.

Also, as noted in [1], in the event of an ENOEXEC the (p) variants of
exec will attempt to execute the command by interpreting it with the
'sh' utility.  To maintain this functionality, if 'execv()' fails with
ENOEXEC, start_command will atempt to execute the command by
interpreting it with 'sh'.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index d8d143795..1c7a3b611 100644
--- a/run-command.c
+++ b/run-command.c
@@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	if (!cmd->argv[0])
 		die("BUG: command is empty");
 
+	/*
+	 * Add SHELL_PATH so in the event exec fails with ENOEXEC we can
+	 * attempt to interpret the command with 'sh'.
+	 */
+	argv_array_push(out, SHELL_PATH);
+
 	if (cmd->git_cmd) {
 		argv_array_push(out, "git");
 		argv_array_pushv(out, cmd->argv);
@@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	} else {
 		argv_array_pushv(out, cmd->argv);
 	}
+
+	/*
+	 * If there are no '/' characters in the command then perform a path
+	 * lookup and use the resolved path as the command to exec.  If there
+	 * are no '/' characters or if the command wasn't found in the path,
+	 * have exec attempt to invoke the command directly.
+	 */
+	if (!strchr(out->argv[1], '/')) {
+		char *program = locate_in_PATH(out->argv[1]);
+		if (program) {
+			free((char *)out->argv[1]);
+			out->argv[1] = program;
+		}
+	}
 }
 #endif
 
@@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
 			}
 		}
 
-		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+		/*
+		 * Attempt to exec using the command and arguments starting at
+		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
+		 * be used in the event exec failed with ENOEXEC at which point
+		 * we will try to interpret the command using 'sh'.
+		 */
+		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		if (errno == ENOEXEC)
+			execv(argv.argv[0], (char *const *) argv.argv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 05/11] string-list: add string_list_remove function
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (3 preceding siblings ...)
  2017-04-18 23:17         ` [PATCH v5 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
@ 2017-04-18 23:17         ` Brandon Williams
  2017-04-18 23:31           ` Stefan Beller
  2017-04-18 23:18         ` [PATCH v5 06/11] run-command: prepare child environment before forking Brandon Williams
                           ` (6 subsequent siblings)
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:17 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

Teach string-list to be able to remove a string from a sorted
'struct string_list'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 string-list.c | 18 ++++++++++++++++++
 string-list.h |  5 +++++
 2 files changed, 23 insertions(+)

diff --git a/string-list.c b/string-list.c
index 45016ad86..8f7b69ada 100644
--- a/string-list.c
+++ b/string-list.c
@@ -67,6 +67,24 @@ struct string_list_item *string_list_insert(struct string_list *list, const char
 	return list->items + index;
 }
 
+void string_list_remove(struct string_list *list, const char *string,
+			int free_util)
+{
+	int exact_match;
+	int i = get_entry_index(list, string, &exact_match);
+
+	if (exact_match) {
+		if (list->strdup_strings)
+			free(list->items[i].string);
+		if (free_util)
+			free(list->items[i].util);
+
+		list->nr--;
+		memmove(list->items + i, list->items + i + 1,
+			(list->nr - i) * sizeof(struct string_list_item));
+	}
+}
+
 int string_list_has_string(const struct string_list *list, const char *string)
 {
 	int exact_match;
diff --git a/string-list.h b/string-list.h
index d3809a141..18520dbc8 100644
--- a/string-list.h
+++ b/string-list.h
@@ -63,6 +63,11 @@ int string_list_find_insert_index(const struct string_list *list, const char *st
 struct string_list_item *string_list_insert(struct string_list *list, const char *string);
 
 /*
+ * Removes the given string from the sorted list.
+ */
+void string_list_remove(struct string_list *list, const char *string, int free_util);
+
+/*
  * Checks if the given string is part of a sorted list. If it is part of the list,
  * return the coresponding string_list_item, NULL otherwise.
  */
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 06/11] run-command: prepare child environment before forking
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (4 preceding siblings ...)
  2017-04-18 23:17         ` [PATCH v5 05/11] string-list: add string_list_remove function Brandon Williams
@ 2017-04-18 23:18         ` Brandon Williams
  2017-04-18 23:18         ` [PATCH v5 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
                           ` (5 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 56 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1c7a3b611..15e2e74a7 100644
--- a/run-command.c
+++ b/run-command.c
@@ -267,6 +267,55 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 		}
 	}
 }
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	extern char **environ;
+	char **childenv;
+	struct string_list env = STRING_LIST_INIT_DUP;
+	struct strbuf key = STRBUF_INIT;
+	const char *const *p;
+	int i;
+
+	/* Construct a sorted string list consisting of the current environ */
+	for (p = (const char *const *) environ; p && *p; p++) {
+		const char *equals = strchr(*p, '=');
+
+		if (equals) {
+			strbuf_reset(&key);
+			strbuf_add(&key, *p, equals - *p);
+			string_list_append(&env, key.buf)->util = (void *) *p;
+		} else {
+			string_list_append(&env, *p)->util = (void *) *p;
+		}
+	}
+	string_list_sort(&env);
+
+	/* Merge in 'deltaenv' with the current environ */
+	for (p = deltaenv; p && *p; p++) {
+		const char *equals = strchr(*p, '=');
+
+		if (equals) {
+			/* ('key=value'), insert or replace entry */
+			strbuf_reset(&key);
+			strbuf_add(&key, *p, equals - *p);
+			string_list_insert(&env, key.buf)->util = (void *) *p;
+		} else {
+			/* otherwise ('key') remove existing entry */
+			string_list_remove(&env, *p, 0);
+		}
+	}
+
+	/* Create an array of 'char *' to be used as the childenv */
+	childenv = xmalloc((env.nr + 1) * sizeof(char *));
+	for (i = 0; i < env.nr; i++)
+		childenv[i] = env.items[i].util;
+	childenv[env.nr] = NULL;
+
+	string_list_clear(&env, 0);
+	strbuf_release(&key);
+	return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -395,12 +444,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	prepare_cmd(&argv, cmd);
+	childenv = prep_childenv(cmd->env);
 
 	cmd->pid = fork();
 	failed_errno = errno;
@@ -456,14 +507,6 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -471,9 +514,11 @@ int start_command(struct child_process *cmd)
 		 * be used in the event exec failed with ENOEXEC at which point
 		 * we will try to interpret the command using 'sh'.
 		 */
-		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		execve(argv.argv[1], (char *const *) argv.argv + 1,
+		       (char *const *) childenv);
 		if (errno == ENOEXEC)
-			execv(argv.argv[0], (char *const *) argv.argv);
+			execve(argv.argv[0], (char *const *) argv.argv,
+			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -509,6 +554,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[0]);
 
 	argv_array_clear(&argv);
+	free(childenv);
 }
 #else
 {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 07/11] run-command: don't die in child when duping /dev/null
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (5 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 06/11] run-command: prepare child environment before forking Brandon Williams
@ 2017-04-18 23:18         ` Brandon Williams
  2017-04-18 23:18         ` [PATCH v5 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
                           ` (4 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 15e2e74a7..b3a35dd82 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -444,12 +432,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
+	}
+
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
@@ -473,7 +469,7 @@ int start_command(struct child_process *cmd)
 		atexit(notify_parent);
 
 		if (cmd->no_stdin)
-			dup_devnull(0);
+			dup2(null_fd, 0);
 		else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
@@ -483,7 +479,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stderr)
-			dup_devnull(2);
+			dup2(null_fd, 2);
 		else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
@@ -493,7 +489,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stdout)
-			dup_devnull(1);
+			dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
@@ -553,6 +549,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (null_fd >= 0)
+		close(null_fd);
 	argv_array_clear(&argv);
 	free(childenv);
 }
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 08/11] run-command: eliminate calls to error handling functions in child
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (6 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-18 23:18         ` Brandon Williams
  2017-04-18 23:18         ` [PATCH v5 09/11] run-command: handle dup2 and close errors " Brandon Williams
                           ` (3 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 121 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index b3a35dd82..1f15714b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_SILENT,
+	CHILD_ERR_ERRNO
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+			    cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+		break;
+	case CHILD_ERR_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+		break;
+	}
+	set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
@@ -341,13 +409,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -435,6 +496,7 @@ int start_command(struct child_process *cmd)
 	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	struct child_err cerr;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -453,20 +515,16 @@ int start_command(struct child_process *cmd)
 	failed_errno = errno;
 	if (!cmd->pid) {
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * Ensure the default die/error/warn routines do not get
+		 * called, they can take stdio locks and malloc.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
 		if (cmd->no_stdin)
 			dup2(null_fd, 0);
@@ -501,8 +559,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -517,12 +574,11 @@ int start_command(struct child_process *cmd)
 			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
 	if (cmd->pid < 0)
@@ -533,17 +589,18 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 09/11] run-command: handle dup2 and close errors in child
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (7 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-18 23:18         ` " Brandon Williams
  2017-04-18 23:18         ` [PATCH v5 10/11] run-command: add note about forking and threading Brandon Williams
                           ` (2 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 58 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1f15714b1..615b6e9c9 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,8 @@ static int child_notifier = -1;
 
 enum child_errcode {
 	CHILD_ERR_CHDIR,
+	CHILD_ERR_DUP2,
+	CHILD_ERR_CLOSE,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -235,6 +237,24 @@ static void child_die(enum child_errcode err)
 	_exit(1);
 }
 
+static void child_dup2(int fd, int to)
+{
+	if (dup2(fd, to) < 0)
+		child_die(CHILD_ERR_DUP2);
+}
+
+static void child_close(int fd)
+{
+	if (close(fd))
+		child_die(CHILD_ERR_CLOSE);
+}
+
+static void child_close_pair(int fd[2])
+{
+	child_close(fd[0]);
+	child_close(fd[1]);
+}
+
 /*
  * parent will make it look like the child spewed a fatal error and died
  * this is needed to prevent changes to t0061.
@@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 		error_errno("exec '%s': cd to '%s' failed",
 			    cmd->argv[0], cmd->dir);
 		break;
+	case CHILD_ERR_DUP2:
+		error_errno("dup2() in child failed");
+		break;
+	case CHILD_ERR_CLOSE:
+		error_errno("close() in child failed");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -527,35 +553,35 @@ int start_command(struct child_process *cmd)
 		child_notifier = notify_pipe[1];
 
 		if (cmd->no_stdin)
-			dup2(null_fd, 0);
+			child_dup2(null_fd, 0);
 		else if (need_in) {
-			dup2(fdin[0], 0);
-			close_pair(fdin);
+			child_dup2(fdin[0], 0);
+			child_close_pair(fdin);
 		} else if (cmd->in) {
-			dup2(cmd->in, 0);
-			close(cmd->in);
+			child_dup2(cmd->in, 0);
+			child_close(cmd->in);
 		}
 
 		if (cmd->no_stderr)
-			dup2(null_fd, 2);
+			child_dup2(null_fd, 2);
 		else if (need_err) {
-			dup2(fderr[1], 2);
-			close_pair(fderr);
+			child_dup2(fderr[1], 2);
+			child_close_pair(fderr);
 		} else if (cmd->err > 1) {
-			dup2(cmd->err, 2);
-			close(cmd->err);
+			child_dup2(cmd->err, 2);
+			child_close(cmd->err);
 		}
 
 		if (cmd->no_stdout)
-			dup2(null_fd, 1);
+			child_dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
-			dup2(2, 1);
+			child_dup2(2, 1);
 		else if (need_out) {
-			dup2(fdout[1], 1);
-			close_pair(fdout);
+			child_dup2(fdout[1], 1);
+			child_close_pair(fdout);
 		} else if (cmd->out > 1) {
-			dup2(cmd->out, 1);
-			close(cmd->out);
+			child_dup2(cmd->out, 1);
+			child_close(cmd->out);
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 10/11] run-command: add note about forking and threading
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (8 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 09/11] run-command: handle dup2 and close errors " Brandon Williams
@ 2017-04-18 23:18         ` Brandon Williams
  2017-04-18 23:18         ` [PATCH v5 11/11] run-command: block signals between fork and execve Brandon Williams
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, e, jrnieder

All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index 615b6e9c9..df1edd963 100644
--- a/run-command.c
+++ b/run-command.c
@@ -537,6 +537,15 @@ int start_command(struct child_process *cmd)
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.  This means only
+	 * Async-Signal-Safe functions are permitted in the child.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v5 11/11] run-command: block signals between fork and execve
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (9 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 10/11] run-command: add note about forking and threading Brandon Williams
@ 2017-04-18 23:18         ` Brandon Williams
  2017-04-19  6:00           ` Johannes Sixt
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:18 UTC (permalink / raw)
  To: git; +Cc: Eric Wong, jrnieder, Brandon Williams

From: Eric Wong <e@80x24.org>

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/run-command.c b/run-command.c
index df1edd963..1f3c38e43 100644
--- a/run-command.c
+++ b/run-command.c
@@ -215,6 +215,7 @@ enum child_errcode {
 	CHILD_ERR_CHDIR,
 	CHILD_ERR_DUP2,
 	CHILD_ERR_CLOSE,
+	CHILD_ERR_SIGPROCMASK,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 	case CHILD_ERR_CLOSE:
 		error_errno("close() in child failed");
 		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -400,6 +404,53 @@ static char **prep_childenv(const char *const *deltaenv)
 }
 #endif
 
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
+#endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+
 static inline void set_cloexec(int fd)
 {
 	int flags = fcntl(fd, F_GETFD);
@@ -523,6 +574,7 @@ int start_command(struct child_process *cmd)
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -536,6 +588,7 @@ int start_command(struct child_process *cmd)
 
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
+	atfork_prepare(&as);
 
 	/*
 	 * NOTE: In order to prevent deadlocking when using threads special
@@ -549,6 +602,7 @@ int start_command(struct child_process *cmd)
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
 		/*
 		 * Ensure the default die/error/warn routines do not get
 		 * called, they can take stdio locks and malloc.
@@ -597,6 +651,19 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_CHDIR);
 
 		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before execve below
+		 */
+		for (sig = 1; sig < NSIG; sig++) {
+			/* ignored signals get reset to SIG_DFL on execve */
+			if (signal(sig, SIG_DFL) == SIG_IGN)
+				signal(sig, SIG_IGN);
+		}
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
+
+		/*
 		 * Attempt to exec using the command and arguments starting at
 		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
 		 * be used in the event exec failed with ENOEXEC at which point
@@ -616,6 +683,7 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 05/11] string-list: add string_list_remove function
  2017-04-18 23:17         ` [PATCH v5 05/11] string-list: add string_list_remove function Brandon Williams
@ 2017-04-18 23:31           ` Stefan Beller
  2017-04-18 23:36             ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Stefan Beller @ 2017-04-18 23:31 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Eric Wong, Jonathan Nieder

On Tue, Apr 18, 2017 at 4:17 PM, Brandon Williams <bmwill@google.com> wrote:

>
> +void string_list_remove(struct string_list *list, const char *string,
> +                       int free_util)
> +{
> +       int exact_match;
> +       int i = get_entry_index(list, string, &exact_match);
> +
> +       if (exact_match) {
> +               if (list->strdup_strings)
> +                       free(list->items[i].string);
> +               if (free_util)
> +                       free(list->items[i].util);
> +
> +               list->nr--;
> +               memmove(list->items + i, list->items + i + 1,
> +                       (list->nr - i) * sizeof(struct string_list_item));
> +       }

Looks correct. I shortly wondered if we'd have any value in returing
`exact_match`, as that may save the caller some code, as I imagine the
caller to be:

  if (!string_list_has_string(&list, string))
    die("BUG: ...");
  string_list_remove(&list, string, 0);

which could be simplified if we had the exact_match returned, i.e.
the string_list_remove returns the implicit string_list_has_string.

>  /*
> + * Removes the given string from the sorted list.

What happens when the string is not found?

> + */
> +void string_list_remove(struct string_list *list, const char *string, int free_util);

How much do we care about (eventual) consistency? ;)
i.e. mark it extern ?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 05/11] string-list: add string_list_remove function
  2017-04-18 23:31           ` Stefan Beller
@ 2017-04-18 23:36             ` Brandon Williams
  2017-04-18 23:40               ` Stefan Beller
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-18 23:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Eric Wong, Jonathan Nieder

On 04/18, Stefan Beller wrote:
> On Tue, Apr 18, 2017 at 4:17 PM, Brandon Williams <bmwill@google.com> wrote:
> 
> >
> > +void string_list_remove(struct string_list *list, const char *string,
> > +                       int free_util)
> > +{
> > +       int exact_match;
> > +       int i = get_entry_index(list, string, &exact_match);
> > +
> > +       if (exact_match) {
> > +               if (list->strdup_strings)
> > +                       free(list->items[i].string);
> > +               if (free_util)
> > +                       free(list->items[i].util);
> > +
> > +               list->nr--;
> > +               memmove(list->items + i, list->items + i + 1,
> > +                       (list->nr - i) * sizeof(struct string_list_item));
> > +       }
> 
> Looks correct. I shortly wondered if we'd have any value in returing
> `exact_match`, as that may save the caller some code, as I imagine the
> caller to be:
> 
>   if (!string_list_has_string(&list, string))
>     die("BUG: ...");
>   string_list_remove(&list, string, 0);
> 
> which could be simplified if we had the exact_match returned, i.e.
> the string_list_remove returns the implicit string_list_has_string.

I don't really see the value in this, as the only caller doesn't need it
right now.

> 
> >  /*
> > + * Removes the given string from the sorted list.
> 
> What happens when the string is not found?

nothing.

> 
> > + */
> > +void string_list_remove(struct string_list *list, const char *string, int free_util);
> 
> How much do we care about (eventual) consistency? ;)
> i.e. mark it extern ?

If I need to do another reroll I can mark it extern.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 05/11] string-list: add string_list_remove function
  2017-04-18 23:36             ` Brandon Williams
@ 2017-04-18 23:40               ` Stefan Beller
  0 siblings, 0 replies; 140+ messages in thread
From: Stefan Beller @ 2017-04-18 23:40 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Eric Wong, Jonathan Nieder

On Tue, Apr 18, 2017 at 4:36 PM, Brandon Williams <bmwill@google.com> wrote:
> On 04/18, Stefan Beller wrote:
>> On Tue, Apr 18, 2017 at 4:17 PM, Brandon Williams <bmwill@google.com> wrote:
>>
>> >
>> > +void string_list_remove(struct string_list *list, const char *string,
>> > +                       int free_util)
>> > +{
>> > +       int exact_match;
>> > +       int i = get_entry_index(list, string, &exact_match);
>> > +
>> > +       if (exact_match) {
>> > +               if (list->strdup_strings)
>> > +                       free(list->items[i].string);
>> > +               if (free_util)
>> > +                       free(list->items[i].util);
>> > +
>> > +               list->nr--;
>> > +               memmove(list->items + i, list->items + i + 1,
>> > +                       (list->nr - i) * sizeof(struct string_list_item));
>> > +       }
>>
>> Looks correct. I shortly wondered if we'd have any value in returing
>> `exact_match`, as that may save the caller some code, as I imagine the
>> caller to be:
>>
>>   if (!string_list_has_string(&list, string))
>>     die("BUG: ...");
>>   string_list_remove(&list, string, 0);
>>
>> which could be simplified if we had the exact_match returned, i.e.
>> the string_list_remove returns the implicit string_list_has_string.
>
> I don't really see the value in this, as the only caller doesn't need it
> right now.

yeah, I guess we can add such functionality later.

>
>>
>> >  /*
>> > + * Removes the given string from the sorted list.
>>
>> What happens when the string is not found?
>
> nothing.

yeah that is what I figured from reading the code. The question could
have been worded: Do we want to document what happens if the string
is not found? (A reader in the future may wonder if it is even allowed to
call this function without having consulted string_list_has_string).

>
>>
>> > + */
>> > +void string_list_remove(struct string_list *list, const char *string, int free_util);
>>
>> How much do we care about (eventual) consistency? ;)
>> i.e. mark it extern ?
>
> If I need to do another reroll I can mark it extern.

Thanks! This doesn't require a reroll on its own and all other patches
look sane to me.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-18 23:17         ` [PATCH v5 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-19  5:43           ` Johannes Sixt
  2017-04-19  6:21             ` Johannes Sixt
  0 siblings, 1 reply; 140+ messages in thread
From: Johannes Sixt @ 2017-04-19  5:43 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e, jrnieder

Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> Add a test to 't0061-run-command.sh' to ensure that run_command can
> continue to execute scripts which don't include a '#!' line.

Why is this necessary? I am pretty certain that our emulation layer on 
Windows can only run scripts with a shbang line.

-- Hannes


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 11/11] run-command: block signals between fork and execve
  2017-04-18 23:18         ` [PATCH v5 11/11] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-19  6:00           ` Johannes Sixt
  2017-04-19  7:48             ` Eric Wong
  0 siblings, 1 reply; 140+ messages in thread
From: Johannes Sixt @ 2017-04-19  6:00 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Eric Wong, jrnieder

Am 19.04.2017 um 01:18 schrieb Brandon Williams:
> @@ -400,6 +404,53 @@ static char **prep_childenv(const char *const *deltaenv)
>  }
>  #endif
>

Does this #endif in this hunk context belong to an #ifndef 
GIT_WINDOWS_NATIVE? If so, I wonder why these new functions are outside 
these brackets? An oversight?

> +struct atfork_state {
> +#ifndef NO_PTHREADS
> +	int cs;
> +#endif
> +	sigset_t old;
> +};
...

-- Hannes


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19  5:43           ` Johannes Sixt
@ 2017-04-19  6:21             ` Johannes Sixt
  2017-04-19 15:56               ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Johannes Sixt @ 2017-04-19  6:21 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e, jrnieder

Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> Am 19.04.2017 um 01:17 schrieb Brandon Williams:
>> Add a test to 't0061-run-command.sh' to ensure that run_command can
>> continue to execute scripts which don't include a '#!' line.
>
> Why is this necessary? I am pretty certain that our emulation layer on
> Windows can only run scripts with a shbang line.

Nevermind. It is a compatibility feature: People may have written their 
hooks and scripts without #!, and these must continue to work where they 
worked before.

Please protect the new test with !MINGW.

Thanks,
-- Hannes


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 11/11] run-command: block signals between fork and execve
  2017-04-19  6:00           ` Johannes Sixt
@ 2017-04-19  7:48             ` Eric Wong
  2017-04-19 16:10               ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Eric Wong @ 2017-04-19  7:48 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Brandon Williams, git, jrnieder

Johannes Sixt <j6t@kdbg.org> wrote:
> Am 19.04.2017 um 01:18 schrieb Brandon Williams:
> >@@ -400,6 +404,53 @@ static char **prep_childenv(const char *const *deltaenv)
> > }
> > #endif
> >
> 
> Does this #endif in this hunk context belong to an #ifndef
> GIT_WINDOWS_NATIVE? If so, I wonder why these new functions are outside
> these brackets? An oversight?

Seems like an oversight, sorry about that.
All the new atfork stuff I added should be protected by
#ifndef GIT_WINDOWS_NATIVE.

Brandon / Johannes: can you fixup on your end?

I wonder if some of this OS-specific code would be more
easily maintained if split out further to OS-specific files,
even at the risk of some code duplication.

And/or perhaps label all #else and #endif statements with
comments, and limit the scope of each ifdef block to be
per-function for with tiny attention spans like me :x

> >+struct atfork_state {
> >+#ifndef NO_PTHREADS
> >+	int cs;
> >+#endif
> >+	sigset_t old;
> >+};
> ...
> 
> -- Hannes
> 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19  6:21             ` Johannes Sixt
@ 2017-04-19 15:56               ` Brandon Williams
  2017-04-19 18:18                 ` Johannes Sixt
  2017-04-20 10:47                 ` Johannes Schindelin
  0 siblings, 2 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 15:56 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, e, jrnieder

On 04/19, Johannes Sixt wrote:
> Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> >>Add a test to 't0061-run-command.sh' to ensure that run_command can
> >>continue to execute scripts which don't include a '#!' line.
> >
> >Why is this necessary? I am pretty certain that our emulation layer on
> >Windows can only run scripts with a shbang line.

Out of curiosity how did you have t5550 passing on windows then?  Since
the first patch in this series fixes a that test which doesn't have a
'#!' line.

> 
> Nevermind. It is a compatibility feature: People may have written
> their hooks and scripts without #!, and these must continue to work
> where they worked before.
> 
> Please protect the new test with !MINGW.

Will do.

> 
> Thanks,
> -- Hannes
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 11/11] run-command: block signals between fork and execve
  2017-04-19  7:48             ` Eric Wong
@ 2017-04-19 16:10               ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 16:10 UTC (permalink / raw)
  To: Eric Wong; +Cc: Johannes Sixt, git, jrnieder

On 04/19, Eric Wong wrote:
> Johannes Sixt <j6t@kdbg.org> wrote:
> > Am 19.04.2017 um 01:18 schrieb Brandon Williams:
> > >@@ -400,6 +404,53 @@ static char **prep_childenv(const char *const *deltaenv)
> > > }
> > > #endif
> > >
> > 
> > Does this #endif in this hunk context belong to an #ifndef
> > GIT_WINDOWS_NATIVE? If so, I wonder why these new functions are outside
> > these brackets? An oversight?
> 
> Seems like an oversight, sorry about that.
> All the new atfork stuff I added should be protected by
> #ifndef GIT_WINDOWS_NATIVE.
> 
> Brandon / Johannes: can you fixup on your end?

Correct, this is an oversight I should have caught :)
No worries though, I'll fix it up in a reroll (since I'm going to be
need to send out another version to fix up another patch in the series
for Windows)

> 
> I wonder if some of this OS-specific code would be more
> easily maintained if split out further to OS-specific files,
> even at the risk of some code duplication.
> 
> And/or perhaps label all #else and #endif statements with
> comments, and limit the scope of each ifdef block to be
> per-function for with tiny attention spans like me :x

Yeah I'm not sure I know the best way to prevent this from happening,
thankfully we have windows folk who keep us honest :D

> 
> > >+struct atfork_state {
> > >+#ifndef NO_PTHREADS
> > >+	int cs;
> > >+#endif
> > >+	sigset_t old;
> > >+};
> > ...
> > 
> > -- Hannes
> > 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19 15:56               ` Brandon Williams
@ 2017-04-19 18:18                 ` Johannes Sixt
  2017-04-20 10:47                 ` Johannes Schindelin
  1 sibling, 0 replies; 140+ messages in thread
From: Johannes Sixt @ 2017-04-19 18:18 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, e, jrnieder

Am 19.04.2017 um 17:56 schrieb Brandon Williams:
> On 04/19, Johannes Sixt wrote:
>> Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
>>> Windows can only run scripts with a shbang line.
>
> Out of curiosity how did you have t5550 passing on windows then?  Since
> the first patch in this series fixes a that test which doesn't have a
> '#!' line.

I guess, I don't run it at all in my setup.

-- Hannes


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 00/11] forking and threading
  2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
                           ` (10 preceding siblings ...)
  2017-04-18 23:18         ` [PATCH v5 11/11] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-19 23:13         ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 01/11] t5550: use write_script to generate post-update hook Brandon Williams
                             ` (11 more replies)
  11 siblings, 12 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Changes in v6:
* fix some windows compat issues
* better comment on the string_list_remove function (also marked extern)

Brandon Williams (10):
  t5550: use write_script to generate post-update hook
  t0061: run_command executes scripts without a #! line
  run-command: prepare command before forking
  run-command: use the async-signal-safe execv instead of execvp
  string-list: add string_list_remove function
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: handle dup2 and close errors in child
  run-command: add note about forking and threading

Eric Wong (1):
  run-command: block signals between fork and execve

 run-command.c              | 404 +++++++++++++++++++++++++++++++++++----------
 string-list.c              |  18 ++
 string-list.h              |   7 +
 t/t0061-run-command.sh     |  11 ++
 t/t5550-http-fetch-dumb.sh |   5 +-
 5 files changed, 360 insertions(+), 85 deletions(-)

--- interdiff with 'origin/bw/forking-and-threading'

diff --git a/run-command.c b/run-command.c
index 1f3c38e43..a97d7bf9f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -402,7 +402,6 @@ static char **prep_childenv(const char *const *deltaenv)
 	strbuf_release(&key);
 	return childenv;
 }
-#endif
 
 struct atfork_state {
 #ifndef NO_PTHREADS
@@ -450,6 +449,7 @@ static void atfork_parent(struct atfork_state *as)
 		"restoring signal mask");
 #endif
 }
+#endif /* GIT_WINDOWS_NATIVE */
 
 static inline void set_cloexec(int fd)
 {
diff --git a/string-list.h b/string-list.h
index 18520dbc8..29bfb7ae4 100644
--- a/string-list.h
+++ b/string-list.h
@@ -64,8 +64,10 @@ struct string_list_item *string_list_insert(struct string_list *list, const char
 
 /*
  * Removes the given string from the sorted list.
+ * If the string doesn't exist, the list is not altered.
  */
-void string_list_remove(struct string_list *list, const char *string, int free_util);
+extern void string_list_remove(struct string_list *list, const char *string,
+			       int free_util);
 
 /*
  * Checks if the given string is part of a sorted list. If it is part of the list,
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 1a7490e29..98c09dd98 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,7 +26,7 @@ test_expect_success 'run_command can run a command' '
 	test_cmp empty err
 '
 
-test_expect_success 'run_command can run a script without a #! line' '
+test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	cat >hello <<-\EOF &&
 	cat hello-script
 	EOF

-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 01/11] t5550: use write_script to generate post-update hook
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
                             ` (10 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects'
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 	 git config core.bare true &&
 	 mkdir -p hooks &&
-	 echo "exec git update-server-info" >hooks/post-update &&
-	 chmod +x hooks/post-update &&
+	 write_script "hooks/post-update" <<-\EOF &&
+	 exec git update-server-info
+	EOF
 	 hooks/post-update
 	) &&
 	git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 01/11] t5550: use write_script to generate post-update hook Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-20 10:49             ` Johannes Schindelin
  2017-04-19 23:13           ` [PATCH v6 03/11] run-command: prepare command before forking Brandon Williams
                             ` (9 subsequent siblings)
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Add a test to 't0061-run-command.sh' to ensure that run_command can
continue to execute scripts which don't include a '#!' line.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/t0061-run-command.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4aa..98c09dd98 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' '
 	test_cmp empty err
 '
 
+test_expect_success !MINGW 'run_command can run a script without a #! line' '
+	cat >hello <<-\EOF &&
+	cat hello-script
+	EOF
+	chmod +x hello &&
+	test-run-command run-command ./hello >actual 2>err &&
+
+	test_cmp hello-script actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 03/11] run-command: prepare command before forking
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 01/11] t5550: use write_script to generate post-update hook Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
                             ` (8 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

According to [1] we need to only call async-signal-safe operations between fork
and exec.  Using malloc to build the argv array isn't async-signal-safe.

In order to avoid allocation between 'fork()' and 'exec()' prepare the
argv array used in the exec call prior to forking the process.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 46 ++++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..d8d143795 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-	struct argv_array nargv = ARGV_ARRAY_INIT;
-	prepare_shell_cmd(&nargv, argv);
-	trace_argv_printf(nargv.argv, "trace: exec:");
-	sane_execvp(nargv.argv[0], (char **)nargv.argv);
-	argv_array_clear(&nargv);
-	return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,21 @@ static void notify_parent(void)
 	 */
 	xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
+{
+	if (!cmd->argv[0])
+		die("BUG: command is empty");
+
+	if (cmd->git_cmd) {
+		argv_array_push(out, "git");
+		argv_array_pushv(out, cmd->argv);
+	} else if (cmd->use_shell) {
+		prepare_shell_cmd(out, cmd->argv);
+	} else {
+		argv_array_pushv(out, cmd->argv);
+	}
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +375,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	prepare_cmd(&argv, cmd);
+
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
@@ -437,12 +444,9 @@ int start_command(struct child_process *cmd)
 					unsetenv(*cmd->env);
 			}
 		}
-		if (cmd->git_cmd)
-			execv_git_cmd(cmd->argv);
-		else if (cmd->use_shell)
-			execv_shell_cmd(cmd->argv);
-		else
-			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
 				error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +462,7 @@ int start_command(struct child_process *cmd)
 		mark_child_for_cleanup(cmd->pid, cmd);
 
 	/*
-	 * Wait for child's execvp. If the execvp succeeds (or if fork()
+	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
 	 * child process sends a single byte.
 	 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +471,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[1]);
 	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
 		/*
-		 * At this point we know that fork() succeeded, but execvp()
+		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +479,8 @@ int start_command(struct child_process *cmd)
 		cmd->pid = -1;
 	}
 	close(notify_pipe[0]);
+
+	argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (2 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 03/11] run-command: prepare command before forking Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-05-17  2:15             ` Junio C Hamano
  2017-04-19 23:13           ` [PATCH v6 05/11] string-list: add string_list_remove function Brandon Williams
                             ` (7 subsequent siblings)
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Convert the function used to exec from 'execvp()' to 'execv()' as the (p)
variant of exec isn't async-signal-safe and has the potential to call malloc
during the path resolution it performs.  Instead we simply do the path
resolution ourselves during the preparation stage prior to forking.  There also
don't exist any portable (p) variants which also take in an environment to use
in the exec'd process.  This allows easy migration to using 'execve()' in a
future patch.

Also, as noted in [1], in the event of an ENOEXEC the (p) variants of
exec will attempt to execute the command by interpreting it with the
'sh' utility.  To maintain this functionality, if 'execv()' fails with
ENOEXEC, start_command will atempt to execute the command by
interpreting it with 'sh'.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index d8d143795..1c7a3b611 100644
--- a/run-command.c
+++ b/run-command.c
@@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	if (!cmd->argv[0])
 		die("BUG: command is empty");
 
+	/*
+	 * Add SHELL_PATH so in the event exec fails with ENOEXEC we can
+	 * attempt to interpret the command with 'sh'.
+	 */
+	argv_array_push(out, SHELL_PATH);
+
 	if (cmd->git_cmd) {
 		argv_array_push(out, "git");
 		argv_array_pushv(out, cmd->argv);
@@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 	} else {
 		argv_array_pushv(out, cmd->argv);
 	}
+
+	/*
+	 * If there are no '/' characters in the command then perform a path
+	 * lookup and use the resolved path as the command to exec.  If there
+	 * are no '/' characters or if the command wasn't found in the path,
+	 * have exec attempt to invoke the command directly.
+	 */
+	if (!strchr(out->argv[1], '/')) {
+		char *program = locate_in_PATH(out->argv[1]);
+		if (program) {
+			free((char *)out->argv[1]);
+			out->argv[1] = program;
+		}
+	}
 }
 #endif
 
@@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
 			}
 		}
 
-		sane_execvp(argv.argv[0], (char *const *) argv.argv);
+		/*
+		 * Attempt to exec using the command and arguments starting at
+		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
+		 * be used in the event exec failed with ENOEXEC at which point
+		 * we will try to interpret the command using 'sh'.
+		 */
+		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		if (errno == ENOEXEC)
+			execv(argv.argv[0], (char *const *) argv.argv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 05/11] string-list: add string_list_remove function
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (3 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 06/11] run-command: prepare child environment before forking Brandon Williams
                             ` (6 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Teach string-list to be able to remove a string from a sorted
'struct string_list'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 string-list.c | 18 ++++++++++++++++++
 string-list.h |  7 +++++++
 2 files changed, 25 insertions(+)

diff --git a/string-list.c b/string-list.c
index 45016ad86..8f7b69ada 100644
--- a/string-list.c
+++ b/string-list.c
@@ -67,6 +67,24 @@ struct string_list_item *string_list_insert(struct string_list *list, const char
 	return list->items + index;
 }
 
+void string_list_remove(struct string_list *list, const char *string,
+			int free_util)
+{
+	int exact_match;
+	int i = get_entry_index(list, string, &exact_match);
+
+	if (exact_match) {
+		if (list->strdup_strings)
+			free(list->items[i].string);
+		if (free_util)
+			free(list->items[i].util);
+
+		list->nr--;
+		memmove(list->items + i, list->items + i + 1,
+			(list->nr - i) * sizeof(struct string_list_item));
+	}
+}
+
 int string_list_has_string(const struct string_list *list, const char *string)
 {
 	int exact_match;
diff --git a/string-list.h b/string-list.h
index d3809a141..29bfb7ae4 100644
--- a/string-list.h
+++ b/string-list.h
@@ -63,6 +63,13 @@ int string_list_find_insert_index(const struct string_list *list, const char *st
 struct string_list_item *string_list_insert(struct string_list *list, const char *string);
 
 /*
+ * Removes the given string from the sorted list.
+ * If the string doesn't exist, the list is not altered.
+ */
+extern void string_list_remove(struct string_list *list, const char *string,
+			       int free_util);
+
+/*
  * Checks if the given string is part of a sorted list. If it is part of the list,
  * return the coresponding string_list_item, NULL otherwise.
  */
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 06/11] run-command: prepare child environment before forking
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (4 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 05/11] string-list: add string_list_remove function Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
                             ` (5 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 56 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1c7a3b611..15e2e74a7 100644
--- a/run-command.c
+++ b/run-command.c
@@ -267,6 +267,55 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
 		}
 	}
 }
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+	extern char **environ;
+	char **childenv;
+	struct string_list env = STRING_LIST_INIT_DUP;
+	struct strbuf key = STRBUF_INIT;
+	const char *const *p;
+	int i;
+
+	/* Construct a sorted string list consisting of the current environ */
+	for (p = (const char *const *) environ; p && *p; p++) {
+		const char *equals = strchr(*p, '=');
+
+		if (equals) {
+			strbuf_reset(&key);
+			strbuf_add(&key, *p, equals - *p);
+			string_list_append(&env, key.buf)->util = (void *) *p;
+		} else {
+			string_list_append(&env, *p)->util = (void *) *p;
+		}
+	}
+	string_list_sort(&env);
+
+	/* Merge in 'deltaenv' with the current environ */
+	for (p = deltaenv; p && *p; p++) {
+		const char *equals = strchr(*p, '=');
+
+		if (equals) {
+			/* ('key=value'), insert or replace entry */
+			strbuf_reset(&key);
+			strbuf_add(&key, *p, equals - *p);
+			string_list_insert(&env, key.buf)->util = (void *) *p;
+		} else {
+			/* otherwise ('key') remove existing entry */
+			string_list_remove(&env, *p, 0);
+		}
+	}
+
+	/* Create an array of 'char *' to be used as the childenv */
+	childenv = xmalloc((env.nr + 1) * sizeof(char *));
+	for (i = 0; i < env.nr; i++)
+		childenv[i] = env.items[i].util;
+	childenv[env.nr] = NULL;
+
+	string_list_clear(&env, 0);
+	strbuf_release(&key);
+	return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -395,12 +444,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
 	prepare_cmd(&argv, cmd);
+	childenv = prep_childenv(cmd->env);
 
 	cmd->pid = fork();
 	failed_errno = errno;
@@ -456,14 +507,6 @@ int start_command(struct child_process *cmd)
 		if (cmd->dir && chdir(cmd->dir))
 			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
 			    cmd->dir);
-		if (cmd->env) {
-			for (; *cmd->env; cmd->env++) {
-				if (strchr(*cmd->env, '='))
-					putenv((char *)*cmd->env);
-				else
-					unsetenv(*cmd->env);
-			}
-		}
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -471,9 +514,11 @@ int start_command(struct child_process *cmd)
 		 * be used in the event exec failed with ENOEXEC at which point
 		 * we will try to interpret the command using 'sh'.
 		 */
-		execv(argv.argv[1], (char *const *) argv.argv + 1);
+		execve(argv.argv[1], (char *const *) argv.argv + 1,
+		       (char *const *) childenv);
 		if (errno == ENOEXEC)
-			execv(argv.argv[0], (char *const *) argv.argv);
+			execve(argv.argv[0], (char *const *) argv.argv,
+			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
 			if (!cmd->silent_exec_failure)
@@ -509,6 +554,7 @@ int start_command(struct child_process *cmd)
 	close(notify_pipe[0]);
 
 	argv_array_clear(&argv);
+	free(childenv);
 }
 #else
 {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 07/11] run-command: don't die in child when duping /dev/null
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (5 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 06/11] run-command: prepare child environment before forking Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
                             ` (4 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 15e2e74a7..b3a35dd82 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-	int fd = open("/dev/null", O_RDWR);
-	if (fd < 0)
-		die_errno(_("open /dev/null failed"));
-	if (dup2(fd, to) < 0)
-		die_errno(_("dup2(%d,%d) failed"), fd, to);
-	close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -444,12 +432,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
 	int notify_pipe[2];
+	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
 
+	if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+		null_fd = open("/dev/null", O_RDWR | O_CLOEXEC);
+		if (null_fd < 0)
+			die_errno(_("open /dev/null failed"));
+		set_cloexec(null_fd);
+	}
+
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
@@ -473,7 +469,7 @@ int start_command(struct child_process *cmd)
 		atexit(notify_parent);
 
 		if (cmd->no_stdin)
-			dup_devnull(0);
+			dup2(null_fd, 0);
 		else if (need_in) {
 			dup2(fdin[0], 0);
 			close_pair(fdin);
@@ -483,7 +479,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stderr)
-			dup_devnull(2);
+			dup2(null_fd, 2);
 		else if (need_err) {
 			dup2(fderr[1], 2);
 			close_pair(fderr);
@@ -493,7 +489,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->no_stdout)
-			dup_devnull(1);
+			dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
 			dup2(2, 1);
 		else if (need_out) {
@@ -553,6 +549,8 @@ int start_command(struct child_process *cmd)
 	}
 	close(notify_pipe[0]);
 
+	if (null_fd >= 0)
+		close(null_fd);
 	argv_array_clear(&argv);
 	free(childenv);
 }
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 08/11] run-command: eliminate calls to error handling functions in child
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (6 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 09/11] run-command: handle dup2 and close errors " Brandon Williams
                             ` (3 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 121 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index b3a35dd82..1f15714b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+	CHILD_ERR_CHDIR,
+	CHILD_ERR_ENOENT,
+	CHILD_ERR_SILENT,
+	CHILD_ERR_ERRNO
+};
+
+struct child_err {
+	enum child_errcode err;
+	int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-	/*
-	 * execvp failed.  If possible, we'd like to let start_command
-	 * know, so failures like ENOENT can be handled right away; but
-	 * otherwise, finish_command will still report the error.
-	 */
-	xwrite(child_notifier, "", 1);
+	struct child_err buf;
+
+	buf.err = err;
+	buf.syserr = errno;
+
+	/* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+	xwrite(child_notifier, &buf, sizeof(buf));
+	_exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+	const char msg[] = "error() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+	const char msg[] = "warn() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+	const char msg[] = "die() should not be called in child\n";
+	xwrite(2, msg, sizeof(msg) - 1);
+	_exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+	static void (*old_errfn)(const char *err, va_list params);
+
+	old_errfn = get_error_routine();
+	set_error_routine(fake_fatal);
+	errno = cerr->syserr;
+
+	switch (cerr->err) {
+	case CHILD_ERR_CHDIR:
+		error_errno("exec '%s': cd to '%s' failed",
+			    cmd->argv[0], cmd->dir);
+		break;
+	case CHILD_ERR_ENOENT:
+		error_errno("cannot run %s", cmd->argv[0]);
+		break;
+	case CHILD_ERR_SILENT:
+		break;
+	case CHILD_ERR_ERRNO:
+		error_errno("cannot exec '%s'", cmd->argv[0]);
+		break;
+	}
+	set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
@@ -341,13 +409,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
@@ -435,6 +496,7 @@ int start_command(struct child_process *cmd)
 	int null_fd = -1;
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
+	struct child_err cerr;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -453,20 +515,16 @@ int start_command(struct child_process *cmd)
 	failed_errno = errno;
 	if (!cmd->pid) {
 		/*
-		 * Redirect the channel to write syscall error messages to
-		 * before redirecting the process's stderr so that all die()
-		 * in subsequent call paths use the parent's stderr.
+		 * Ensure the default die/error/warn routines do not get
+		 * called, they can take stdio locks and malloc.
 		 */
-		if (cmd->no_stderr || need_err) {
-			int child_err = dup(2);
-			set_cloexec(child_err);
-			set_error_handle(fdopen(child_err, "w"));
-		}
+		set_die_routine(child_die_fn);
+		set_error_routine(child_error_fn);
+		set_warn_routine(child_warn_fn);
 
 		close(notify_pipe[0]);
 		set_cloexec(notify_pipe[1]);
 		child_notifier = notify_pipe[1];
-		atexit(notify_parent);
 
 		if (cmd->no_stdin)
 			dup2(null_fd, 0);
@@ -501,8 +559,7 @@ int start_command(struct child_process *cmd)
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-			die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
-			    cmd->dir);
+			child_die(CHILD_ERR_CHDIR);
 
 		/*
 		 * Attempt to exec using the command and arguments starting at
@@ -517,12 +574,11 @@ int start_command(struct child_process *cmd)
 			       (char *const *) childenv);
 
 		if (errno == ENOENT) {
-			if (!cmd->silent_exec_failure)
-				error("cannot run %s: %s", cmd->argv[0],
-					strerror(ENOENT));
-			exit(127);
+			if (cmd->silent_exec_failure)
+				child_die(CHILD_ERR_SILENT);
+			child_die(CHILD_ERR_ENOENT);
 		} else {
-			die_errno("cannot exec '%s'", cmd->argv[0]);
+			child_die(CHILD_ERR_ERRNO);
 		}
 	}
 	if (cmd->pid < 0)
@@ -533,17 +589,18 @@ int start_command(struct child_process *cmd)
 	/*
 	 * Wait for child's exec. If the exec succeeds (or if fork()
 	 * failed), EOF is seen immediately by the parent. Otherwise, the
-	 * child process sends a single byte.
+	 * child process sends a child_err struct.
 	 * Note that use of this infrastructure is completely advisory,
 	 * therefore, we keep error checks minimal.
 	 */
 	close(notify_pipe[1]);
-	if (read(notify_pipe[0], &notify_pipe[1], 1) == 1) {
+	if (xread(notify_pipe[0], &cerr, sizeof(cerr)) == sizeof(cerr)) {
 		/*
 		 * At this point we know that fork() succeeded, but exec()
 		 * failed. Errors have been reported to our stderr.
 		 */
 		wait_or_whine(cmd->pid, cmd->argv[0], 0);
+		child_err_spew(cmd, &cerr);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 09/11] run-command: handle dup2 and close errors in child
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (7 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
@ 2017-04-19 23:13           ` " Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 10/11] run-command: add note about forking and threading Brandon Williams
                             ` (2 subsequent siblings)
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 58 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1f15714b1..615b6e9c9 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,8 @@ static int child_notifier = -1;
 
 enum child_errcode {
 	CHILD_ERR_CHDIR,
+	CHILD_ERR_DUP2,
+	CHILD_ERR_CLOSE,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -235,6 +237,24 @@ static void child_die(enum child_errcode err)
 	_exit(1);
 }
 
+static void child_dup2(int fd, int to)
+{
+	if (dup2(fd, to) < 0)
+		child_die(CHILD_ERR_DUP2);
+}
+
+static void child_close(int fd)
+{
+	if (close(fd))
+		child_die(CHILD_ERR_CLOSE);
+}
+
+static void child_close_pair(int fd[2])
+{
+	child_close(fd[0]);
+	child_close(fd[1]);
+}
+
 /*
  * parent will make it look like the child spewed a fatal error and died
  * this is needed to prevent changes to t0061.
@@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 		error_errno("exec '%s': cd to '%s' failed",
 			    cmd->argv[0], cmd->dir);
 		break;
+	case CHILD_ERR_DUP2:
+		error_errno("dup2() in child failed");
+		break;
+	case CHILD_ERR_CLOSE:
+		error_errno("close() in child failed");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -527,35 +553,35 @@ int start_command(struct child_process *cmd)
 		child_notifier = notify_pipe[1];
 
 		if (cmd->no_stdin)
-			dup2(null_fd, 0);
+			child_dup2(null_fd, 0);
 		else if (need_in) {
-			dup2(fdin[0], 0);
-			close_pair(fdin);
+			child_dup2(fdin[0], 0);
+			child_close_pair(fdin);
 		} else if (cmd->in) {
-			dup2(cmd->in, 0);
-			close(cmd->in);
+			child_dup2(cmd->in, 0);
+			child_close(cmd->in);
 		}
 
 		if (cmd->no_stderr)
-			dup2(null_fd, 2);
+			child_dup2(null_fd, 2);
 		else if (need_err) {
-			dup2(fderr[1], 2);
-			close_pair(fderr);
+			child_dup2(fderr[1], 2);
+			child_close_pair(fderr);
 		} else if (cmd->err > 1) {
-			dup2(cmd->err, 2);
-			close(cmd->err);
+			child_dup2(cmd->err, 2);
+			child_close(cmd->err);
 		}
 
 		if (cmd->no_stdout)
-			dup2(null_fd, 1);
+			child_dup2(null_fd, 1);
 		else if (cmd->stdout_to_stderr)
-			dup2(2, 1);
+			child_dup2(2, 1);
 		else if (need_out) {
-			dup2(fdout[1], 1);
-			close_pair(fdout);
+			child_dup2(fdout[1], 1);
+			child_close_pair(fdout);
 		} else if (cmd->out > 1) {
-			dup2(cmd->out, 1);
-			close(cmd->out);
+			child_dup2(cmd->out, 1);
+			child_close(cmd->out);
 		}
 
 		if (cmd->dir && chdir(cmd->dir))
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 10/11] run-command: add note about forking and threading
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (8 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 09/11] run-command: handle dup2 and close errors " Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-19 23:13           ` [PATCH v6 11/11] run-command: block signals between fork and execve Brandon Williams
  2017-04-24 22:37           ` [PATCH v6 00/11] forking and threading Brandon Williams
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, j6t, sbeller, e, jrnieder

All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index 615b6e9c9..df1edd963 100644
--- a/run-command.c
+++ b/run-command.c
@@ -537,6 +537,15 @@ int start_command(struct child_process *cmd)
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
 
+	/*
+	 * NOTE: In order to prevent deadlocking when using threads special
+	 * care should be taken with the function calls made in between the
+	 * fork() and exec() calls.  No calls should be made to functions which
+	 * require acquiring a lock (e.g. malloc) as the lock could have been
+	 * held by another thread at the time of forking, causing the lock to
+	 * never be released in the child process.  This means only
+	 * Async-Signal-Safe functions are permitted in the child.
+	 */
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 11/11] run-command: block signals between fork and execve
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (9 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 10/11] run-command: add note about forking and threading Brandon Williams
@ 2017-04-19 23:13           ` Brandon Williams
  2017-04-24 22:37           ` [PATCH v6 00/11] forking and threading Brandon Williams
  11 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-19 23:13 UTC (permalink / raw)
  To: git; +Cc: Eric Wong, j6t, sbeller, jrnieder, Brandon Williams

From: Eric Wong <e@80x24.org>

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/run-command.c b/run-command.c
index df1edd963..a97d7bf9f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -215,6 +215,7 @@ enum child_errcode {
 	CHILD_ERR_CHDIR,
 	CHILD_ERR_DUP2,
 	CHILD_ERR_CLOSE,
+	CHILD_ERR_SIGPROCMASK,
 	CHILD_ERR_ENOENT,
 	CHILD_ERR_SILENT,
 	CHILD_ERR_ERRNO
@@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 	case CHILD_ERR_CLOSE:
 		error_errno("close() in child failed");
 		break;
+	case CHILD_ERR_SIGPROCMASK:
+		error_errno("sigprocmask failed restoring signals");
+		break;
 	case CHILD_ERR_ENOENT:
 		error_errno("cannot run %s", cmd->argv[0]);
 		break;
@@ -398,7 +402,54 @@ static char **prep_childenv(const char *const *deltaenv)
 	strbuf_release(&key);
 	return childenv;
 }
+
+struct atfork_state {
+#ifndef NO_PTHREADS
+	int cs;
 #endif
+	sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+	if (err) {
+		errno = err;
+		die_errno("BUG: %s", msg);
+	}
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+	sigset_t all;
+
+	if (sigfillset(&all))
+		die_errno("sigfillset");
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &all, &as->old))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+		"blocking all signals");
+	bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+		"disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+	if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+		die_errno("sigprocmask");
+#else
+	bug_die(pthread_setcancelstate(as->cs, NULL),
+		"re-enabling cancellation");
+	bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+		"restoring signal mask");
+#endif
+}
+#endif /* GIT_WINDOWS_NATIVE */
 
 static inline void set_cloexec(int fd)
 {
@@ -523,6 +574,7 @@ int start_command(struct child_process *cmd)
 	char **childenv;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct child_err cerr;
+	struct atfork_state as;
 
 	if (pipe(notify_pipe))
 		notify_pipe[0] = notify_pipe[1] = -1;
@@ -536,6 +588,7 @@ int start_command(struct child_process *cmd)
 
 	prepare_cmd(&argv, cmd);
 	childenv = prep_childenv(cmd->env);
+	atfork_prepare(&as);
 
 	/*
 	 * NOTE: In order to prevent deadlocking when using threads special
@@ -549,6 +602,7 @@ int start_command(struct child_process *cmd)
 	cmd->pid = fork();
 	failed_errno = errno;
 	if (!cmd->pid) {
+		int sig;
 		/*
 		 * Ensure the default die/error/warn routines do not get
 		 * called, they can take stdio locks and malloc.
@@ -597,6 +651,19 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_CHDIR);
 
 		/*
+		 * restore default signal handlers here, in case
+		 * we catch a signal right before execve below
+		 */
+		for (sig = 1; sig < NSIG; sig++) {
+			/* ignored signals get reset to SIG_DFL on execve */
+			if (signal(sig, SIG_DFL) == SIG_IGN)
+				signal(sig, SIG_IGN);
+		}
+
+		if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+			child_die(CHILD_ERR_SIGPROCMASK);
+
+		/*
 		 * Attempt to exec using the command and arguments starting at
 		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
 		 * be used in the event exec failed with ENOEXEC at which point
@@ -616,6 +683,7 @@ int start_command(struct child_process *cmd)
 			child_die(CHILD_ERR_ERRNO);
 		}
 	}
+	atfork_parent(&as);
 	if (cmd->pid < 0)
 		error_errno("cannot fork() for %s", cmd->argv[0]);
 	else if (cmd->clean_on_exit)
-- 
2.12.2.816.g2cccc81164-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19 15:56               ` Brandon Williams
  2017-04-19 18:18                 ` Johannes Sixt
@ 2017-04-20 10:47                 ` Johannes Schindelin
  2017-04-20 17:02                   ` Brandon Williams
  1 sibling, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2017-04-20 10:47 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Johannes Sixt, git, e, jrnieder

Hi Brandon,

On Wed, 19 Apr 2017, Brandon Williams wrote:

> On 04/19, Johannes Sixt wrote:
> > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > >>Add a test to 't0061-run-command.sh' to ensure that run_command can
> > >>continue to execute scripts which don't include a '#!' line.
> > >
> > >Why is this necessary? I am pretty certain that our emulation layer
> > >on Windows can only run scripts with a shbang line.
> 
> Out of curiosity how did you have t5550 passing on windows then?

This is the reason:

	1..0 # SKIP no web server found at '/usr/sbin/apache2'

As predicted by Hannes, your new test fails miserably on Windows:

	https://travis-ci.org/git/git/jobs/223830474#L2656-L2674

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 02/11] t0061: run_command executes scripts without a #! line
  2017-04-19 23:13           ` [PATCH v6 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
@ 2017-04-20 10:49             ` Johannes Schindelin
  2017-04-20 16:58               ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2017-04-20 10:49 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, j6t, sbeller, e, jrnieder

Hi Brandon,

On Wed, 19 Apr 2017, Brandon Williams wrote:

> Add a test to 't0061-run-command.sh' to ensure that run_command can
> continue to execute scripts which don't include a '#!' line.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Please add something like this to the commit message lest future readers
wonder where that !MINGW comes from:

	On Windows, shell scripts are not natively executable. Git has a
	workaround to execute them, looking for the shebang line. Shell
	scripts without a shebang line will simply not execute on Windows.
	Therefore, disable the new test on Windows.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 02/11] t0061: run_command executes scripts without a #! line
  2017-04-20 10:49             ` Johannes Schindelin
@ 2017-04-20 16:58               ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-20 16:58 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, j6t, sbeller, e, jrnieder

On 04/20, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Wed, 19 Apr 2017, Brandon Williams wrote:
> 
> > Add a test to 't0061-run-command.sh' to ensure that run_command can
> > continue to execute scripts which don't include a '#!' line.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> Please add something like this to the commit message lest future readers
> wonder where that !MINGW comes from:
> 
> 	On Windows, shell scripts are not natively executable. Git has a
> 	workaround to execute them, looking for the shebang line. Shell
> 	scripts without a shebang line will simply not execute on Windows.
> 	Therefore, disable the new test on Windows.

I'm fine with including this in the commit message.  If another reroll
needs to happen then I can make the update otherwise hopefully Junio can
just squash that in.

> 
> Thanks,
> Dscho

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-20 10:47                 ` Johannes Schindelin
@ 2017-04-20 17:02                   ` Brandon Williams
  2017-04-20 20:24                     ` Johannes Schindelin
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-20 17:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Sixt, git, e, jrnieder

On 04/20, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Wed, 19 Apr 2017, Brandon Williams wrote:
> 
> > On 04/19, Johannes Sixt wrote:
> > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > > >>Add a test to 't0061-run-command.sh' to ensure that run_command can
> > > >>continue to execute scripts which don't include a '#!' line.
> > > >
> > > >Why is this necessary? I am pretty certain that our emulation layer
> > > >on Windows can only run scripts with a shbang line.
> > 
> > Out of curiosity how did you have t5550 passing on windows then?
> 
> This is the reason:
> 
> 	1..0 # SKIP no web server found at '/usr/sbin/apache2'

Hmm, that's interesting.  So do any of the http tests get run on windows
then?  I wonder if that lack of coverage could be an issue at some
point in the future.

> As predicted by Hannes, your new test fails miserably on Windows:

Isn't 'miserably' just a bit harsh ;P haha

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-20 17:02                   ` Brandon Williams
@ 2017-04-20 20:24                     ` Johannes Schindelin
  2017-04-20 20:49                       ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2017-04-20 20:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Johannes Sixt, git, e, jrnieder

Hi Brandon,

On Thu, 20 Apr 2017, Brandon Williams wrote:

> On 04/20, Johannes Schindelin wrote:
> > 
> > On Wed, 19 Apr 2017, Brandon Williams wrote:
> > 
> > > On 04/19, Johannes Sixt wrote:
> > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command
> > > > >>can continue to execute scripts which don't include a '#!' line.
> > > > >
> > > > >Why is this necessary? I am pretty certain that our emulation
> > > > >layer on Windows can only run scripts with a shbang line.
> > > 
> > > Out of curiosity how did you have t5550 passing on windows then?
> > 
> > This is the reason:
> > 
> > 	1..0 # SKIP no web server found at '/usr/sbin/apache2'
> 
> Hmm, that's interesting.  So do any of the http tests get run on windows
> then?  I wonder if that lack of coverage could be an issue at some
> point in the future.

Possibly. I'll put it at the bottom of my TODO list ;-)

> > As predicted by Hannes, your new test fails miserably on Windows:
> 
> Isn't 'miserably' just a bit harsh ;P haha

Ah, I tried to be funny. So much for my future career as a comedian.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
  2017-04-20 20:24                     ` Johannes Schindelin
@ 2017-04-20 20:49                       ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-20 20:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Sixt, git, e, jrnieder

On 04/20, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Thu, 20 Apr 2017, Brandon Williams wrote:
> 
> > On 04/20, Johannes Schindelin wrote:
> > > 
> > > On Wed, 19 Apr 2017, Brandon Williams wrote:
> > > 
> > > > On 04/19, Johannes Sixt wrote:
> > > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command
> > > > > >>can continue to execute scripts which don't include a '#!' line.
> > > > > >
> > > > > >Why is this necessary? I am pretty certain that our emulation
> > > > > >layer on Windows can only run scripts with a shbang line.
> > > > 
> > > > Out of curiosity how did you have t5550 passing on windows then?
> > > 
> > > This is the reason:
> > > 
> > > 	1..0 # SKIP no web server found at '/usr/sbin/apache2'
> > 
> > Hmm, that's interesting.  So do any of the http tests get run on windows
> > then?  I wonder if that lack of coverage could be an issue at some
> > point in the future.
> 
> Possibly. I'll put it at the bottom of my TODO list ;-)
> 
> > > As predicted by Hannes, your new test fails miserably on Windows:
> > 
> > Isn't 'miserably' just a bit harsh ;P haha
> 
> Ah, I tried to be funny. So much for my future career as a comedian.

Haha yeah I figured, no worries.  I was just throwing it back at you :D

> 
> Ciao,
> Dscho

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 00/11] forking and threading
  2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
                             ` (10 preceding siblings ...)
  2017-04-19 23:13           ` [PATCH v6 11/11] run-command: block signals between fork and execve Brandon Williams
@ 2017-04-24 22:37           ` Brandon Williams
  2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
  11 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-24 22:37 UTC (permalink / raw)
  To: git; +Cc: j6t, sbeller, e, jrnieder

On 04/19, Brandon Williams wrote:
> Changes in v6:
> * fix some windows compat issues
> * better comment on the string_list_remove function (also marked extern)
> 
> Brandon Williams (10):
>   t5550: use write_script to generate post-update hook
>   t0061: run_command executes scripts without a #! line
>   run-command: prepare command before forking
>   run-command: use the async-signal-safe execv instead of execvp
>   string-list: add string_list_remove function
>   run-command: prepare child environment before forking
>   run-command: don't die in child when duping /dev/null
>   run-command: eliminate calls to error handling functions in child
>   run-command: handle dup2 and close errors in child
>   run-command: add note about forking and threading
> 
> Eric Wong (1):
>   run-command: block signals between fork and execve

Just as an FYI there's a bug with this code where it'll try to execute a
directory.  I'm adding a test and fixing it.  Since this topic is in
next I'll base the patch on top of this series.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-24 22:37           ` [PATCH v6 00/11] forking and threading Brandon Williams
@ 2017-04-24 23:50             ` Brandon Williams
  2017-04-25  0:17               ` Jonathan Nieder
                                 ` (3 more replies)
  0 siblings, 4 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-24 23:50 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder

In some situations run-command will incorrectly try (and fail) to
execute a directory instead of an executable.  For example:

Lets suppose a user has PATH=~/bin (where 'bin' is a directory) and they
happen to have another directory inside 'bin' named 'git-remote-blah'.
Then git tries to execute the directory:

	$ git ls-remote blah://blah
	fatal: cannot exec 'git-remote-blah': Permission denied

This is due to only checking 'access()' when locating an executable in
PATH, which doesn't distinguish between files and directories.  Instead
use 'stat()' and check that the path is to a regular file.  Now
run-command won't try to execute the directory 'git-remote-blah':

	$ git ls-remote blah://blah
	fatal: Unable to find remote helper for 'blah'

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c          | 3 ++-
 t/t0061-run-command.sh | 7 +++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index a97d7bf9f..ece0bf342 100644
--- a/run-command.c
+++ b/run-command.c
@@ -127,6 +127,7 @@ static char *locate_in_PATH(const char *file)
 
 	while (1) {
 		const char *end = strchrnul(p, ':');
+		struct stat st;
 
 		strbuf_reset(&buf);
 
@@ -137,7 +138,7 @@ static char *locate_in_PATH(const char *file)
 		}
 		strbuf_addstr(&buf, file);
 
-		if (!access(buf.buf, F_OK))
+		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
 			return strbuf_detach(&buf, NULL);
 
 		if (!*end)
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 98c09dd98..30c4ad75f 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -37,6 +37,13 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command should not try to execute a directory' '
+	test_when_finished "rm -rf bin/blah" &&
+	mkdir -p bin/blah &&
+	PATH=bin:$PATH test_must_fail test-run-command run-command blah 2>err &&
+	test_i18ngrep "No such file or directory" err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
@ 2017-04-25  0:17               ` Jonathan Nieder
  2017-04-25  1:58                 ` Junio C Hamano
  2017-04-25  2:56                 ` Jeff King
  2017-04-25  1:47               ` Junio C Hamano
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25  0:17 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, peff

Brandon Williams wrote:

> In some situations run-command will incorrectly try (and fail) to
> execute a directory instead of an executable.  For example:
>
> Lets suppose a user has PATH=~/bin (where 'bin' is a directory) and they
> happen to have another directory inside 'bin' named 'git-remote-blah'.
> Then git tries to execute the directory:
>
> 	$ git ls-remote blah://blah
> 	fatal: cannot exec 'git-remote-blah': Permission denied
>
> This is due to only checking 'access()' when locating an executable in
> PATH, which doesn't distinguish between files and directories.  Instead
> use 'stat()' and check that the path is to a regular file.  Now
> run-command won't try to execute the directory 'git-remote-blah':
>
> 	$ git ls-remote blah://blah
> 	fatal: Unable to find remote helper for 'blah'
>
> Signed-off-by: Brandon Williams <bmwill@google.com>

For the interested, the context in which this was reported was trying
to execute a directory named 'ssh'.  Thanks for a quick fix.

Technically this bug has existed since

	commit 38f865c27d1f2560afb48efd2b7b105c1278c4b5
	Author: Jeff King <peff@peff.net>
	Date:   Fri Mar 30 03:52:18 2012 -0400

	   run-command: treat inaccessible directories as ENOENT

Until we switched from using execvp to execve, the symptom was very
subtle: it only affected the error message when a program could not be
found, instead of affecting functionality more substantially.

[...]
> --- a/run-command.c
> +++ b/run-command.c
> @@ -127,6 +127,7 @@ static char *locate_in_PATH(const char *file)
>  
>  	while (1) {
>  		const char *end = strchrnul(p, ':');
> +		struct stat st;
>  
>  		strbuf_reset(&buf);
>  
> @@ -137,7 +138,7 @@ static char *locate_in_PATH(const char *file)
>  		}
>  		strbuf_addstr(&buf, file);
>  
> -		if (!access(buf.buf, F_OK))
> +		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
>  			return strbuf_detach(&buf, NULL);

Should this share code with help.c's is_executable()?

I suppose not, since that would have trouble finding scripts without
the executable bit set.

I was momentarily nervous about what happens if this gets run on
Windows. This is just looking for a file's existence, not
executability, so it should be fine.

> --- a/t/t0061-run-command.sh
> +++ b/t/t0061-run-command.sh
> @@ -37,6 +37,13 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> +test_expect_success 'run_command should not try to execute a directory' '
> +	test_when_finished "rm -rf bin/blah" &&
> +	mkdir -p bin/blah &&
> +	PATH=bin:$PATH test_must_fail test-run-command run-command blah 2>err &&

Two nits:

- this environment variable setting leaks past the test_must_fail
  invocation in some shells.  When running external comments, they
  update the environment after forking, but when running shell
  functions, they update the environment first and never set it back.

  A search with "git grep -e '=.* test_must_fail'" finds no other
  instances of this pattern, so apparently we've done a good job of
  being careful about that. *surprised*  t/check-non-portable-shell.pl
  doesn't check for this.  Perhaps it should.

  Standard workarounds:

  	(
		PATH=... &&
		export PATH &&
		test_must_fail ...
	)

  or

	test_must_fail env PATH=... ...

- using a relative path (other than '.') in $PATH feels unusual.  We
  can mimic a typical user setup more closely by using "$PWD/bin". 

> +	test_i18ngrep "No such file or directory" err

This string comes from libc.  Is there some other way to test for
what this patch does?

E.g. how about something like the following?

Thanks,
Jonathan

diff --git i/t/t0061-run-command.sh w/t/t0061-run-command.sh
index 30c4ad75ff..68cd0a8072 100755
--- i/t/t0061-run-command.sh
+++ w/t/t0061-run-command.sh
@@ -38,10 +38,16 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 '
 
 test_expect_success 'run_command should not try to execute a directory' '
-	test_when_finished "rm -rf bin/blah" &&
-	mkdir -p bin/blah &&
-	PATH=bin:$PATH test_must_fail test-run-command run-command blah 2>err &&
-	test_i18ngrep "No such file or directory" err
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1/blah &&
+	mkdir bin2 &&
+	cat hello-script >bin2/blah &&
+	chmod +x bin2/blah &&
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+	test-run-command run-command blah >actual 2>err &&
+
+	test_cmp hello-script actual &&
+	test_cmp empty err
 '
 
 test_expect_success POSIXPERM 'run_command reports EACCES' '

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
  2017-04-25  0:17               ` Jonathan Nieder
@ 2017-04-25  1:47               ` Junio C Hamano
  2017-04-25  2:57               ` Jonathan Nieder
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
  3 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2017-04-25  1:47 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, j6t, sbeller, e, jrnieder

Brandon Williams <bmwill@google.com> writes:

> This is due to only checking 'access()' when locating an executable in
> PATH, which doesn't distinguish between files and directories.  Instead
> use 'stat()' and check that the path is to a regular file.  Now
> run-command won't try to execute the directory 'git-remote-blah':
>
> 	$ git ls-remote blah://blah
> 	fatal: Unable to find remote helper for 'blah'

The above is not a very interesting example.  

More important is that $PATH may have a directory with
git-remote-blah directory (your setup above) and then another
directory with the git-remote-blah executable that the user wanted
to use.  Without this change, we won't get to the real one, and that
makes this change truly valuable.

The added test demostrates the "uninteresting" behaviour.  Even
though it is correct and technically sufficient, it would make it
more relevant to do something like this:

	mkdir -p bin/blah bin2 &&
	write_script bin2/blah <<-\EOF &&
	echo We found blah in bin2
	EOF
	PATH=bin:$PATH test_must_fail ... what you have
	...
	PATH=bin:bin2:$PATH test-run-command run-command blah >actual &&
	bin2/blah >expect &&
	test_cmp expect actual

as the point of locate_in_PATH() is to successfully find one,
without getting confused by an earlier unusable one.

Thanks.

> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  run-command.c          | 3 ++-
>  t/t0061-run-command.sh | 7 +++++++
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/run-command.c b/run-command.c
> index a97d7bf9f..ece0bf342 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -127,6 +127,7 @@ static char *locate_in_PATH(const char *file)
>  
>  	while (1) {
>  		const char *end = strchrnul(p, ':');
> +		struct stat st;
>  
>  		strbuf_reset(&buf);
>  
> @@ -137,7 +138,7 @@ static char *locate_in_PATH(const char *file)
>  		}
>  		strbuf_addstr(&buf, file);
>  
> -		if (!access(buf.buf, F_OK))
> +		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
>  			return strbuf_detach(&buf, NULL);
>  
>  		if (!*end)
> diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
> index 98c09dd98..30c4ad75f 100755
> --- a/t/t0061-run-command.sh
> +++ b/t/t0061-run-command.sh
> @@ -37,6 +37,13 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> +test_expect_success 'run_command should not try to execute a directory' '
> +	test_when_finished "rm -rf bin/blah" &&
> +	mkdir -p bin/blah &&
> +	PATH=bin:$PATH test_must_fail test-run-command run-command blah 2>err &&
> +	test_i18ngrep "No such file or directory" err
> +'
> +
>  test_expect_success POSIXPERM 'run_command reports EACCES' '
>  	cat hello-script >hello.sh &&
>  	chmod -x hello.sh &&

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-25  0:17               ` Jonathan Nieder
@ 2017-04-25  1:58                 ` Junio C Hamano
  2017-04-25  2:51                   ` Jonathan Nieder
  2017-04-25  2:56                 ` Jeff King
  1 sibling, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2017-04-25  1:58 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, j6t, sbeller, e, peff

Jonathan Nieder <jrnieder@gmail.com> writes:

> Until we switched from using execvp to execve, the symptom was very
> subtle: it only affected the error message when a program could not be
> found, instead of affecting functionality more substantially.

Hmph, what if you had bin/ssh/ directory and bin2/ssh executable and
had bin:bin2 listed in this order in your $PATH?  Without this change
you'll get an error and that's the end of it.  With this change,
you'd be able to execute bin2/ssh executable, no?  So I am not sure
if I agree with the "this is just an error message subtlety".

What does execvp() do when bin/ssh/ directory, bin2/ssh
non-executable regular file, and bin3/ssh executable file exist and
you have bin:bin2:bin3 on your $PATH?  That is what locate_in_PATH()
should emulate, I would think.

>> +		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
>>  			return strbuf_detach(&buf, NULL);
>
> Should this share code with help.c's is_executable()?
>
> I suppose not, since that would have trouble finding scripts without
> the executable bit set.
>
> I was momentarily nervous about what happens if this gets run on
> Windows. This is just looking for a file's existence, not
> executability, so it should be fine.

When we are looking for "ssh" with locate_in_PATH(), shouldn't we
look for "ssh.exe" on Windows, though?

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-25  1:58                 ` Junio C Hamano
@ 2017-04-25  2:51                   ` Jonathan Nieder
  0 siblings, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25  2:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, j6t, sbeller, e, peff

Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>> Until we switched from using execvp to execve, the symptom was very
>> subtle: it only affected the error message when a program could not be
>> found, instead of affecting functionality more substantially.
>
> Hmph, what if you had bin/ssh/ directory and bin2/ssh executable and
> had bin:bin2 listed in this order in your $PATH?  Without this change
> you'll get an error and that's the end of it.  With this change,
> you'd be able to execute bin2/ssh executable, no?  So I am not sure
> if I agree with the "this is just an error message subtlety".

I think you misunderstood what I meant.  execvp() does not have this
bug.  In current master, run_command() (within function sane_execvp())
double-checks execvp()'s work when it sees EACCES to decide whether
to convert it into a more user-friendly ENOENT.  Because of this bug,
if you have a bin/ssh/ directory and no bin2/ssh executable, instead
of reporting this condition as a user-friendly ENOENT, it would leave
it as EACCES.

> What does execvp() do when bin/ssh/ directory, bin2/ssh
> non-executable regular file, and bin3/ssh executable file exist and
> you have bin:bin2:bin3 on your $PATH?  That is what locate_in_PATH()
> should emulate, I would think.

Good catch.

 $ mkdir -p $HOME/bin1/greet
 $ mkdir $HOME/bin2
 $ printf '%s\n' 'echo bin2' >$HOME/bin2/greet
 $ mkdir $HOME/bin3
 $ printf '%s\n' '#!/bin/sh' 'echo bin3' >$HOME/bin3/greet
 $ chmod +x $HOME/bin3/greet
 $ PATH=$HOME/bin1:$HOME/bin2:$HOME/bin3:$PATH perl -e 'exec("greet")'
 bin3

It needs to skip over non-executable files.

I think this means we'd want to reuse something like is_executable
from help.c.

[...]
>>> +		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
>>>  			return strbuf_detach(&buf, NULL);
>>
>> Should this share code with help.c's is_executable()?
>>
>> I suppose not, since that would have trouble finding scripts without
>> the executable bit set.

I confused myself about the script special-case: they are supposed to
have the executable bit set, too --- the special-casing is just about
lacking #!/bin/sh at the top (and hence not being directly executable
with execve).

>> I was momentarily nervous about what happens if this gets run on
>> Windows. This is just looking for a file's existence, not
>> executability, so it should be fine.
>
> When we are looking for "ssh" with locate_in_PATH(), shouldn't we
> look for "ssh.exe" on Windows, though?

Fortunately this is in a #if !defined(GIT_WINDOWS_NATIVE) block.  It's
probably worth adding a comment so people know not to rely on it
matching Windows path search behavior.

Thanks for looking it over,
Jonathan

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-25  0:17               ` Jonathan Nieder
  2017-04-25  1:58                 ` Junio C Hamano
@ 2017-04-25  2:56                 ` Jeff King
  1 sibling, 0 replies; 140+ messages in thread
From: Jeff King @ 2017-04-25  2:56 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, gitster, j6t, sbeller, e

On Mon, Apr 24, 2017 at 05:17:24PM -0700, Jonathan Nieder wrote:

> > This is due to only checking 'access()' when locating an executable in
> > PATH, which doesn't distinguish between files and directories.  Instead
> > use 'stat()' and check that the path is to a regular file.  Now
> > run-command won't try to execute the directory 'git-remote-blah':
> >
> > 	$ git ls-remote blah://blah
> > 	fatal: Unable to find remote helper for 'blah'
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> For the interested, the context in which this was reported was trying
> to execute a directory named 'ssh'.  Thanks for a quick fix.
> 
> Technically this bug has existed since
> 
> 	commit 38f865c27d1f2560afb48efd2b7b105c1278c4b5
> 	Author: Jeff King <peff@peff.net>
> 	Date:   Fri Mar 30 03:52:18 2012 -0400
> 
> 	   run-command: treat inaccessible directories as ENOENT
> 
> Until we switched from using execvp to execve, the symptom was very
> subtle: it only affected the error message when a program could not be
> found, instead of affecting functionality more substantially.

Yeah, I'm pretty sure I didn't think at all about access() matching
directories when doing that commit. Using stat() does seem like the
right solution.

-Peff

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 12/11] run-command: don't try to execute directories
  2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
  2017-04-25  0:17               ` Jonathan Nieder
  2017-04-25  1:47               ` Junio C Hamano
@ 2017-04-25  2:57               ` Jonathan Nieder
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
  3 siblings, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25  2:57 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, Brian Hatfield, Jeff King

(cc-ing the reporter)
Brandon Williams wrote:

> In some situations run-command will incorrectly try (and fail) to
> execute a directory instead of an executable.  For example:
>
> Lets suppose a user has PATH=~/bin (where 'bin' is a directory) and they
> happen to have another directory inside 'bin' named 'git-remote-blah'.
> Then git tries to execute the directory:
>
> 	$ git ls-remote blah://blah
> 	fatal: cannot exec 'git-remote-blah': Permission denied
>
> This is due to only checking 'access()' when locating an executable in
> PATH, which doesn't distinguish between files and directories.  Instead
> use 'stat()' and check that the path is to a regular file.  Now
> run-command won't try to execute the directory 'git-remote-blah':
>
> 	$ git ls-remote blah://blah
> 	fatal: Unable to find remote helper for 'blah'

Reported-by: Brian Hatfield <bhatfield@google.com>

This was observed by having a directory called "ssh" in $PATH before
the real ssh and trying to use ssh protoccol.  Thanks for catching it.

> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  run-command.c          | 3 ++-
>  t/t0061-run-command.sh | 7 +++++++
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/run-command.c b/run-command.c
> index a97d7bf9f..ece0bf342 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -127,6 +127,7 @@ static char *locate_in_PATH(const char *file)
>  
>  	while (1) {
>  		const char *end = strchrnul(p, ':');
> +		struct stat st;
>  
>  		strbuf_reset(&buf);
>  
> @@ -137,7 +138,7 @@ static char *locate_in_PATH(const char *file)
>  		}
>  		strbuf_addstr(&buf, file);
>  
> -		if (!access(buf.buf, F_OK))
> +		if (!stat(buf.buf, &st) && S_ISREG(st.st_mode))
>  			return strbuf_detach(&buf, NULL);
>  
>  		if (!*end)
> diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
> index 98c09dd98..30c4ad75f 100755
> --- a/t/t0061-run-command.sh
> +++ b/t/t0061-run-command.sh
> @@ -37,6 +37,13 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> +test_expect_success 'run_command should not try to execute a directory' '
> +	test_when_finished "rm -rf bin/blah" &&
> +	mkdir -p bin/blah &&
> +	PATH=bin:$PATH test_must_fail test-run-command run-command blah 2>err &&
> +	test_i18ngrep "No such file or directory" err
> +'
> +
>  test_expect_success POSIXPERM 'run_command reports EACCES' '
>  	cat hello-script >hello.sh &&
>  	chmod -x hello.sh &&

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v7 1/2] exec_cmd: expose is_executable function
  2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
                                 ` (2 preceding siblings ...)
  2017-04-25  2:57               ` Jonathan Nieder
@ 2017-04-25 17:54               ` Brandon Williams
  2017-04-25 17:54                 ` [PATCH v7 2/2] run-command: don't try to execute directories Brandon Williams
                                   ` (3 more replies)
  3 siblings, 4 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 17:54 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder, peff

Move the logic for 'is_executable()' from help.c to exec_cmd.c and
expose it so that callers from outside help.c can access the function.
This is to enable run-command to be able to query if a file is
executable in a future patch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 exec_cmd.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 exec_cmd.h |  1 +
 help.c     | 42 ------------------------------------------
 3 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index fb94aeba9..6d9481e26 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -149,3 +149,45 @@ int execl_git_cmd(const char *cmd,...)
 	argv[argc] = NULL;
 	return execv_git_cmd(argv);
 }
+
+int is_executable(const char *name)
+{
+	struct stat st;
+
+	if (stat(name, &st) || /* stat, not lstat */
+	    !S_ISREG(st.st_mode))
+		return 0;
+
+#if defined(GIT_WINDOWS_NATIVE)
+	/*
+	 * On Windows there is no executable bit. The file extension
+	 * indicates whether it can be run as an executable, and Git
+	 * has special-handling to detect scripts and launch them
+	 * through the indicated script interpreter. We test for the
+	 * file extension first because virus scanners may make
+	 * it quite expensive to open many files.
+	 */
+	if (ends_with(name, ".exe"))
+		return S_IXUSR;
+
+{
+	/*
+	 * Now that we know it does not have an executable extension,
+	 * peek into the file instead.
+	 */
+	char buf[3] = { 0 };
+	int n;
+	int fd = open(name, O_RDONLY);
+	st.st_mode &= ~S_IXUSR;
+	if (fd >= 0) {
+		n = read(fd, buf, 2);
+		if (n == 2)
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
+				st.st_mode |= S_IXUSR;
+		close(fd);
+	}
+}
+#endif
+	return st.st_mode & S_IXUSR;
+}
diff --git a/exec_cmd.h b/exec_cmd.h
index ff0b48048..48dd18a0d 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -12,5 +12,6 @@ extern int execv_git_cmd(const char **argv); /* NULL terminated */
 LAST_ARG_MUST_BE_NULL
 extern int execl_git_cmd(const char *cmd, ...);
 extern char *system_path(const char *path);
+extern int is_executable(const char *name);
 
 #endif /* GIT_EXEC_CMD_H */
diff --git a/help.c b/help.c
index bc6cd19cf..50f84b430 100644
--- a/help.c
+++ b/help.c
@@ -96,48 +96,6 @@ static void pretty_print_cmdnames(struct cmdnames *cmds, unsigned int colopts)
 	string_list_clear(&list, 0);
 }
 
-static int is_executable(const char *name)
-{
-	struct stat st;
-
-	if (stat(name, &st) || /* stat, not lstat */
-	    !S_ISREG(st.st_mode))
-		return 0;
-
-#if defined(GIT_WINDOWS_NATIVE)
-	/*
-	 * On Windows there is no executable bit. The file extension
-	 * indicates whether it can be run as an executable, and Git
-	 * has special-handling to detect scripts and launch them
-	 * through the indicated script interpreter. We test for the
-	 * file extension first because virus scanners may make
-	 * it quite expensive to open many files.
-	 */
-	if (ends_with(name, ".exe"))
-		return S_IXUSR;
-
-{
-	/*
-	 * Now that we know it does not have an executable extension,
-	 * peek into the file instead.
-	 */
-	char buf[3] = { 0 };
-	int n;
-	int fd = open(name, O_RDONLY);
-	st.st_mode &= ~S_IXUSR;
-	if (fd >= 0) {
-		n = read(fd, buf, 2);
-		if (n == 2)
-			/* look for a she-bang */
-			if (!strcmp(buf, "#!"))
-				st.st_mode |= S_IXUSR;
-		close(fd);
-	}
-}
-#endif
-	return st.st_mode & S_IXUSR;
-}
-
 static void list_commands_in_dir(struct cmdnames *cmds,
 					 const char *path,
 					 const char *prefix)
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v7 2/2] run-command: don't try to execute directories
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
@ 2017-04-25 17:54                 ` Brandon Williams
  2017-04-25 18:51                   ` Jonathan Nieder
  2017-04-25 18:04                 ` [PATCH v7 1/2] exec_cmd: expose is_executable function Jonathan Nieder
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 17:54 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder, peff

In some situations run-command will incorrectly try (and fail) to
execute a directory instead of an executable file.  This was observed by
having a directory called "ssh" in $PATH before the real ssh and trying
to use ssh protoccol, reslting in the following:

	$ git ls-remote ssh://url
	fatal: cannot exec 'ssh': Permission denied

It ends up being worse and run-command will even try to execute a
non-executable file if it preceeds the executable version of a file on
the PATH.  For example, if PATH=~/bin1:~/bin2:~/bin3 and there exists a
directory 'git-hello' in 'bin1', a non-executable file 'git-hello' in
bin2 and an executable file 'git-hello' (which prints "Hello World!") in
bin3 the following will occur:

	$ git hello
	fatal: cannot exec 'git-hello': Permission denied

This is due to only checking 'access()' when locating an executable in
PATH, which doesn't distinguish between files and directories.  Instead
use 'is_executable()' which check that the path is to a regular,
executable file.  Now run-command won't try to execute the directory or
non-executable file 'git-hello':

	$ git hello
	Hello World!

Reported-by: Brian Hatfield <bhatfield@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c          |  2 +-
 t/t0061-run-command.sh | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index a97d7bf9f..ec08e0951 100644
--- a/run-command.c
+++ b/run-command.c
@@ -137,7 +137,7 @@ static char *locate_in_PATH(const char *file)
 		}
 		strbuf_addstr(&buf, file);
 
-		if (!access(buf.buf, F_OK))
+		if (is_executable(buf.buf))
 			return strbuf_detach(&buf, NULL);
 
 		if (!*end)
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 98c09dd98..fd5e43766 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -37,6 +37,29 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command should not try to execute a directory' '
+	test_when_finished "rm -rf bin1 bin2 bin3" &&
+	mkdir -p bin1/greet bin2 bin3 &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
+	EOF
+	chmod -x bin2/greet &&
+	write_script bin3/greet <<-\EOF &&
+	cat bin3/greet
+	EOF
+
+	# Test that run-command does not try to execute the "greet" directory in
+	# "bin1", or the non-executable file "greet" in "bin2", but rather
+	# correcty executes the "greet" script located in bin3.
+	(
+		PATH=$PWD/bin1:$PWD/bin2:$PWD/bin3:$PATH &&
+		export PATH &&
+		test-run-command run-command greet >actual 2>err
+	) &&
+	test_cmp bin3/greet actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v7 1/2] exec_cmd: expose is_executable function
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
  2017-04-25 17:54                 ` [PATCH v7 2/2] run-command: don't try to execute directories Brandon Williams
@ 2017-04-25 18:04                 ` Jonathan Nieder
  2017-04-25 18:18                 ` Johannes Sixt
  2017-04-25 23:46                 ` [PATCH v8 1/2] run-command: " Brandon Williams
  3 siblings, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25 18:04 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, peff

Hi,

Brandon Williams wrote:

> Move the logic for 'is_executable()' from help.c to exec_cmd.c and
> expose it so that callers from outside help.c can access the function.
> This is to enable run-command to be able to query if a file is
> executable in a future patch.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  exec_cmd.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  exec_cmd.h |  1 +
>  help.c     | 42 ------------------------------------------
>  3 files changed, 43 insertions(+), 42 deletions(-)

Makes sense.  Seems like as fine place for it.  (exec_cmd.c is mostly
an implementation detail of run_command, to hold the logic for
executing a git command.  This is another run_command helper, and it
doesn't feel illogical to find it there.  Another alternative place to
put t would be in run-command.c.)

> diff --git a/exec_cmd.c b/exec_cmd.c
> index fb94aeba9..6d9481e26 100644
> --- a/exec_cmd.c
> +++ b/exec_cmd.c
> @@ -149,3 +149,45 @@ int execl_git_cmd(const char *cmd,...)
>  	argv[argc] = NULL;
>  	return execv_git_cmd(argv);
>  }
> +
> +int is_executable(const char *name)

nit: it's a good practice to find a logical place for a new function in
the existing file, instead of defaulting to the end.  That way, the file
is easier to read sequentially, and if two different patches want to
add a function to the same file then they are less likely to conflict.

This could go before prepare_git_cmd, for example.

With or without such a tweak,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

diff --git i/exec_cmd.c w/exec_cmd.c
index 6d9481e26d..601fbc43bc 100644
--- i/exec_cmd.c
+++ w/exec_cmd.c
@@ -104,6 +104,48 @@ void setup_path(void)
 	strbuf_release(&new_path);
 }
 
+int is_executable(const char *name)
+{
+	struct stat st;
+
+	if (stat(name, &st) || /* stat, not lstat */
+	    !S_ISREG(st.st_mode))
+		return 0;
+
+#if defined(GIT_WINDOWS_NATIVE)
+	/*
+	 * On Windows there is no executable bit. The file extension
+	 * indicates whether it can be run as an executable, and Git
+	 * has special-handling to detect scripts and launch them
+	 * through the indicated script interpreter. We test for the
+	 * file extension first because virus scanners may make
+	 * it quite expensive to open many files.
+	 */
+	if (ends_with(name, ".exe"))
+		return S_IXUSR;
+
+{
+	/*
+	 * Now that we know it does not have an executable extension,
+	 * peek into the file instead.
+	 */
+	char buf[3] = { 0 };
+	int n;
+	int fd = open(name, O_RDONLY);
+	st.st_mode &= ~S_IXUSR;
+	if (fd >= 0) {
+		n = read(fd, buf, 2);
+		if (n == 2)
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
+				st.st_mode |= S_IXUSR;
+		close(fd);
+	}
+}
+#endif
+	return st.st_mode & S_IXUSR;
+}
+
 const char **prepare_git_cmd(struct argv_array *out, const char **argv)
 {
 	argv_array_push(out, "git");
@@ -149,45 +191,3 @@ int execl_git_cmd(const char *cmd,...)
 	argv[argc] = NULL;
 	return execv_git_cmd(argv);
 }
-
-int is_executable(const char *name)
-{
-	struct stat st;
-
-	if (stat(name, &st) || /* stat, not lstat */
-	    !S_ISREG(st.st_mode))
-		return 0;
-
-#if defined(GIT_WINDOWS_NATIVE)
-	/*
-	 * On Windows there is no executable bit. The file extension
-	 * indicates whether it can be run as an executable, and Git
-	 * has special-handling to detect scripts and launch them
-	 * through the indicated script interpreter. We test for the
-	 * file extension first because virus scanners may make
-	 * it quite expensive to open many files.
-	 */
-	if (ends_with(name, ".exe"))
-		return S_IXUSR;
-
-{
-	/*
-	 * Now that we know it does not have an executable extension,
-	 * peek into the file instead.
-	 */
-	char buf[3] = { 0 };
-	int n;
-	int fd = open(name, O_RDONLY);
-	st.st_mode &= ~S_IXUSR;
-	if (fd >= 0) {
-		n = read(fd, buf, 2);
-		if (n == 2)
-			/* look for a she-bang */
-			if (!strcmp(buf, "#!"))
-				st.st_mode |= S_IXUSR;
-		close(fd);
-	}
-}
-#endif
-	return st.st_mode & S_IXUSR;
-}
diff --git i/exec_cmd.h w/exec_cmd.h
index 48dd18a0d4..5e8200b952 100644
--- i/exec_cmd.h
+++ w/exec_cmd.h
@@ -7,11 +7,11 @@ extern void git_set_argv_exec_path(const char *exec_path);
 extern void git_extract_argv0_path(const char *path);
 extern const char *git_exec_path(void);
 extern void setup_path(void);
+extern int is_executable(const char *name);
 extern const char **prepare_git_cmd(struct argv_array *out, const char **argv);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 LAST_ARG_MUST_BE_NULL
 extern int execl_git_cmd(const char *cmd, ...);
 extern char *system_path(const char *path);
-extern int is_executable(const char *name);
 
 #endif /* GIT_EXEC_CMD_H */

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v7 1/2] exec_cmd: expose is_executable function
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
  2017-04-25 17:54                 ` [PATCH v7 2/2] run-command: don't try to execute directories Brandon Williams
  2017-04-25 18:04                 ` [PATCH v7 1/2] exec_cmd: expose is_executable function Jonathan Nieder
@ 2017-04-25 18:18                 ` Johannes Sixt
  2017-04-25 18:38                   ` Brandon Williams
  2017-04-25 23:46                 ` [PATCH v8 1/2] run-command: " Brandon Williams
  3 siblings, 1 reply; 140+ messages in thread
From: Johannes Sixt @ 2017-04-25 18:18 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, sbeller, e, jrnieder, peff

Am 25.04.2017 um 19:54 schrieb Brandon Williams:
> Move the logic for 'is_executable()' from help.c to exec_cmd.c and
> expose it so that callers from outside help.c can access the function.

The function is quite low-level. IMO, run-command.[ch] would be a better 
home for it. Additionally, that would reduce the number of files that 
contain #ifdef GIT_WINDOWS_NATIVE.

-- Hannes


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v7 1/2] exec_cmd: expose is_executable function
  2017-04-25 18:18                 ` Johannes Sixt
@ 2017-04-25 18:38                   ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 18:38 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, gitster, sbeller, e, jrnieder, peff

On 04/25, Johannes Sixt wrote:
> Am 25.04.2017 um 19:54 schrieb Brandon Williams:
> >Move the logic for 'is_executable()' from help.c to exec_cmd.c and
> >expose it so that callers from outside help.c can access the function.
> 
> The function is quite low-level. IMO, run-command.[ch] would be a
> better home for it. Additionally, that would reduce the number of
> files that contain #ifdef GIT_WINDOWS_NATIVE.

Fair enough, Jonathan suggested the same so I'll move it to
run-command in a re-roll.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v7 2/2] run-command: don't try to execute directories
  2017-04-25 17:54                 ` [PATCH v7 2/2] run-command: don't try to execute directories Brandon Williams
@ 2017-04-25 18:51                   ` Jonathan Nieder
  2017-04-25 19:32                     ` Brandon Williams
  0 siblings, 1 reply; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25 18:51 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, peff, Brian Hatfield

Brandon Williams wrote:

> Subject: run-command: don't try to execute directories

nit: this is also about non-executable files, now.  That would mean
something like

 run-command: don't try to execute directories or non-executable files

or

 run-command: restrict PATH search to files we can execute

[...]
> Reported-by: Brian Hatfield <bhatfield@google.com>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  run-command.c          |  2 +-
>  t/t0061-run-command.sh | 23 +++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/run-command.c b/run-command.c
> index a97d7bf9f..ec08e0951 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -137,7 +137,7 @@ static char *locate_in_PATH(const char *file)
>  		}
>  		strbuf_addstr(&buf, file);
>  
> -		if (!access(buf.buf, F_OK))
> +		if (is_executable(buf.buf))
>  			return strbuf_detach(&buf, NULL);

It's probably worth a docstring for this function to explain
that this is not a complete emulation of execvp on Windows, since
it doesn't look for .com and .exe files.


>  
>  		if (!*end)
> diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
> index 98c09dd98..fd5e43766 100755
> --- a/t/t0061-run-command.sh
> +++ b/t/t0061-run-command.sh
> @@ -37,6 +37,29 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> +test_expect_success 'run_command should not try to execute a directory' '
> +	test_when_finished "rm -rf bin1 bin2 bin3" &&
> +	mkdir -p bin1/greet bin2 bin3 &&
> +	write_script bin2/greet <<-\EOF &&
> +	cat bin2/greet
> +	EOF
> +	chmod -x bin2/greet &&

This probably implies that the test needs a POSIXPERM dependency.
Should it be a separate test_expect_success case so that the other
part can still run on Windows?

The rest looks good.  Thanks for your patient work.

With whatever subset of the changes described makes sense,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.

diff --git i/run-command.c w/run-command.c
index ec08e09518..dbbaec932e 100644
--- i/run-command.c
+++ w/run-command.c
@@ -117,6 +117,21 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
+/*
+ * Search $PATH for a command.  This emulates the path search that
+ * execvp would perform, without actually executing the command so it
+ * can be used before fork() to prepare to run a command using
+ * execve() or after execvp() to diagnose why it failed.
+ *
+ * The caller should ensure that file contains no directory
+ * separators.
+ *
+ * Returns NULL if the command could not be found.
+ *
+ * This should not be used on Windows, where the $PATH search rules
+ * are more complicated (e.g., a search for "foo" should find
+ * "foo.exe").
+ */
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
diff --git i/t/t0061-run-command.sh w/t/t0061-run-command.sh
index fd5e43766a..e48a207fae 100755
--- i/t/t0061-run-command.sh
+++ w/t/t0061-run-command.sh
@@ -37,26 +37,33 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	test_cmp empty err
 '
 
-test_expect_success 'run_command should not try to execute a directory' '
+test_expect_success 'run_command does not try to execute a directory' '
 	test_when_finished "rm -rf bin1 bin2" &&
-	mkdir -p bin1/greet bin2 bin3 &&
+	mkdir -p bin1/greet bin2 &&
 	write_script bin2/greet <<-\EOF &&
 	cat bin2/greet
 	EOF
-	chmod -x bin2/greet &&
-	write_script bin3/greet <<-\EOF &&
-	cat bin3/greet
+
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
+	test_cmp empty err
+'
+
+test_expect_success POSIXPERM 'run_command passes over non-executable file' '
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1 bin2 &&
+	write_script bin1/greet <<-\EOF &&
+	cat bin1/greet
+	EOF
+	chmod -x bin1/greet &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
 	EOF
 
-	# Test that run-command does not try to execute the "greet" directory in
-	# "bin1", or the non-executable file "greet" in "bin2", but rather
-	# correcty executes the "greet" script located in bin3.
-	(
-		PATH=$PWD/bin1:$PWD/bin2:$PWD/bin3:$PATH &&
-		export PATH &&
-		test-run-command run-command greet >actual 2>err
-	) &&
-	test_cmp bin3/greet actual &&
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
 	test_cmp empty err
 '
 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v7 2/2] run-command: don't try to execute directories
  2017-04-25 18:51                   ` Jonathan Nieder
@ 2017-04-25 19:32                     ` Brandon Williams
  0 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 19:32 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, j6t, sbeller, e, peff, Brian Hatfield

On 04/25, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Subject: run-command: don't try to execute directories
> 
> nit: this is also about non-executable files, now.  That would mean
> something like
> 
>  run-command: don't try to execute directories or non-executable files
> 
> or
> 
>  run-command: restrict PATH search to files we can execute
> 
> [...]
> > Reported-by: Brian Hatfield <bhatfield@google.com>
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  run-command.c          |  2 +-
> >  t/t0061-run-command.sh | 23 +++++++++++++++++++++++
> >  2 files changed, 24 insertions(+), 1 deletion(-)
> > 
> > diff --git a/run-command.c b/run-command.c
> > index a97d7bf9f..ec08e0951 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -137,7 +137,7 @@ static char *locate_in_PATH(const char *file)
> >  		}
> >  		strbuf_addstr(&buf, file);
> >  
> > -		if (!access(buf.buf, F_OK))
> > +		if (is_executable(buf.buf))
> >  			return strbuf_detach(&buf, NULL);
> 
> It's probably worth a docstring for this function to explain
> that this is not a complete emulation of execvp on Windows, since
> it doesn't look for .com and .exe files.
> 
> 
> >  
> >  		if (!*end)
> > diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
> > index 98c09dd98..fd5e43766 100755
> > --- a/t/t0061-run-command.sh
> > +++ b/t/t0061-run-command.sh
> > @@ -37,6 +37,29 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
> >  	test_cmp empty err
> >  '
> >  
> > +test_expect_success 'run_command should not try to execute a directory' '
> > +	test_when_finished "rm -rf bin1 bin2 bin3" &&
> > +	mkdir -p bin1/greet bin2 bin3 &&
> > +	write_script bin2/greet <<-\EOF &&
> > +	cat bin2/greet
> > +	EOF
> > +	chmod -x bin2/greet &&
> 
> This probably implies that the test needs a POSIXPERM dependency.
> Should it be a separate test_expect_success case so that the other
> part can still run on Windows?
> 
> The rest looks good.  Thanks for your patient work.
> 
> With whatever subset of the changes described makes sense,
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> 
> Thanks.
> 
> diff --git i/run-command.c w/run-command.c
> index ec08e09518..dbbaec932e 100644
> --- i/run-command.c
> +++ w/run-command.c
> @@ -117,6 +117,21 @@ static inline void close_pair(int fd[2])
>  	close(fd[1]);
>  }
>  
> +/*
> + * Search $PATH for a command.  This emulates the path search that
> + * execvp would perform, without actually executing the command so it
> + * can be used before fork() to prepare to run a command using
> + * execve() or after execvp() to diagnose why it failed.
> + *
> + * The caller should ensure that file contains no directory
> + * separators.
> + *
> + * Returns NULL if the command could not be found.
> + *
> + * This should not be used on Windows, where the $PATH search rules
> + * are more complicated (e.g., a search for "foo" should find
> + * "foo.exe").
> + */
>  static char *locate_in_PATH(const char *file)
>  {
>  	const char *p = getenv("PATH");
> diff --git i/t/t0061-run-command.sh w/t/t0061-run-command.sh
> index fd5e43766a..e48a207fae 100755
> --- i/t/t0061-run-command.sh
> +++ w/t/t0061-run-command.sh
> @@ -37,26 +37,33 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> -test_expect_success 'run_command should not try to execute a directory' '
> +test_expect_success 'run_command does not try to execute a directory' '
>  	test_when_finished "rm -rf bin1 bin2" &&
> -	mkdir -p bin1/greet bin2 bin3 &&
> +	mkdir -p bin1/greet bin2 &&
>  	write_script bin2/greet <<-\EOF &&
>  	cat bin2/greet
>  	EOF
> -	chmod -x bin2/greet &&
> -	write_script bin3/greet <<-\EOF &&
> -	cat bin3/greet
> +
> +	PATH=$PWD/bin1:$PWD/bin2:$PATH \
> +		test-run-command run-command greet >actual 2>err &&
> +	test_cmp bin2/greet actual &&
> +	test_cmp empty err
> +'
> +
> +test_expect_success POSIXPERM 'run_command passes over non-executable file' '
> +	test_when_finished "rm -rf bin1 bin2" &&
> +	mkdir -p bin1 bin2 &&
> +	write_script bin1/greet <<-\EOF &&
> +	cat bin1/greet
> +	EOF
> +	chmod -x bin1/greet &&
> +	write_script bin2/greet <<-\EOF &&
> +	cat bin2/greet
>  	EOF
>  
> -	# Test that run-command does not try to execute the "greet" directory in
> -	# "bin1", or the non-executable file "greet" in "bin2", but rather
> -	# correcty executes the "greet" script located in bin3.
> -	(
> -		PATH=$PWD/bin1:$PWD/bin2:$PWD/bin3:$PATH &&
> -		export PATH &&
> -		test-run-command run-command greet >actual 2>err
> -	) &&
> -	test_cmp bin3/greet actual &&
> +	PATH=$PWD/bin1:$PWD/bin2:$PATH \
> +		test-run-command run-command greet >actual 2>err &&
> +	test_cmp bin2/greet actual &&
>  	test_cmp empty err
>  '
>  

Yeah the POSIXPERM is going to be necessary.  I'll roll in some of these
changes when I do a reroll.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v8 1/2] run-command: expose is_executable function
  2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
                                   ` (2 preceding siblings ...)
  2017-04-25 18:18                 ` Johannes Sixt
@ 2017-04-25 23:46                 ` " Brandon Williams
  2017-04-25 23:47                   ` [PATCH v8 2/2] run-command: restrict PATH search to executable files Brandon Williams
  2017-04-25 23:48                   ` [PATCH v8 1/2] run-command: expose is_executable function Jonathan Nieder
  3 siblings, 2 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 23:46 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder, peff

Move the logic for 'is_executable()' from help.c to run_command.c and
expose it so that callers from outside help.c can access the function.
This is to enable run-command to be able to query if a file is
executable in a future patch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 help.c        | 43 +------------------------------------------
 run-command.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 run-command.h |  1 +
 3 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/help.c b/help.c
index bc6cd19cf..0c65a2d21 100644
--- a/help.c
+++ b/help.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "builtin.h"
 #include "exec_cmd.h"
+#include "run-command.h"
 #include "levenshtein.h"
 #include "help.h"
 #include "common-cmds.h"
@@ -96,48 +97,6 @@ static void pretty_print_cmdnames(struct cmdnames *cmds, unsigned int colopts)
 	string_list_clear(&list, 0);
 }
 
-static int is_executable(const char *name)
-{
-	struct stat st;
-
-	if (stat(name, &st) || /* stat, not lstat */
-	    !S_ISREG(st.st_mode))
-		return 0;
-
-#if defined(GIT_WINDOWS_NATIVE)
-	/*
-	 * On Windows there is no executable bit. The file extension
-	 * indicates whether it can be run as an executable, and Git
-	 * has special-handling to detect scripts and launch them
-	 * through the indicated script interpreter. We test for the
-	 * file extension first because virus scanners may make
-	 * it quite expensive to open many files.
-	 */
-	if (ends_with(name, ".exe"))
-		return S_IXUSR;
-
-{
-	/*
-	 * Now that we know it does not have an executable extension,
-	 * peek into the file instead.
-	 */
-	char buf[3] = { 0 };
-	int n;
-	int fd = open(name, O_RDONLY);
-	st.st_mode &= ~S_IXUSR;
-	if (fd >= 0) {
-		n = read(fd, buf, 2);
-		if (n == 2)
-			/* look for a she-bang */
-			if (!strcmp(buf, "#!"))
-				st.st_mode |= S_IXUSR;
-		close(fd);
-	}
-}
-#endif
-	return st.st_mode & S_IXUSR;
-}
-
 static void list_commands_in_dir(struct cmdnames *cmds,
 					 const char *path,
 					 const char *prefix)
diff --git a/run-command.c b/run-command.c
index a97d7bf9f..2ffbd7e67 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,6 +117,48 @@ static inline void close_pair(int fd[2])
 	close(fd[1]);
 }
 
+int is_executable(const char *name)
+{
+	struct stat st;
+
+	if (stat(name, &st) || /* stat, not lstat */
+	    !S_ISREG(st.st_mode))
+		return 0;
+
+#if defined(GIT_WINDOWS_NATIVE)
+	/*
+	 * On Windows there is no executable bit. The file extension
+	 * indicates whether it can be run as an executable, and Git
+	 * has special-handling to detect scripts and launch them
+	 * through the indicated script interpreter. We test for the
+	 * file extension first because virus scanners may make
+	 * it quite expensive to open many files.
+	 */
+	if (ends_with(name, ".exe"))
+		return S_IXUSR;
+
+{
+	/*
+	 * Now that we know it does not have an executable extension,
+	 * peek into the file instead.
+	 */
+	char buf[3] = { 0 };
+	int n;
+	int fd = open(name, O_RDONLY);
+	st.st_mode &= ~S_IXUSR;
+	if (fd >= 0) {
+		n = read(fd, buf, 2);
+		if (n == 2)
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
+				st.st_mode |= S_IXUSR;
+		close(fd);
+	}
+}
+#endif
+	return st.st_mode & S_IXUSR;
+}
+
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
diff --git a/run-command.h b/run-command.h
index 4fa8f65ad..3932420ec 100644
--- a/run-command.h
+++ b/run-command.h
@@ -51,6 +51,7 @@ struct child_process {
 #define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }
 void child_process_init(struct child_process *);
 void child_process_clear(struct child_process *);
+extern int is_executable(const char *name);
 
 int start_command(struct child_process *);
 int finish_command(struct child_process *);
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v8 2/2] run-command: restrict PATH search to executable files
  2017-04-25 23:46                 ` [PATCH v8 1/2] run-command: " Brandon Williams
@ 2017-04-25 23:47                   ` Brandon Williams
  2017-04-25 23:50                     ` Jonathan Nieder
  2017-04-26  1:44                     ` Junio C Hamano
  2017-04-25 23:48                   ` [PATCH v8 1/2] run-command: expose is_executable function Jonathan Nieder
  1 sibling, 2 replies; 140+ messages in thread
From: Brandon Williams @ 2017-04-25 23:47 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder, peff

In some situations run-command will incorrectly try (and fail) to
execute a directory instead of an executable file.  This was observed by
having a directory called "ssh" in $PATH before the real ssh and trying
to use ssh protoccol, reslting in the following:

	$ git ls-remote ssh://url
	fatal: cannot exec 'ssh': Permission denied

It ends up being worse and run-command will even try to execute a
non-executable file if it preceeds the executable version of a file on
the PATH.  For example, if PATH=~/bin1:~/bin2:~/bin3 and there exists a
directory 'git-hello' in 'bin1', a non-executable file 'git-hello' in
bin2 and an executable file 'git-hello' (which prints "Hello World!") in
bin3 the following will occur:

	$ git hello
	fatal: cannot exec 'git-hello': Permission denied

This is due to only checking 'access()' when locating an executable in
PATH, which doesn't distinguish between files and directories.  Instead
use 'is_executable()' which check that the path is to a regular,
executable file.  Now run-command won't try to execute the directory or
non-executable file 'git-hello':

	$ git hello
	Hello World!

Reported-by: Brian Hatfield <bhatfield@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 run-command.c          | 19 ++++++++++++++++++-
 t/t0061-run-command.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index 2ffbd7e67..9e36151bf 100644
--- a/run-command.c
+++ b/run-command.c
@@ -159,6 +159,23 @@ int is_executable(const char *name)
 	return st.st_mode & S_IXUSR;
 }
 
+/*
+ * Search $PATH for a command.  This emulates the path search that
+ * execvp would perform, without actually executing the command so it
+ * can be used before fork() to prepare to run a command using
+ * execve() or after execvp() to diagnose why it failed.
+ *
+ * The caller should ensure that file contains no directory
+ * separators.
+ *
+ * Returns the path to the command, as found in $PATH or NULL if the
+ * command could not be found.  The caller inherits ownership of the memory
+ * used to store the resultant path.
+ *
+ * This should not be used on Windows, where the $PATH search rules
+ * are more complicated (e.g., a search for "foo" should find
+ * "foo.exe").
+ */
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -179,7 +196,7 @@ static char *locate_in_PATH(const char *file)
 		}
 		strbuf_addstr(&buf, file);
 
-		if (!access(buf.buf, F_OK))
+		if (is_executable(buf.buf))
 			return strbuf_detach(&buf, NULL);
 
 		if (!*end)
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 98c09dd98..e4739170a 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -37,6 +37,36 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command does not try to execute a directory' '
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1/greet bin2 &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
+	EOF
+
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
+	test_cmp empty err
+'
+
+test_expect_success POSIXPERM 'run_command passes over non-executable file' '
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1 bin2 &&
+	write_script bin1/greet <<-\EOF &&
+	cat bin1/greet
+	EOF
+	chmod -x bin1/greet &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
+	EOF
+
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v8 1/2] run-command: expose is_executable function
  2017-04-25 23:46                 ` [PATCH v8 1/2] run-command: " Brandon Williams
  2017-04-25 23:47                   ` [PATCH v8 2/2] run-command: restrict PATH search to executable files Brandon Williams
@ 2017-04-25 23:48                   ` Jonathan Nieder
  1 sibling, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25 23:48 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, peff

Brandon Williams wrote:

> Move the logic for 'is_executable()' from help.c to run_command.c and
> expose it so that callers from outside help.c can access the function.
> This is to enable run-command to be able to query if a file is
> executable in a future patch.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  help.c        | 43 +------------------------------------------
>  run-command.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  run-command.h |  1 +
>  3 files changed, 44 insertions(+), 42 deletions(-)

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v8 2/2] run-command: restrict PATH search to executable files
  2017-04-25 23:47                   ` [PATCH v8 2/2] run-command: restrict PATH search to executable files Brandon Williams
@ 2017-04-25 23:50                     ` Jonathan Nieder
  2017-04-26  1:44                     ` Junio C Hamano
  1 sibling, 0 replies; 140+ messages in thread
From: Jonathan Nieder @ 2017-04-25 23:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, j6t, sbeller, e, peff, Brian Hatfield

Brandon Williams wrote:

> In some situations run-command will incorrectly try (and fail) to
> execute a directory instead of an executable file.  This was observed by
> having a directory called "ssh" in $PATH before the real ssh and trying
> to use ssh protoccol, reslting in the following:
>
> 	$ git ls-remote ssh://url
> 	fatal: cannot exec 'ssh': Permission denied
>
> It ends up being worse and run-command will even try to execute a
> non-executable file if it preceeds the executable version of a file on
> the PATH.  For example, if PATH=~/bin1:~/bin2:~/bin3 and there exists a
> directory 'git-hello' in 'bin1', a non-executable file 'git-hello' in
> bin2 and an executable file 'git-hello' (which prints "Hello World!") in
> bin3 the following will occur:
>
> 	$ git hello
> 	fatal: cannot exec 'git-hello': Permission denied
>
> This is due to only checking 'access()' when locating an executable in
> PATH, which doesn't distinguish between files and directories.  Instead
> use 'is_executable()' which check that the path is to a regular,
> executable file.  Now run-command won't try to execute the directory or
> non-executable file 'git-hello':
>
> 	$ git hello
> 	Hello World!
>
> Reported-by: Brian Hatfield <bhatfield@google.com>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  run-command.c          | 19 ++++++++++++++++++-
>  t/t0061-run-command.sh | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+), 1 deletion(-)

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.  Patch left unsnipped for reference.

> diff --git a/run-command.c b/run-command.c
> index 2ffbd7e67..9e36151bf 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -159,6 +159,23 @@ int is_executable(const char *name)
>  	return st.st_mode & S_IXUSR;
>  }
>  
> +/*
> + * Search $PATH for a command.  This emulates the path search that
> + * execvp would perform, without actually executing the command so it
> + * can be used before fork() to prepare to run a command using
> + * execve() or after execvp() to diagnose why it failed.
> + *
> + * The caller should ensure that file contains no directory
> + * separators.
> + *
> + * Returns the path to the command, as found in $PATH or NULL if the
> + * command could not be found.  The caller inherits ownership of the memory
> + * used to store the resultant path.
> + *
> + * This should not be used on Windows, where the $PATH search rules
> + * are more complicated (e.g., a search for "foo" should find
> + * "foo.exe").
> + */
>  static char *locate_in_PATH(const char *file)
>  {
>  	const char *p = getenv("PATH");
> @@ -179,7 +196,7 @@ static char *locate_in_PATH(const char *file)
>  		}
>  		strbuf_addstr(&buf, file);
>  
> -		if (!access(buf.buf, F_OK))
> +		if (is_executable(buf.buf))
>  			return strbuf_detach(&buf, NULL);
>  
>  		if (!*end)
> diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
> index 98c09dd98..e4739170a 100755
> --- a/t/t0061-run-command.sh
> +++ b/t/t0061-run-command.sh
> @@ -37,6 +37,36 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
>  	test_cmp empty err
>  '
>  
> +test_expect_success 'run_command does not try to execute a directory' '
> +	test_when_finished "rm -rf bin1 bin2" &&
> +	mkdir -p bin1/greet bin2 &&
> +	write_script bin2/greet <<-\EOF &&
> +	cat bin2/greet
> +	EOF
> +
> +	PATH=$PWD/bin1:$PWD/bin2:$PATH \
> +		test-run-command run-command greet >actual 2>err &&
> +	test_cmp bin2/greet actual &&
> +	test_cmp empty err
> +'
> +
> +test_expect_success POSIXPERM 'run_command passes over non-executable file' '
> +	test_when_finished "rm -rf bin1 bin2" &&
> +	mkdir -p bin1 bin2 &&
> +	write_script bin1/greet <<-\EOF &&
> +	cat bin1/greet
> +	EOF
> +	chmod -x bin1/greet &&
> +	write_script bin2/greet <<-\EOF &&
> +	cat bin2/greet
> +	EOF
> +
> +	PATH=$PWD/bin1:$PWD/bin2:$PATH \
> +		test-run-command run-command greet >actual 2>err &&
> +	test_cmp bin2/greet actual &&
> +	test_cmp empty err
> +'
> +
>  test_expect_success POSIXPERM 'run_command reports EACCES' '
>  	cat hello-script >hello.sh &&
>  	chmod -x hello.sh &&

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v8 2/2] run-command: restrict PATH search to executable files
  2017-04-25 23:47                   ` [PATCH v8 2/2] run-command: restrict PATH search to executable files Brandon Williams
  2017-04-25 23:50                     ` Jonathan Nieder
@ 2017-04-26  1:44                     ` Junio C Hamano
  2017-04-26 17:10                       ` [PATCH v9 " Brandon Williams
  1 sibling, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2017-04-26  1:44 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, j6t, sbeller, e, jrnieder, peff

Brandon Williams <bmwill@google.com> writes:

> In some situations run-command will incorrectly try (and fail) to
> execute a directory instead of an executable file.  This was observed by
> having a directory called "ssh" in $PATH before the real ssh and trying
> to use ssh protoccol, reslting in the following:
>
> 	$ git ls-remote ssh://url
> 	fatal: cannot exec 'ssh': Permission denied
>
> It ends up being worse and run-command will even try to execute a
> non-executable file if it preceeds the executable version of a file on
> the PATH.  For example, if PATH=~/bin1:~/bin2:~/bin3 and there exists a
> directory 'git-hello' in 'bin1', a non-executable file 'git-hello' in
> bin2 and an executable file 'git-hello' (which prints "Hello World!") in
> bin3 the following will occur:
>
> 	$ git hello
> 	fatal: cannot exec 'git-hello': Permission denied
>
> This is due to only checking 'access()' when locating an executable in
> PATH, which doesn't distinguish between files and directories.  Instead
> use 'is_executable()' which check that the path is to a regular,
> executable file.  Now run-command won't try to execute the directory or
> non-executable file 'git-hello':
>
> 	$ git hello
> 	Hello World!

Could you add a line after this example, that says something like
"which matches what execvp() would have done with a request to
execute git-hello with such a $PATH."

That is because it can be argued that bin1/git-hello should be found
and get complaint "not an executable file", or that bin1/git-hello
should be skipped but bin2/git-hello should be found and get
complaint "not an executable file", both to help the user diagnose
and fix the broken $PATH (or director contents).  It is the easiest
to justify why we chose this other definition to skip both git-hello
in bin1 and bin2 if that is an established existing practice---we
can say "sure, what you propose also may make sense, but we match
what execvp(3) does".

The patch text looks good.

Thanks.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [PATCH v9 2/2] run-command: restrict PATH search to executable files
  2017-04-26  1:44                     ` Junio C Hamano
@ 2017-04-26 17:10                       ` " Brandon Williams
  2017-04-27  0:33                         ` Junio C Hamano
  0 siblings, 1 reply; 140+ messages in thread
From: Brandon Williams @ 2017-04-26 17:10 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, j6t, sbeller, e, jrnieder, peff

In some situations run-command will incorrectly try (and fail) to
execute a directory instead of an executable file.  This was observed by
having a directory called "ssh" in $PATH before the real ssh and trying
to use ssh protoccol, reslting in the following:

	$ git ls-remote ssh://url
	fatal: cannot exec 'ssh': Permission denied

It ends up being worse and run-command will even try to execute a
non-executable file if it preceeds the executable version of a file on
the PATH.  For example, if PATH=~/bin1:~/bin2:~/bin3 and there exists a
directory 'git-hello' in 'bin1', a non-executable file 'git-hello' in
bin2 and an executable file 'git-hello' (which prints "Hello World!") in
bin3 the following will occur:

	$ git hello
	fatal: cannot exec 'git-hello': Permission denied

This is due to only checking 'access()' when locating an executable in
PATH, which doesn't distinguish between files and directories.  Instead
use 'is_executable()' which check that the path is to a regular,
executable file.  Now run-command won't try to execute the directory or
non-executable file 'git-hello':

	$ git hello
	Hello World!

This matches what 'execvp()' would have done with a request to execute
'git-hello' with such a $PATH.

Reported-by: Brian Hatfield <bhatfield@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---

This [2/2] patch has the exact same diff as v8, the only difference is to the
commit message per Junio's request to add an explanation for why this
particular behavior is desirable (because it matches what execvp() does).

I didn't resend out [1/2] of this fixup because it is identical to the v8
version.

 run-command.c          | 19 ++++++++++++++++++-
 t/t0061-run-command.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index 2ffbd7e67..9e36151bf 100644
--- a/run-command.c
+++ b/run-command.c
@@ -159,6 +159,23 @@ int is_executable(const char *name)
 	return st.st_mode & S_IXUSR;
 }
 
+/*
+ * Search $PATH for a command.  This emulates the path search that
+ * execvp would perform, without actually executing the command so it
+ * can be used before fork() to prepare to run a command using
+ * execve() or after execvp() to diagnose why it failed.
+ *
+ * The caller should ensure that file contains no directory
+ * separators.
+ *
+ * Returns the path to the command, as found in $PATH or NULL if the
+ * command could not be found.  The caller inherits ownership of the memory
+ * used to store the resultant path.
+ *
+ * This should not be used on Windows, where the $PATH search rules
+ * are more complicated (e.g., a search for "foo" should find
+ * "foo.exe").
+ */
 static char *locate_in_PATH(const char *file)
 {
 	const char *p = getenv("PATH");
@@ -179,7 +196,7 @@ static char *locate_in_PATH(const char *file)
 		}
 		strbuf_addstr(&buf, file);
 
-		if (!access(buf.buf, F_OK))
+		if (is_executable(buf.buf))
 			return strbuf_detach(&buf, NULL);
 
 		if (!*end)
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 98c09dd98..e4739170a 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -37,6 +37,36 @@ test_expect_success !MINGW 'run_command can run a script without a #! line' '
 	test_cmp empty err
 '
 
+test_expect_success 'run_command does not try to execute a directory' '
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1/greet bin2 &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
+	EOF
+
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
+	test_cmp empty err
+'
+
+test_expect_success POSIXPERM 'run_command passes over non-executable file' '
+	test_when_finished "rm -rf bin1 bin2" &&
+	mkdir -p bin1 bin2 &&
+	write_script bin1/greet <<-\EOF &&
+	cat bin1/greet
+	EOF
+	chmod -x bin1/greet &&
+	write_script bin2/greet <<-\EOF &&
+	cat bin2/greet
+	EOF
+
+	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+		test-run-command run-command greet >actual 2>err &&
+	test_cmp bin2/greet actual &&
+	test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
 	cat hello-script >hello.sh &&
 	chmod -x hello.sh &&
-- 
2.13.0.rc0.306.g87b477812d-goog


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v9 2/2] run-command: restrict PATH search to executable files
  2017-04-26 17:10                       ` [PATCH v9 " Brandon Williams
@ 2017-04-27  0:33                         ` Junio C Hamano
  0 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2017-04-27  0:33 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, j6t, sbeller, e, jrnieder, peff

Brandon Williams <bmwill@google.com> writes:

> This [2/2] patch has the exact same diff as v8, the only difference is to the
> commit message per Junio's request to add an explanation for why this
> particular behavior is desirable (because it matches what execvp() does).
>
> I didn't resend out [1/2] of this fixup because it is identical to the v8
> version.

Thanks.  It is already explained well in the in-code comment, but it
is good to help "git log" readers with this information.

Will queue.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-04-19 23:13           ` [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
@ 2017-05-17  2:15             ` Junio C Hamano
  2017-05-17  2:26               ` Jeff King
  0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2017-05-17  2:15 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, j6t, sbeller, e, jrnieder

Brandon Williams <bmwill@google.com> writes:

> @@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
>  	if (!cmd->argv[0])
>  		die("BUG: command is empty");
>  
> +	/*
> +	 * Add SHELL_PATH so in the event exec fails with ENOEXEC we can
> +	 * attempt to interpret the command with 'sh'.
> +	 */
> +	argv_array_push(out, SHELL_PATH);
> +
>  	if (cmd->git_cmd) {
>  		argv_array_push(out, "git");
>  		argv_array_pushv(out, cmd->argv);


So, given "cat-file", "-t", "HEAD" with cmd->git_cmd == TRUE, this
will now prepare ("/bin/sh", "git", "cat-file", "-t", "HEAD", NULL) in
the argv-array, and then ...

> @@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd)
>  	} else {
>  		argv_array_pushv(out, cmd->argv);
>  	}
> +
> +	/*
> +	 * If there are no '/' characters in the command then perform a path
> +	 * lookup and use the resolved path as the command to exec.  If there
> +	 * are no '/' characters or if the command wasn't found in the path,
> +	 * have exec attempt to invoke the command directly.
> +	 */
> +	if (!strchr(out->argv[1], '/')) {
> +		char *program = locate_in_PATH(out->argv[1]);
> +		if (program) {
> +			free((char *)out->argv[1]);
> +			out->argv[1] = program;
> +		}
> +	}

... turn the first element from "git" to "/usr/bin/git", i.e.

	("/bin/sh", "/usr/bin/git", "cat-file", "-t", "HEAD", NULL)

which ...

>  #endif
>  
> @@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
>  			}
>  		}
>  
> -		sane_execvp(argv.argv[0], (char *const *) argv.argv);
> +		/*
> +		 * Attempt to exec using the command and arguments starting at
> +		 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
> +		 * be used in the event exec failed with ENOEXEC at which point
> +		 * we will try to interpret the command using 'sh'.
> +		 */
> +		execv(argv.argv[1], (char *const *) argv.argv + 1);

... first is given without the leading "/bin/sh", as the end-user
intended (sort of), but if it fails

> +		if (errno == ENOEXEC)
> +			execv(argv.argv[0], (char *const *) argv.argv);

"/bin/sh" tries to run "/usr/bin/git" that was not executable (well,
the one in "usr/bin/" would have +x bit, but let's pretend that we
are trying to run one from bin-wrappers/ and somehow forgot +x bit)?

I think all of that is sensible, but there is one "huh?" I can't
figure out.  Typically we do "sh -c git cat-file -t HEAD" but this
lacks the "-c" (cf. the original prepare_shell_cmd()); why do we not
need it in this case?

Thanks.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-05-17  2:15             ` Junio C Hamano
@ 2017-05-17  2:26               ` Jeff King
  2017-05-17  2:28                 ` Jeff King
                                   ` (2 more replies)
  0 siblings, 3 replies; 140+ messages in thread
From: Jeff King @ 2017-05-17  2:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, j6t, sbeller, e, jrnieder

On Wed, May 17, 2017 at 11:15:43AM +0900, Junio C Hamano wrote:

> > +		if (errno == ENOEXEC)
> > +			execv(argv.argv[0], (char *const *) argv.argv);
> 
> "/bin/sh" tries to run "/usr/bin/git" that was not executable (well,
> the one in "usr/bin/" would have +x bit, but let's pretend that we
> are trying to run one from bin-wrappers/ and somehow forgot +x bit)?
> 
> I think all of that is sensible, but there is one "huh?" I can't
> figure out.  Typically we do "sh -c git cat-file -t HEAD" but this
> lacks the "-c" (cf. the original prepare_shell_cmd()); why do we not
> need it in this case?

I think this is the same case we were discussing over in the "rebase"
thread. This isn't about running the user's command as a shell command.
Note that this kicks in even when cmd->shell_cmd isn't set.

This is about finding "/usr/bin/foo", realizing it cannot be exec'd
because it lacks a shebang line, and then pretending that it did have
"#!/bin/sh". IOW, maintaining compatibility with execvp().

So the command itself isn't a shell command, but it may execute a shell
script. If that makes sense.

-Peff

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-05-17  2:26               ` Jeff King
@ 2017-05-17  2:28                 ` Jeff King
  2017-05-17  3:41                 ` Junio C Hamano
  2017-05-17 14:52                 ` Brandon Williams
  2 siblings, 0 replies; 140+ messages in thread
From: Jeff King @ 2017-05-17  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, j6t, sbeller, e, jrnieder

On Tue, May 16, 2017 at 10:26:02PM -0400, Jeff King wrote:

> On Wed, May 17, 2017 at 11:15:43AM +0900, Junio C Hamano wrote:
> 
> > > +		if (errno == ENOEXEC)
> > > +			execv(argv.argv[0], (char *const *) argv.argv);
> > 
> > "/bin/sh" tries to run "/usr/bin/git" that was not executable (well,
> > the one in "usr/bin/" would have +x bit, but let's pretend that we
> > are trying to run one from bin-wrappers/ and somehow forgot +x bit)?
> > 
> > I think all of that is sensible, but there is one "huh?" I can't
> > figure out.  Typically we do "sh -c git cat-file -t HEAD" but this
> > lacks the "-c" (cf. the original prepare_shell_cmd()); why do we not
> > need it in this case?
> 
> I think this is the same case we were discussing over in the "rebase"
> thread. This isn't about running the user's command as a shell command.
> Note that this kicks in even when cmd->shell_cmd isn't set.
> 
> This is about finding "/usr/bin/foo", realizing it cannot be exec'd
> because it lacks a shebang line, and then pretending that it did have
> "#!/bin/sh". IOW, maintaining compatibility with execvp().
> 
> So the command itself isn't a shell command, but it may execute a shell
> script. If that makes sense.

And note that this isn't about the "+x" bit. That would result in
EACCES.  Getting ENOEXEC means that the contents of the binary are not
something execv thought it could run.

-Peff

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-05-17  2:26               ` Jeff King
  2017-05-17  2:28                 ` Jeff King
@ 2017-05-17  3:41                 ` Junio C Hamano
  2017-05-17 14:52                 ` Brandon Williams
  2 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2017-05-17  3:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, j6t, sbeller, e, jrnieder

Jeff King <peff@peff.net> writes:

> This is about finding "/usr/bin/foo", realizing it cannot be exec'd
> because it lacks a shebang line, and then pretending that it did have
> "#!/bin/sh". IOW, maintaining compatibility with execvp().
>
> So the command itself isn't a shell command, but it may execute a shell
> script. If that makes sense.

Ah, OK.  Thanks.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
  2017-05-17  2:26               ` Jeff King
  2017-05-17  2:28                 ` Jeff King
  2017-05-17  3:41                 ` Junio C Hamano
@ 2017-05-17 14:52                 ` Brandon Williams
  2 siblings, 0 replies; 140+ messages in thread
From: Brandon Williams @ 2017-05-17 14:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, j6t, sbeller, e, jrnieder

On 05/16, Jeff King wrote:
> On Wed, May 17, 2017 at 11:15:43AM +0900, Junio C Hamano wrote:
> 
> > > +		if (errno == ENOEXEC)
> > > +			execv(argv.argv[0], (char *const *) argv.argv);
> > 
> > "/bin/sh" tries to run "/usr/bin/git" that was not executable (well,
> > the one in "usr/bin/" would have +x bit, but let's pretend that we
> > are trying to run one from bin-wrappers/ and somehow forgot +x bit)?
> > 
> > I think all of that is sensible, but there is one "huh?" I can't
> > figure out.  Typically we do "sh -c git cat-file -t HEAD" but this
> > lacks the "-c" (cf. the original prepare_shell_cmd()); why do we not
> > need it in this case?
> 
> I think this is the same case we were discussing over in the "rebase"
> thread. This isn't about running the user's command as a shell command.
> Note that this kicks in even when cmd->shell_cmd isn't set.
> 
> This is about finding "/usr/bin/foo", realizing it cannot be exec'd
> because it lacks a shebang line, and then pretending that it did have
> "#!/bin/sh". IOW, maintaining compatibility with execvp().

Exactly this is all about ensuring we do the same thing the execvp does,
because there isn't a portable variant which allows for passing in an
environment.

> 
> So the command itself isn't a shell command, but it may execute a shell
> script. If that makes sense.
> 
> -Peff

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 140+ messages in thread

end of thread, back to index

Thread overview: 140+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-10 23:49 [PATCH 0/5] forking and threading Brandon Williams
2017-04-10 23:49 ` [PATCH 1/5] run-command: convert sane_execvp to sane_execvpe Brandon Williams
2017-04-12 19:22   ` Brandon Williams
2017-04-10 23:49 ` [PATCH 2/5] run-command: prepare argv before forking Brandon Williams
2017-04-10 23:49 ` [PATCH 3/5] run-command: allocate child_err " Brandon Williams
2017-04-10 23:49 ` [PATCH 4/5] run-command: prepare child environment " Brandon Williams
2017-04-11  0:58   ` Jonathan Nieder
2017-04-11 17:27     ` Brandon Williams
2017-04-11 17:30       ` Jonathan Nieder
2017-04-10 23:49 ` [PATCH 5/5] run-command: add note about forking and threading Brandon Williams
2017-04-11  0:26   ` Jonathan Nieder
2017-04-11  0:53     ` Eric Wong
2017-04-11 17:33       ` Jonathan Nieder
2017-04-11 17:34       ` Brandon Williams
2017-04-11 17:40         ` Eric Wong
2017-04-11  7:05 ` [PATCH 6/5] run-command: avoid potential dangers in forked child Eric Wong
2017-04-11 16:29   ` Brandon Williams
2017-04-11 16:59     ` Eric Wong
2017-04-11 17:17       ` Brandon Williams
2017-04-11 17:37 ` [PATCH 0/5] forking and threading Jonathan Nieder
2017-04-11 17:54   ` Brandon Williams
2017-04-13 18:32 ` [PATCH v2 0/6] " Brandon Williams
2017-04-13 18:32   ` [PATCH v2 1/6] t5550: use write_script to generate post-update hook Brandon Williams
2017-04-13 20:43     ` Jonathan Nieder
2017-04-13 20:59       ` Eric Wong
2017-04-13 21:35         ` Brandon Williams
2017-04-13 21:39           ` Eric Wong
2017-04-13 18:32   ` [PATCH v2 2/6] run-command: prepare command before forking Brandon Williams
2017-04-13 21:14     ` Jonathan Nieder
2017-04-13 22:41       ` Brandon Williams
2017-04-13 18:32   ` [PATCH v2 3/6] run-command: prepare child environment " Brandon Williams
2017-04-13 18:32   ` [PATCH v2 4/6] run-command: don't die in child when duping /dev/null Brandon Williams
2017-04-13 19:29     ` Eric Wong
2017-04-13 19:43       ` Brandon Williams
2017-04-13 18:32   ` [PATCH v2 5/6] run-command: eliminate calls to error handling functions in child Brandon Williams
2017-04-13 18:32   ` [PATCH v2 6/6] run-command: add note about forking and threading Brandon Williams
2017-04-13 20:50   ` [PATCH v2 0/6] " Jonathan Nieder
2017-04-13 23:44     ` Brandon Williams
2017-04-13 21:14   ` [PATCH 7/6] run-command: block signals between fork and execve Eric Wong
2017-04-13 23:37     ` Brandon Williams
2017-04-14  2:42     ` Brandon Williams
2017-04-14  5:26       ` Eric Wong
2017-04-14  5:35         ` Eric Wong
2017-04-14 16:58   ` [PATCH v3 00/10] forking and threading Brandon Williams
2017-04-14 16:58     ` [PATCH v3 01/10] t5550: use write_script to generate post-update hook Brandon Williams
2017-04-14 16:58     ` [PATCH v3 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
2017-04-14 16:58     ` [PATCH v3 03/10] run-command: prepare command before forking Brandon Williams
2017-04-14 16:58     ` [PATCH v3 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
2017-04-14 16:58     ` [PATCH v3 05/10] run-command: prepare child environment before forking Brandon Williams
2017-04-14 16:58     ` [PATCH v3 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
2017-04-14 19:38       ` Eric Wong
2017-04-14 20:19         ` Brandon Williams
2017-04-14 16:58     ` [PATCH v3 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
2017-04-14 18:50       ` Eric Wong
2017-04-14 20:22         ` Brandon Williams
2017-04-14 16:59     ` [PATCH v3 08/10] run-command: handle dup2 and close errors " Brandon Williams
2017-04-14 16:59     ` [PATCH v3 09/10] run-command: add note about forking and threading Brandon Williams
2017-04-14 16:59     ` [PATCH v3 10/10] run-command: block signals between fork and execve Brandon Williams
2017-04-14 20:24       ` Brandon Williams
2017-04-14 21:35         ` Eric Wong
2017-04-17 22:08     ` [PATCH v4 00/10] forking and threading Brandon Williams
2017-04-17 22:08       ` [PATCH v4 01/10] t5550: use write_script to generate post-update hook Brandon Williams
2017-04-17 22:08       ` [PATCH v4 02/10] t0061: run_command executes scripts without a #! line Brandon Williams
2017-04-17 22:08       ` [PATCH v4 03/10] run-command: prepare command before forking Brandon Williams
2017-04-17 22:08       ` [PATCH v4 04/10] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
2017-04-17 22:08       ` [PATCH v4 05/10] run-command: prepare child environment before forking Brandon Williams
2017-04-18  0:26         ` Eric Wong
2017-04-18 21:02           ` Brandon Williams
2017-04-17 22:08       ` [PATCH v4 06/10] run-command: don't die in child when duping /dev/null Brandon Williams
2017-04-17 22:08       ` [PATCH v4 07/10] run-command: eliminate calls to error handling functions in child Brandon Williams
2017-04-17 22:08       ` [PATCH v4 08/10] run-command: handle dup2 and close errors " Brandon Williams
2017-04-17 22:08       ` [PATCH v4 09/10] run-command: add note about forking and threading Brandon Williams
2017-04-17 22:08       ` [PATCH v4 10/10] run-command: block signals between fork and execve Brandon Williams
2017-04-18 23:17       ` [PATCH v5 00/11] forking and threading Brandon Williams
2017-04-18 23:17         ` [PATCH v5 01/11] t5550: use write_script to generate post-update hook Brandon Williams
2017-04-18 23:17         ` [PATCH v5 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
2017-04-19  5:43           ` Johannes Sixt
2017-04-19  6:21             ` Johannes Sixt
2017-04-19 15:56               ` Brandon Williams
2017-04-19 18:18                 ` Johannes Sixt
2017-04-20 10:47                 ` Johannes Schindelin
2017-04-20 17:02                   ` Brandon Williams
2017-04-20 20:24                     ` Johannes Schindelin
2017-04-20 20:49                       ` Brandon Williams
2017-04-18 23:17         ` [PATCH v5 03/11] run-command: prepare command before forking Brandon Williams
2017-04-18 23:17         ` [PATCH v5 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
2017-04-18 23:17         ` [PATCH v5 05/11] string-list: add string_list_remove function Brandon Williams
2017-04-18 23:31           ` Stefan Beller
2017-04-18 23:36             ` Brandon Williams
2017-04-18 23:40               ` Stefan Beller
2017-04-18 23:18         ` [PATCH v5 06/11] run-command: prepare child environment before forking Brandon Williams
2017-04-18 23:18         ` [PATCH v5 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
2017-04-18 23:18         ` [PATCH v5 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
2017-04-18 23:18         ` [PATCH v5 09/11] run-command: handle dup2 and close errors " Brandon Williams
2017-04-18 23:18         ` [PATCH v5 10/11] run-command: add note about forking and threading Brandon Williams
2017-04-18 23:18         ` [PATCH v5 11/11] run-command: block signals between fork and execve Brandon Williams
2017-04-19  6:00           ` Johannes Sixt
2017-04-19  7:48             ` Eric Wong
2017-04-19 16:10               ` Brandon Williams
2017-04-19 23:13         ` [PATCH v6 00/11] forking and threading Brandon Williams
2017-04-19 23:13           ` [PATCH v6 01/11] t5550: use write_script to generate post-update hook Brandon Williams
2017-04-19 23:13           ` [PATCH v6 02/11] t0061: run_command executes scripts without a #! line Brandon Williams
2017-04-20 10:49             ` Johannes Schindelin
2017-04-20 16:58               ` Brandon Williams
2017-04-19 23:13           ` [PATCH v6 03/11] run-command: prepare command before forking Brandon Williams
2017-04-19 23:13           ` [PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp Brandon Williams
2017-05-17  2:15             ` Junio C Hamano
2017-05-17  2:26               ` Jeff King
2017-05-17  2:28                 ` Jeff King
2017-05-17  3:41                 ` Junio C Hamano
2017-05-17 14:52                 ` Brandon Williams
2017-04-19 23:13           ` [PATCH v6 05/11] string-list: add string_list_remove function Brandon Williams
2017-04-19 23:13           ` [PATCH v6 06/11] run-command: prepare child environment before forking Brandon Williams
2017-04-19 23:13           ` [PATCH v6 07/11] run-command: don't die in child when duping /dev/null Brandon Williams
2017-04-19 23:13           ` [PATCH v6 08/11] run-command: eliminate calls to error handling functions in child Brandon Williams
2017-04-19 23:13           ` [PATCH v6 09/11] run-command: handle dup2 and close errors " Brandon Williams
2017-04-19 23:13           ` [PATCH v6 10/11] run-command: add note about forking and threading Brandon Williams
2017-04-19 23:13           ` [PATCH v6 11/11] run-command: block signals between fork and execve Brandon Williams
2017-04-24 22:37           ` [PATCH v6 00/11] forking and threading Brandon Williams
2017-04-24 23:50             ` [PATCH v6 12/11] run-command: don't try to execute directories Brandon Williams
2017-04-25  0:17               ` Jonathan Nieder
2017-04-25  1:58                 ` Junio C Hamano
2017-04-25  2:51                   ` Jonathan Nieder
2017-04-25  2:56                 ` Jeff King
2017-04-25  1:47               ` Junio C Hamano
2017-04-25  2:57               ` Jonathan Nieder
2017-04-25 17:54               ` [PATCH v7 1/2] exec_cmd: expose is_executable function Brandon Williams
2017-04-25 17:54                 ` [PATCH v7 2/2] run-command: don't try to execute directories Brandon Williams
2017-04-25 18:51                   ` Jonathan Nieder
2017-04-25 19:32                     ` Brandon Williams
2017-04-25 18:04                 ` [PATCH v7 1/2] exec_cmd: expose is_executable function Jonathan Nieder
2017-04-25 18:18                 ` Johannes Sixt
2017-04-25 18:38                   ` Brandon Williams
2017-04-25 23:46                 ` [PATCH v8 1/2] run-command: " Brandon Williams
2017-04-25 23:47                   ` [PATCH v8 2/2] run-command: restrict PATH search to executable files Brandon Williams
2017-04-25 23:50                     ` Jonathan Nieder
2017-04-26  1:44                     ` Junio C Hamano
2017-04-26 17:10                       ` [PATCH v9 " Brandon Williams
2017-04-27  0:33                         ` Junio C Hamano
2017-04-25 23:48                   ` [PATCH v8 1/2] run-command: expose is_executable function Jonathan Nieder

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox