* git rebase regression: cannot pass a shell expression directly to --exec @ 2017-05-15 18:08 Eric Rannaud 2017-05-16 3:25 ` Jeff King 2017-05-16 10:23 ` Johannes Schindelin 0 siblings, 2 replies; 29+ messages in thread From: Eric Rannaud @ 2017-05-15 18:08 UTC (permalink / raw) To: git, Johannes Schindelin; +Cc: Jeremy Serror Hi all, It used to be possible to run a sequence like: foo() { echo X; } export -f foo git rebase --exec foo HEAD~10 Since upgrading to 2.13.0, I had to update my scripts to run: git rebase --exec "bash -c foo" HEAD~10 I'm not sure if this was an intended change. Bisecting with the following script: #!/usr/bin/env bash make -j8 || exit 3 function foo() { echo OK } export -f foo pushd tmp ../git --exec-path=.. rebase --exec foo HEAD^^ ret=$? # Cleanup if failure ../git --exec-path=.. rebase --abort &> /dev/null popd exit $ret It points to this commit: commit 18633e1a22a68bbe8e6311a1039d13ebbf6fd041 (refs/bisect/bad) Author: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu Feb 9 23:23:11 2017 +0100 rebase -i: use the rebase--helper builtin Now that the sequencer learned to process a "normal" interactive rebase, we use it. The original shell script is still used for "non-normal" interactive rebases, i.e. when --root or --preserve-merges was passed. Please note that the --root option (via the $squash_onto variable) needs special handling only for the very first command, hence it is still okay to use the helper upon continue/skip. Also please note that the --no-ff setting is volatile, i.e. when the interactive rebase is interrupted at any stage, there is no record of it. Therefore, we have to pass it from the shell script to the rebase--helper. Note: the test t3404 had to be adjusted because the the error messages produced by the sequencer comply with our current convention to start with a lower-case letter. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> Thanks, Eric ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-15 18:08 git rebase regression: cannot pass a shell expression directly to --exec Eric Rannaud @ 2017-05-16 3:25 ` Jeff King 2017-05-16 3:37 ` Jeff King 2017-05-16 3:40 ` Junio C Hamano 2017-05-16 10:23 ` Johannes Schindelin 1 sibling, 2 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 3:25 UTC (permalink / raw) To: Eric Rannaud; +Cc: git, Johannes Schindelin, Jeremy Serror On Mon, May 15, 2017 at 11:08:39AM -0700, Eric Rannaud wrote: > It used to be possible to run a sequence like: > > foo() { echo X; } > export -f foo > git rebase --exec foo HEAD~10 > > Since upgrading to 2.13.0, I had to update my scripts to run: > > git rebase --exec "bash -c foo" HEAD~10 Interesting. The "exec" string is still run with a shell. E.g.: $ git rebase --exec 'for i in 1 2 3; do echo >&2 $i; done' HEAD~5 Executing: for i in 1 2 3; do echo >&2 $i; done 1 2 3 Executing: for i in 1 2 3; do echo >&2 $i; done 1 2 3 [...and so on...] I wonder if this is falling afoul of the optimization in run-command's prepare_shell_cmd() to skip shell invocations for "simple" commands. E.g., if I do: strace -fe execve git rebase -x foo HEAD^ 2>&1 | grep foo I see: ... Executing: foo [pid 21820] execve("foo", ["foo"], [/* 57 vars */]) = -1 ENOENT (No such file or directory) fatal: cannot run foo: No such file or directory But if I were to add a shell meta-character, like this: strace -fe execve git rebase -x 'foo;' HEAD^ 2>&1 | grep foo then I see: ... Executing: foo; [pid 22569] execve("/bin/sh", ["/bin/sh", "-c", "foo;", "foo;"], [/* 57 vars */]) = 0 foo;: 1: foo;: foo: not found So I suspect if you added an extraneous semi-colon, your case would work again (and that would confirm for us that this is indeed the problem). > I'm not sure if this was an intended change. Bisecting with the > following script: > [...] > It points to this commit: > > commit 18633e1a22a68bbe8e6311a1039d13ebbf6fd041 (refs/bisect/bad) > Author: Johannes Schindelin <johannes.schindelin@gmx.de> > Date: Thu Feb 9 23:23:11 2017 +0100 > > rebase -i: use the rebase--helper builtin The optimization in run-command is very old, but the switch to rebase--helper presumably moved us from doing that exec in the actual shell script to doing it via the C code. Which means your exported-function technique has been broken for _most_ of Git all along, but it just now affected this particular spot. I'm not sure how to feel about it. In the face of exported functions, we can never do the shell-skipping optimization, because we don't know how the shell is going to interpret even a simple-looking command. And it is kind of a neat trick. But I also hate to add extra useless shell invocations for the predominantly common case that people aren't using this trick (or aren't even using a shell that supports function exports). One hack would be to look for BASH_FUNC_* in the environment and disable the optimization in that case. I think that would make your case Just Work. It doesn't help other oddball cases, like: - you're trying to run a shell builtin that behaves differently than its exec-able counterpart - your shell has some other mechanism for defining commands that we would not find via exec. I don't know of one offhand. Obviously $ENV could point to a file which defines some, but for most shells would not read any startup files for a non-interactive "sh -c" invocation. -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:25 ` Jeff King @ 2017-05-16 3:37 ` Jeff King 2017-05-16 16:41 ` Jonathan Nieder 2017-05-16 3:40 ` Junio C Hamano 1 sibling, 1 reply; 29+ messages in thread From: Jeff King @ 2017-05-16 3:37 UTC (permalink / raw) To: Eric Rannaud; +Cc: git, Johannes Schindelin, Jeremy Serror On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote: > One hack would be to look for BASH_FUNC_* in the environment and disable > the optimization in that case. I think that would make your case Just > Work. It doesn't help other oddball cases, like: > > - you're trying to run a shell builtin that behaves differently than > its exec-able counterpart > > - your shell has some other mechanism for defining commands that we > would not find via exec. I don't know of one offhand. Obviously $ENV > could point to a file which defines some, but for most shells would > not read any startup files for a non-interactive "sh -c" invocation. So I was thinking something like the patch below, though I guess technically you could look for BASH_FUNC_$argv[0]%%, which seems to be bash's magic variable name. I hate to get too intimate with those details, though. Another option is to speculatively run "foo" without the shell, and if execve fails to find it, then fall back to running the shell. That would catch any number of cases where the shell "somehow" finds a command that we can't. You'd still have confusing behavior if your shell builtin behaved differently than the exec-able version, though (because we'd quietly use the exec-able one), but I would imagine that's exceedingly rare. I dunno. Maybe the whole thing is a fool's errand. diff --git a/run-command.c b/run-command.c index 1c02bfb2e..8328d27fb 100644 --- a/run-command.c +++ b/run-command.c @@ -240,12 +240,24 @@ int sane_execvp(const char *file, char * const argv[]) return -1; } +static int env_has_bash_functions(void) +{ + extern char **environ; + char **key; + + for (key = environ; *key; key++) + if (starts_with(*key, "BASH_FUNC_")) + return 1; + return 0; +} + static const char **prepare_shell_cmd(struct argv_array *out, const char **argv) { if (!argv[0]) die("BUG: shell command is empty"); - if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) { + if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0]) || + env_has_bash_functions()) { #ifndef GIT_WINDOWS_NATIVE argv_array_push(out, SHELL_PATH); #else ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:37 ` Jeff King @ 2017-05-16 16:41 ` Jonathan Nieder 2017-05-16 16:47 ` Jeff King 0 siblings, 1 reply; 29+ messages in thread From: Jonathan Nieder @ 2017-05-16 16:41 UTC (permalink / raw) To: Jeff King Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror, Brandon Williams (+Brandon Williams, who may have more context for execvp-related things) Hi, Jeff King wrote: > On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote: >> One hack would be to look for BASH_FUNC_* in the environment and disable >> the optimization in that case. I think that would make your case Just >> Work. It doesn't help other oddball cases, like: >> >> - you're trying to run a shell builtin that behaves differently than >> its exec-able counterpart >> >> - your shell has some other mechanism for defining commands that we >> would not find via exec. I don't know of one offhand. Obviously $ENV >> could point to a file which defines some, but for most shells would >> not read any startup files for a non-interactive "sh -c" invocation. > > So I was thinking something like the patch below, though I guess > technically you could look for BASH_FUNC_$argv[0]%%, which seems to be > bash's magic variable name. I hate to get too intimate with those > details, though. > > Another option is to speculatively run "foo" without the shell, and if > execve fails to find it, then fall back to running the shell. That would > catch any number of cases where the shell "somehow" finds a command that > we can't. Hm. execvp explicitly does this when it hits ENOEXEC, but not for ENOENT. Do you know why that is? I think we want to behave consistently for shell builtins and for exported functions --- they are different sides of the same problem, and behaving differently between the two feels confusing. Thanks, Jonathan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:41 ` Jonathan Nieder @ 2017-05-16 16:47 ` Jeff King 2017-05-16 17:11 ` Eric Rannaud ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 16:47 UTC (permalink / raw) To: Jonathan Nieder Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror, Brandon Williams On Tue, May 16, 2017 at 09:41:24AM -0700, Jonathan Nieder wrote: > Jeff King wrote: > > On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote: > > >> One hack would be to look for BASH_FUNC_* in the environment and disable > >> the optimization in that case. I think that would make your case Just > >> Work. It doesn't help other oddball cases, like: > >> > >> - you're trying to run a shell builtin that behaves differently than > >> its exec-able counterpart > >> > >> - your shell has some other mechanism for defining commands that we > >> would not find via exec. I don't know of one offhand. Obviously $ENV > >> could point to a file which defines some, but for most shells would > >> not read any startup files for a non-interactive "sh -c" invocation. > > > > So I was thinking something like the patch below, though I guess > > technically you could look for BASH_FUNC_$argv[0]%%, which seems to be > > bash's magic variable name. I hate to get too intimate with those > > details, though. > > > > Another option is to speculatively run "foo" without the shell, and if > > execve fails to find it, then fall back to running the shell. That would > > catch any number of cases where the shell "somehow" finds a command that > > we can't. > > Hm. execvp explicitly does this when it hits ENOEXEC, but not for > ENOENT. Do you know why that is? When execvp(foo) falls back on ENOEXEC, it is not running "sh -c foo". It is actually running "sh foo" to run the script contents. So it's about letting you do: echo "no shebang line" >script chmod +x script ./script And nothing to do with shell builtins. > I think we want to behave consistently for shell builtins and for > exported functions --- they are different sides of the same problem, > and behaving differently between the two feels confusing. Yes, I think ideally we'd handle it all transparently. Although I'd also point out that Git the behavior under discussion dates back to 2009 and this is (as far as I can recall) the first report. So documenting the current behavior might not be that bad an option. -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:47 ` Jeff King @ 2017-05-16 17:11 ` Eric Rannaud 2017-05-16 17:15 ` Brandon Williams 2017-05-16 17:37 ` Jonathan Nieder 2 siblings, 0 replies; 29+ messages in thread From: Eric Rannaud @ 2017-05-16 17:11 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, git, Johannes Schindelin, Jeremy Serror, Brandon Williams On Tue, May 16, 2017 at 9:47 AM, Jeff King <peff@peff.net> wrote: >> I think we want to behave consistently for shell builtins and for >> exported functions --- they are different sides of the same problem, >> and behaving differently between the two feels confusing. > > Yes, I think ideally we'd handle it all transparently. Although I'd also > point out that Git the behavior under discussion dates back to 2009 and > this is (as far as I can recall) the first report. So documenting the > current behavior might not be that bad an option. This is on Arch Linux which is usually pretty aggressive with their Git upgrades and I first saw the problem this week. The script that hits this case runs constantly on my machine so my guess is that if anyone relies on the old behavior, they're just starting to see a new-enough Git to have a problem. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:47 ` Jeff King 2017-05-16 17:11 ` Eric Rannaud @ 2017-05-16 17:15 ` Brandon Williams 2017-05-16 17:23 ` Jeff King 2017-05-16 17:37 ` Jonathan Nieder 2 siblings, 1 reply; 29+ messages in thread From: Brandon Williams @ 2017-05-16 17:15 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Eric Rannaud, git, Johannes Schindelin, Jeremy Serror On 05/16, Jeff King wrote: > On Tue, May 16, 2017 at 09:41:24AM -0700, Jonathan Nieder wrote: > > > Jeff King wrote: > > > On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote: > > > > >> One hack would be to look for BASH_FUNC_* in the environment and disable > > >> the optimization in that case. I think that would make your case Just > > >> Work. It doesn't help other oddball cases, like: > > >> > > >> - you're trying to run a shell builtin that behaves differently than > > >> its exec-able counterpart > > >> > > >> - your shell has some other mechanism for defining commands that we > > >> would not find via exec. I don't know of one offhand. Obviously $ENV > > >> could point to a file which defines some, but for most shells would > > >> not read any startup files for a non-interactive "sh -c" invocation. > > > > > > So I was thinking something like the patch below, though I guess > > > technically you could look for BASH_FUNC_$argv[0]%%, which seems to be > > > bash's magic variable name. I hate to get too intimate with those > > > details, though. One concern with that is what about all other shells that are not BASH? I'm sure they use a different env var for storing functions so we may need to handle other shell's too? That is assuming we want to keep the old behavior. > > > > > > Another option is to speculatively run "foo" without the shell, and if > > > execve fails to find it, then fall back to running the shell. That would > > > catch any number of cases where the shell "somehow" finds a command that > > > we can't. > > > > Hm. execvp explicitly does this when it hits ENOEXEC, but not for > > ENOENT. Do you know why that is? > > When execvp(foo) falls back on ENOEXEC, it is not running "sh -c foo". > It is actually running "sh foo" to run the script contents. So it's > about letting you do: > > echo "no shebang line" >script > chmod +x script > ./script > > And nothing to do with shell builtins. That's correct, and is the exact behavior I was trying to mimic with the changes to run_command. 1. try executing the command. 2. if it fails with ENOEXEC then attempt to execute it with a shell > > > I think we want to behave consistently for shell builtins and for > > exported functions --- they are different sides of the same problem, > > and behaving differently between the two feels confusing. > > Yes, I think ideally we'd handle it all transparently. Although I'd also > point out that Git the behavior under discussion dates back to 2009 and > this is (as far as I can recall) the first report. So documenting the > current behavior might not be that bad an option. > > -Peff -- Brandon Williams ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:15 ` Brandon Williams @ 2017-05-16 17:23 ` Jeff King 2017-05-16 17:30 ` Brandon Williams 2017-05-16 19:14 ` Linus Torvalds 0 siblings, 2 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 17:23 UTC (permalink / raw) To: Brandon Williams Cc: Jonathan Nieder, Eric Rannaud, git, Johannes Schindelin, Jeremy Serror On Tue, May 16, 2017 at 10:15:40AM -0700, Brandon Williams wrote: > > > > So I was thinking something like the patch below, though I guess > > > > technically you could look for BASH_FUNC_$argv[0]%%, which seems to be > > > > bash's magic variable name. I hate to get too intimate with those > > > > details, though. > > One concern with that is what about all other shells that are not BASH? > I'm sure they use a different env var for storing functions so we may > need to handle other shell's too? That is assuming we want to keep the > old behavior. Most other shells don't do function-exporting at all. Certainly dash and most traditional bourne shells don't. I wouldn't be surprised if zsh does. But yeah, we'd have to support them one by one (and possibly variants across different versions of each shell). Workable, but gross. > > When execvp(foo) falls back on ENOEXEC, it is not running "sh -c foo". > > It is actually running "sh foo" to run the script contents. So it's > > about letting you do: > > > > echo "no shebang line" >script > > chmod +x script > > ./script > > > > And nothing to do with shell builtins. > > That's correct, and is the exact behavior I was trying to mimic with the > changes to run_command. > 1. try executing the command. > 2. if it fails with ENOEXEC then attempt to execute it with a shell I think the logic here would be more like: 1. During prepare_shell_cmd(), even if we optimize out the shell call, still prepare a fallback argv (since we can't allocate memory post-fork). 2. In the forked child, if we get ENOENT from exec and cmd->use_shell is set, then exec the fallback shell argv instead. Propagate its results, even if it's 127. That still means we'd prefer a $PATH copy of a command to its shell builtin variant, but that can't be helped (and I kind of doubt anybody would care too much). -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:23 ` Jeff King @ 2017-05-16 17:30 ` Brandon Williams 2017-05-16 19:14 ` Linus Torvalds 1 sibling, 0 replies; 29+ messages in thread From: Brandon Williams @ 2017-05-16 17:30 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Eric Rannaud, git, Johannes Schindelin, Jeremy Serror On 05/16, Jeff King wrote: > On Tue, May 16, 2017 at 10:15:40AM -0700, Brandon Williams wrote: > > > > > > So I was thinking something like the patch below, though I guess > > > > > technically you could look for BASH_FUNC_$argv[0]%%, which seems to be > > > > > bash's magic variable name. I hate to get too intimate with those > > > > > details, though. > > > > One concern with that is what about all other shells that are not BASH? > > I'm sure they use a different env var for storing functions so we may > > need to handle other shell's too? That is assuming we want to keep the > > old behavior. > > Most other shells don't do function-exporting at all. Certainly dash and > most traditional bourne shells don't. I wouldn't be surprised if zsh > does. But yeah, we'd have to support them one by one (and possibly > variants across different versions of each shell). Workable, but gross. > > > > When execvp(foo) falls back on ENOEXEC, it is not running "sh -c foo". > > > It is actually running "sh foo" to run the script contents. So it's > > > about letting you do: > > > > > > echo "no shebang line" >script > > > chmod +x script > > > ./script > > > > > > And nothing to do with shell builtins. > > > > That's correct, and is the exact behavior I was trying to mimic with the > > changes to run_command. > > 1. try executing the command. > > 2. if it fails with ENOEXEC then attempt to execute it with a shell > > I think the logic here would be more like: Oh yes, I was just saying what I did not taking into account how we would solve this issue. > > 1. During prepare_shell_cmd(), even if we optimize out the shell call, > still prepare a fallback argv (since we can't allocate memory > post-fork). > > 2. In the forked child, if we get ENOENT from exec and cmd->use_shell > is set, then exec the fallback shell argv instead. Propagate its > results, even if it's 127. > > That still means we'd prefer a $PATH copy of a command to its shell > builtin variant, but that can't be helped (and I kind of doubt anybody > would care too much). > > -Peff -- Brandon Williams ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:23 ` Jeff King 2017-05-16 17:30 ` Brandon Williams @ 2017-05-16 19:14 ` Linus Torvalds 2017-05-16 19:35 ` [TANGENT] run-command: use vfork instead of fork Eric Wong 2017-05-16 20:12 ` git rebase regression: cannot pass a shell expression directly to --exec Johannes Schindelin 1 sibling, 2 replies; 29+ messages in thread From: Linus Torvalds @ 2017-05-16 19:14 UTC (permalink / raw) To: Jeff King Cc: Brandon Williams, Jonathan Nieder, Eric Rannaud, Git Mailing List, Johannes Schindelin, Jeremy Serror On Tue, May 16, 2017 at 10:23 AM, Jeff King <peff@peff.net> wrote: > > I think the logic here would be more like: > > 1. During prepare_shell_cmd(), even if we optimize out the shell call, > still prepare a fallback argv (since we can't allocate memory > post-fork). > > 2. In the forked child, if we get ENOENT from exec and cmd->use_shell > is set, then exec the fallback shell argv instead. Propagate its > results, even if it's 127. > > That still means we'd prefer a $PATH copy of a command to its shell > builtin variant, but that can't be helped (and I kind of doubt anybody > would care too much). I think it would be better to just (a) get rid of the magic strcspn() entirely (b) make the 'can we optimize this' test be simply just looking up 'argv[0]' in $PATH Hmm? If you have executables with special characters in them in your PATH, you have bigger issues. Also, if people really want to optimize the code that executes an external program (whether in shell or directly), I think it might be worth it to look at replacing the "fork()" with a "vfork()". Something like this - cmd->pid = fork(); + cmd->pid = (cmd->git_cmd || cmd->env) ? fork() : vfork(); might work (the native git_cmd case needs a real fork, and if we change the environment variables we need it too, but the other cases look like they might work with vfork()). Using vfork() can be hugely more efficient, because you don't have the extra page table copies and teardown, but also avoid a lot of possible copy-on-write faults. Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* [TANGENT] run-command: use vfork instead of fork 2017-05-16 19:14 ` Linus Torvalds @ 2017-05-16 19:35 ` Eric Wong 2017-05-16 20:47 ` Linus Torvalds 2017-05-16 20:12 ` git rebase regression: cannot pass a shell expression directly to --exec Johannes Schindelin 1 sibling, 1 reply; 29+ messages in thread From: Eric Wong @ 2017-05-16 19:35 UTC (permalink / raw) To: Linus Torvalds Cc: Jeff King, Brandon Williams, Jonathan Nieder, Eric Rannaud, Git Mailing List, Johannes Schindelin, Jeremy Serror Linus Torvalds <torvalds@linux-foundation.org> wrote: > Also, if people really want to optimize the code that executes an > external program (whether in shell or directly), I think it might be > worth it to look at replacing the "fork()" with a "vfork()". > > Something like this > > - cmd->pid = fork(); > + cmd->pid = (cmd->git_cmd || cmd->env) ? fork() : vfork(); > > might work (the native git_cmd case needs a real fork, and if we > change the environment variables we need it too, but the other cases > look like they might work with vfork()). > > Using vfork() can be hugely more efficient, because you don't have the > extra page table copies and teardown, but also avoid a lot of possible > copy-on-write faults. Fwiw, most of the vfork preparation was already done by Brandon and myself a few weeks ago, and cooking in pu. I think only the patch below would be needed to enable vfork (along with any build-time detection) However, I haven't noticed enough forking in git to make a difference (but maybe others do). I think it would make a bigger difference if such changes were made to bash, dash, make, and perl5. --------8<------------ Subject: [PATCH] run-command: use vfork instead of fork To enable vfork, we merely have to avoid modifying memory we share with the parent, so the guard functions `child_(error|warn|die)_fn` can now be disabled. FIXME: still missing autoconf + Makefile portability tweaks. Signed-off-by: Eric Wong <e@80x24.org> --- run-command.c | 28 +--------------------------- 1 file changed, 1 insertion(+), 27 deletions(-) diff --git a/run-command.c b/run-command.c index 9e36151bf9..0292dd94b6 100644 --- a/run-command.c +++ b/run-command.c @@ -324,25 +324,6 @@ static void fake_fatal(const char *err, va_list params) vreportf("fatal: ", err, params); } -static void child_error_fn(const char *err, va_list params) -{ - const char msg[] = "error() should not be called in child\n"; - xwrite(2, msg, sizeof(msg) - 1); -} - -static void child_warn_fn(const char *err, va_list params) -{ - const char msg[] = "warn() should not be called in child\n"; - xwrite(2, msg, sizeof(msg) - 1); -} - -static void NORETURN child_die_fn(const char *err, va_list params) -{ - const char msg[] = "die() should not be called in child\n"; - xwrite(2, msg, sizeof(msg) - 1); - _exit(2); -} - /* this runs in the parent process */ static void child_err_spew(struct child_process *cmd, struct child_err *cerr) { @@ -658,17 +639,10 @@ int start_command(struct child_process *cmd) * never be released in the child process. This means only * Async-Signal-Safe functions are permitted in the child. */ - cmd->pid = fork(); + cmd->pid = vfork(); failed_errno = errno; if (!cmd->pid) { int sig; - /* - * Ensure the default die/error/warn routines do not get - * called, they can take stdio locks and malloc. - */ - set_die_routine(child_die_fn); - set_error_routine(child_error_fn); - set_warn_routine(child_warn_fn); close(notify_pipe[0]); set_cloexec(notify_pipe[1]); -- Also fetchable: git://bogomips.org/git-svn vfork-test commit 5f88d79182aaabc5ea467d1d29e13e45bd2b99bf ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [TANGENT] run-command: use vfork instead of fork 2017-05-16 19:35 ` [TANGENT] run-command: use vfork instead of fork Eric Wong @ 2017-05-16 20:47 ` Linus Torvalds 2017-05-16 21:11 ` Brandon Williams 0 siblings, 1 reply; 29+ messages in thread From: Linus Torvalds @ 2017-05-16 20:47 UTC (permalink / raw) To: Eric Wong Cc: Jeff King, Brandon Williams, Jonathan Nieder, Eric Rannaud, Git Mailing List, Johannes Schindelin, Jeremy Serror On Tue, May 16, 2017 at 12:35 PM, Eric Wong <e@80x24.org> wrote: > > Fwiw, most of the vfork preparation was already done by Brandon > and myself a few weeks ago, and cooking in pu. Oh, interesting. Was that done for vfork(), or is it for something else? Some of the changes seem almost overly careful. Is this for prep-work porting to some odd environment that doesn't really have a MMU at all? There's nothing fundamentally wrong with allocating memory after fork(). But yes, it looks like it helps the vfork case. Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [TANGENT] run-command: use vfork instead of fork 2017-05-16 20:47 ` Linus Torvalds @ 2017-05-16 21:11 ` Brandon Williams 0 siblings, 0 replies; 29+ messages in thread From: Brandon Williams @ 2017-05-16 21:11 UTC (permalink / raw) To: Linus Torvalds Cc: Eric Wong, Jeff King, Jonathan Nieder, Eric Rannaud, Git Mailing List, Johannes Schindelin, Jeremy Serror On 05/16, Linus Torvalds wrote: > On Tue, May 16, 2017 at 12:35 PM, Eric Wong <e@80x24.org> wrote: > > > > Fwiw, most of the vfork preparation was already done by Brandon > > and myself a few weeks ago, and cooking in pu. > > Oh, interesting. Was that done for vfork(), or is it for something > else? Some of the changes seem almost overly careful. Is this for > prep-work porting to some odd environment that doesn't really have a > MMU at all? There's nothing fundamentally wrong with allocating memory > after fork(). > > But yes, it looks like it helps the vfork case. > > Linus I started working on the run-command code when I ran into a deadlock in 'git grep --recurse-submodules'. When I added support for submodules to grep I just assumed that launching a process (which atm is unfortunately the only way to work on a submodule) would work in a multi-threaded environment. I was naive and wrong! The deadlock was due to a malloc lock being held by thread 'A' while thread 'b' tried to launch a process. Since that lock was in a locked-state at the time of forking, it remained in a locked-state with no hope of ever being released. So when the child process that thread 'b' spawned tried to malloc a chunk of memory after forking, it deadlocked. I didn't catch this in initial testing because gclib registers atfork_handelers in order to prevent this sort of thing, while libraries like tcmalloc don't do this. So to account for this, I worked to make run-command safe to use in the presence of threads, which had the benefit of also preparing it to be vfork() ready. Ultimately I'd like to drop the requirement to spawn a child process to work on a submodule, but that's going to take a lot more effort. -- Brandon Williams ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 19:14 ` Linus Torvalds 2017-05-16 19:35 ` [TANGENT] run-command: use vfork instead of fork Eric Wong @ 2017-05-16 20:12 ` Johannes Schindelin 2017-05-16 20:27 ` Linus Torvalds 1 sibling, 1 reply; 29+ messages in thread From: Johannes Schindelin @ 2017-05-16 20:12 UTC (permalink / raw) To: Linus Torvalds Cc: Jeff King, Brandon Williams, Jonathan Nieder, Eric Rannaud, Git Mailing List, Jeremy Serror Hi Linus, On Tue, 16 May 2017, Linus Torvalds wrote: > On Tue, May 16, 2017 at 10:23 AM, Jeff King <peff@peff.net> wrote: > > > > I think the logic here would be more like: > > > > 1. During prepare_shell_cmd(), even if we optimize out the shell call, > > still prepare a fallback argv (since we can't allocate memory > > post-fork). > > > > 2. In the forked child, if we get ENOENT from exec and cmd->use_shell > > is set, then exec the fallback shell argv instead. Propagate its > > results, even if it's 127. > > > > That still means we'd prefer a $PATH copy of a command to its shell > > builtin variant, but that can't be helped (and I kind of doubt anybody > > would care too much). > > I think it would be better to just > > (a) get rid of the magic strcspn() entirely > > (b) make the 'can we optimize this' test be simply just looking up > 'argv[0]' in $PATH What about ABC=1 my-executable my-arg Ciao, Dscho ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 20:12 ` git rebase regression: cannot pass a shell expression directly to --exec Johannes Schindelin @ 2017-05-16 20:27 ` Linus Torvalds 0 siblings, 0 replies; 29+ messages in thread From: Linus Torvalds @ 2017-05-16 20:27 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Brandon Williams, Jonathan Nieder, Eric Rannaud, Git Mailing List, Jeremy Serror On Tue, May 16, 2017 at 1:12 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >> >> I think it would be better to just >> >> (a) get rid of the magic strcspn() entirely >> >> (b) make the 'can we optimize this' test be simply just looking up >> 'argv[0]' in $PATH > > What about > > ABC=1 my-executable my-arg What about it? Do you have a binary like '/usr/bin/ABC=1 my-executable my-arg' or something? If so, it gets executed. If not, it would get passed off to "/bin/sh". That sounds a lot more obvious than "some random set of characters are magical". Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:47 ` Jeff King 2017-05-16 17:11 ` Eric Rannaud 2017-05-16 17:15 ` Brandon Williams @ 2017-05-16 17:37 ` Jonathan Nieder 2 siblings, 0 replies; 29+ messages in thread From: Jonathan Nieder @ 2017-05-16 17:37 UTC (permalink / raw) To: Jeff King Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror, Brandon Williams Jeff King wrote: >>> On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote: >>>> One hack would be to look for BASH_FUNC_* in the environment and disable >>>> the optimization in that case. I think that would make your case Just >>>> Work. It doesn't help other oddball cases, like: [...] >> Hm. execvp explicitly does this when it hits ENOEXEC, but not for >> ENOENT. Do you know why that is? > > When execvp(foo) falls back on ENOEXEC, it is not running "sh -c foo". > It is actually running "sh foo" to run the script contents. Oh --- now I get it. This is about the optimization to bypass the shell in prepare_shell_command. It has nothing to do with execvp --- our execvp emulation already does the right thing, and Brandon's patches did not make this problem any worse or better. Unconditionally falling back to sh -c when we get ENOENT in the RUN_USING_SHELL case makes sense to me. Sorry for the confusion, Jonathan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:25 ` Jeff King 2017-05-16 3:37 ` Jeff King @ 2017-05-16 3:40 ` Junio C Hamano 2017-05-16 3:53 ` Jeff King 2017-05-16 10:46 ` Johannes Schindelin 1 sibling, 2 replies; 29+ messages in thread From: Junio C Hamano @ 2017-05-16 3:40 UTC (permalink / raw) To: Jeff King; +Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror Jeff King <peff@peff.net> writes: > Interesting. The "exec" string is still run with a shell. E.g.: > ... > I wonder if this is falling afoul of the optimization in run-command's > prepare_shell_cmd() to skip shell invocations for "simple" commands. > ... > So I suspect if you added an extraneous semi-colon, your case would work > again (and that would confirm for us that this is indeed the problem). Wow, that's a brilliant analysis. > The optimization in run-command is very old, but the switch to > rebase--helper presumably moved us from doing that exec in the actual > shell script to doing it via the C code. > > Which means your exported-function technique has been broken for _most_ > of Git all along, but it just now affected this particular spot. > > I'm not sure how to feel about it. In the face of exported functions, we > can never do the shell-skipping optimization, because we don't know how > the shell is going to interpret even a simple-looking command. And it is > kind of a neat trick. But I also hate to add extra useless shell > invocations for the predominantly common case that people aren't using > this trick (or aren't even using a shell that supports function > exports). I was about to write this off as "an unfortunate regression, a fallout that will likely left unfixed, due to lack of a good practical workaround." The point of rewriting things in C and using run_command() interface was to avoid shell overhead. We are doing an exec already, but adding a shell invocation in the middle will double the number of exec (and probably add an extra fork as well), which probably is measurable on systems with slow fork/exec. The "semicolon" trick is way too obscure, but perhaps can be documented as an escape hatch? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:40 ` Junio C Hamano @ 2017-05-16 3:53 ` Jeff King 2017-05-16 4:08 ` Jeff King 2017-05-16 16:45 ` Eric Rannaud 2017-05-16 10:46 ` Johannes Schindelin 1 sibling, 2 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror On Tue, May 16, 2017 at 12:40:51PM +0900, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Interesting. The "exec" string is still run with a shell. E.g.: > > ... > > I wonder if this is falling afoul of the optimization in run-command's > > prepare_shell_cmd() to skip shell invocations for "simple" commands. > > ... > > So I suspect if you added an extraneous semi-colon, your case would work > > again (and that would confirm for us that this is indeed the problem). > > Wow, that's a brilliant analysis. If it's right. :) It's all theory at this point. My /bin/sh isn't bash, but I should be able to build with SHELL_PATH=/bin/bash to reproduce. But I can't: $ bash $ foo() { echo >&2 "in foo"; } $ export -f foo $ bash -c foo in foo $ strace -f 2>&1 git.compile rebase -x 'foo;' HEAD^ | grep foo Executing: foo; [pid 1788] execve("/bin/bash", ["/bin/bash", "-c", "foo;", "foo;"], [/* 60 vars */] <unfinished ...> foo;: foo: command not found So I'm not sure why the direct "bash -c" can find it, but somehow the variable doesn't make it through to the "bash -c" at the lower level. Replacing "foo;" with "env" shows the environment, but BASH_FUNC_foo isn't set in it. I'm not sure where it's getting eaten, though. > The "semicolon" trick is way too obscure, but perhaps can be > documented as an escape hatch? Yeah, I agree this should be documented if it can't be fixed. I wasn't sure if we were giving up just yet. Either way, though, it wouldn't hurt to mention optimizing out "maybe shell" optimization, because it can occasionally produce user-visible effects. Where would be a good place? In git(1), I guess? It's almost something that could go in gitcli(7), but it's not really about the CLI in particular. In most cases the shell-exec'd commands are from config, but not always (as this case shows). So git-config(1) probably isn't the right place. AFAIK, we don't talk about this behavior at all in the existing documentation. And I really don't know where we'd put it that somebody would find it without having to read the documentation exhaustively. -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:53 ` Jeff King @ 2017-05-16 4:08 ` Jeff King 2017-05-16 16:45 ` Eric Rannaud 1 sibling, 0 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 4:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Rannaud, git, Johannes Schindelin, Jeremy Serror On Mon, May 15, 2017 at 11:53:57PM -0400, Jeff King wrote: > My /bin/sh isn't bash, but I should be able to build with > SHELL_PATH=/bin/bash to reproduce. But I can't: > > $ bash > $ foo() { echo >&2 "in foo"; } > $ export -f foo > $ bash -c foo > in foo > > $ strace -f 2>&1 git.compile rebase -x 'foo;' HEAD^ | grep foo > Executing: foo; > [pid 1788] execve("/bin/bash", ["/bin/bash", "-c", "foo;", "foo;"], [/* 60 vars */] <unfinished ...> > foo;: foo: command not found > > So I'm not sure why the direct "bash -c" can find it, but somehow the > variable doesn't make it through to the "bash -c" at the lower level. > Replacing "foo;" with "env" shows the environment, but BASH_FUNC_foo > isn't set in it. I'm not sure where it's getting eaten, though. Hmph. It looks like "dash" eats it. My "git.compile" above is a symlink to /path/to/git/bin-wrappers/git. But it looks like our Makefile isn't smart enough to rebuild bin-wrappers when you switch SHELL_PATH, so it will had "/bin/sh" in it. Which points to dash on my machine. And indeed, dash seems to wipe the environment: $ foo() { echo >&2 "in foo"; } $ export -f foo $ bash -c foo in foo $ dash -c "bash -c foo" bash: foo: command not found I wonder if it's related to ShellShock protections, or if dash just rejects variable names with the "%" in them or something. Anyway. That explains why I had trouble reproducing. Using bash consistently as my shell lets me reproduce Eric's results. And doing the semicolon trick does make it work again. So I feel confident that I've analyzed the problem correctly, at least. -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:53 ` Jeff King 2017-05-16 4:08 ` Jeff King @ 2017-05-16 16:45 ` Eric Rannaud 1 sibling, 0 replies; 29+ messages in thread From: Eric Rannaud @ 2017-05-16 16:45 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git, Johannes Schindelin, Jeremy Serror On Mon, May 15, 2017 at 8:53 PM, Jeff King <peff@peff.net> wrote: >> > So I suspect if you added an extraneous semi-colon, your case would work >> > again (and that would confirm for us that this is indeed the problem). >> >> Wow, that's a brilliant analysis. > > If it's right. :) It's all theory at this point. > > My /bin/sh isn't bash, but I should be able to build with > SHELL_PATH=/bin/bash to reproduce. But I can't: Just to clarify if there was any doubt, the semicolon trick does indeed work fine when bash is your shell. It would be nice if it were consistent for all git commands that take a <cmd> argument: foo() { echo foo; } export -f foo git bisect start master master^^ git bisect run foo # works git bisect run 'foo;' # doesn't work running foo; /usr/lib/git-core/git-bisect: line 493: foo;: command not found Also, what's a "simple command", exactly? foo() { echo foo; } export -f foo git rebase --exec foo master^^ # fails git rebase --exec 'foo;' master^^ # OK git rebase --exec 'foo 1' master^^ # OK Not sure if this can be made easy to understand in the manpage. Thanks, Eric ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 3:40 ` Junio C Hamano 2017-05-16 3:53 ` Jeff King @ 2017-05-16 10:46 ` Johannes Schindelin 1 sibling, 0 replies; 29+ messages in thread From: Johannes Schindelin @ 2017-05-16 10:46 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Eric Rannaud, git, Jeremy Serror Hi Junio, On Tue, 16 May 2017, Junio C Hamano wrote: > The point of rewriting things in C and using run_command() interface was > to avoid shell overhead. That statement is missing the point. It is true that converting shell scripts to proper builtins avoids the shell overhead. It is even more true that that results in a nice speed-up that can even be felt on Linux (I feel I have to stress this, as you seem to be mostly focused on Linux usage, and I do not want you to get the impression that you are only doing "the Windows folks" a favor when you gracefully accept the patches to convert scripts to proper builtins). But the real truth is: shell scripting is not portable. Shell scripting is never only shell scripting, of course. A quite undocumented set of utilities is expected to be present for our scripts to run, too: sed, awk, tr, cat, expr, just to name a few. It does not end there. For example, sed is not equal to sed. BSD sed has different semantics than GNU sed, and we jump through hoops to try to ensure that our shell scripts run with both versions. Which must make many a contributor feel a lot less positive about their contributions e.g. when a reviewer points out that their \t in their sed statement would break Git on MacOSX. Sad! We place a burden on maintainers targeting platforms that are not Linux. If I remember correctly, we need to override the default shell on one Unix, even, because we simply failed to make our shell scripts run with it. And then, of course, there are all the problems inflicted on users on that most prevalent Operating System of them all: Windows. Yes, I know, you "non-Windows folk" would like to happily ignore that little tiny problem, but that's really doing Git a disservice. So stop doing it. What hurts me most about this is that Subversion, a software that us "Git folk" pride ourselves in making obsolete, never had those problems. It was designed with an eye toward portability from the start. It is a good thing to support users who want to automate their workflows using scripts, of course. Oh, and please, do not forget to remember that there are tons of scripting languages out there, e.g. AppleScript, Javascript, Tcl, Python, Ruby; just because you prefer Bash does not mean the majority of developers do. So let's continue to broaden our scripting support. And at the same time, let's also change our mindset so that we can reduce the portability problems of Git. Ciao, Dscho ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-15 18:08 git rebase regression: cannot pass a shell expression directly to --exec Eric Rannaud 2017-05-16 3:25 ` Jeff King @ 2017-05-16 10:23 ` Johannes Schindelin 2017-05-16 16:18 ` Jeff King 1 sibling, 1 reply; 29+ messages in thread From: Johannes Schindelin @ 2017-05-16 10:23 UTC (permalink / raw) To: Eric Rannaud; +Cc: git, Jeremy Serror Hi Eric, On Mon, 15 May 2017, Eric Rannaud wrote: > It used to be possible to run a sequence like: > > foo() { echo X; } > export -f foo > git rebase --exec foo HEAD~10 It would appear to me that you used a side effect of an implementation detail: that `git rebase -i` was implemented entirely as a shell script. In my mind, this was a fragile assumption, and that is why... > commit 18633e1a22a68bbe8e6311a1039d13ebbf6fd041 (refs/bisect/bad) > Author: Johannes Schindelin <johannes.schindelin@gmx.de> > Date: Thu Feb 9 23:23:11 2017 +0100 > > rebase -i: use the rebase--helper builtin ... this commit, which is a long-overdue start of cleaning up our act, broke your code's assumption. I am afraid that your code placed too much of a dependency on an implementation detail that changed. In short: I think that your fix to your script is correct. Ciao, Johannes ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 10:23 ` Johannes Schindelin @ 2017-05-16 16:18 ` Jeff King 2017-05-16 16:59 ` Eric Rannaud 0 siblings, 1 reply; 29+ messages in thread From: Jeff King @ 2017-05-16 16:18 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Eric Rannaud, git, Jeremy Serror On Tue, May 16, 2017 at 12:23:02PM +0200, Johannes Schindelin wrote: > On Mon, 15 May 2017, Eric Rannaud wrote: > > > It used to be possible to run a sequence like: > > > > foo() { echo X; } > > export -f foo > > git rebase --exec foo HEAD~10 > > It would appear to me that you used a side effect of an implementation > detail: that `git rebase -i` was implemented entirely as a shell script. I don't think that's true at all. He expected the user-provided "--exec" command to be run by a shell, which seems like a reasonable thing for Git to promise (and we already make a similar promise for most user-provided commands that we run). What happens in between, be it shell or C code, doesn't matter, and the conversion away from a shell script in this case only tickled an existing bad interaction between "export -f" and Git's run-command code. See my other replies for the full story. I don't think this has anything in particular to do with git-rebase, though. Our solutions are either: - declare "export -f" as too tricky for our optimization, and teach people about the ";" trick - figure out some workaround/fallback to disable the shell-skipping optimization in this case -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:18 ` Jeff King @ 2017-05-16 16:59 ` Eric Rannaud 2017-05-16 17:14 ` Kevin Daudt 2017-05-16 17:21 ` Eric Rannaud 0 siblings, 2 replies; 29+ messages in thread From: Eric Rannaud @ 2017-05-16 16:59 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 9:18 AM, Jeff King <peff@peff.net> wrote: > On Tue, May 16, 2017 at 12:23:02PM +0200, Johannes Schindelin wrote: >> It would appear to me that you used a side effect of an implementation >> detail: that `git rebase -i` was implemented entirely as a shell script. > > I don't think that's true at all. He expected the user-provided "--exec" > command to be run by a shell, which seems like a reasonable thing for > Git to promise (and we already make a similar promise for most > user-provided commands that we run). What happens in between, be it As a "user", my expectation was simply that the command would be run not just in "a shell", but in *my* shell (or the shell that calls git, maybe). So I don't see any portability question with respect to Git. My script that uses git rebase --exec may not be portable, but that's my problem. When I use "git rebase --exec <cmd>" I'm basically writing a "foreach commit in range { <cmd> }" in my shell. Same idea with git bisect run. A transparent optimization that tries execve() then falls back to the user's shell sounds like a good idea. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:59 ` Eric Rannaud @ 2017-05-16 17:14 ` Kevin Daudt 2017-05-16 17:29 ` Eric Rannaud 2017-05-16 17:21 ` Eric Rannaud 1 sibling, 1 reply; 29+ messages in thread From: Kevin Daudt @ 2017-05-16 17:14 UTC (permalink / raw) To: Eric Rannaud; +Cc: Jeff King, Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 09:59:03AM -0700, Eric Rannaud wrote: > On Tue, May 16, 2017 at 9:18 AM, Jeff King <peff@peff.net> wrote: > > On Tue, May 16, 2017 at 12:23:02PM +0200, Johannes Schindelin wrote: > >> It would appear to me that you used a side effect of an implementation > >> detail: that `git rebase -i` was implemented entirely as a shell script. > > > > I don't think that's true at all. He expected the user-provided "--exec" > > command to be run by a shell, which seems like a reasonable thing for > > Git to promise (and we already make a similar promise for most > > user-provided commands that we run). What happens in between, be it > > As a "user", my expectation was simply that the command would be run > not just in "a shell", but in *my* shell (or the shell that calls git, > maybe). So I don't see any portability question with respect to Git. > My script that uses git rebase --exec may not be portable, but that's > my problem. > > When I use "git rebase --exec <cmd>" I'm basically writing a "foreach > commit in range { <cmd> }" in my shell. Same idea with git bisect run. > > A transparent optimization that tries execve() then falls back to the > user's shell sounds like a good idea. It does not really work that way. Git runs in a separate process that does not have access to your current shell. That's why you need to do 'export -f foo'. If you want git to be able to ecute the foo shell function, git needs to start a _new_ shell process, which reads the environment, recognize the exported function and run that. This is not the same as git executing the command in your shell. Not exported variables would not be available in this function (as it would be in your equivalent). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:14 ` Kevin Daudt @ 2017-05-16 17:29 ` Eric Rannaud 2017-05-16 17:41 ` Jeff King 0 siblings, 1 reply; 29+ messages in thread From: Eric Rannaud @ 2017-05-16 17:29 UTC (permalink / raw) To: Kevin Daudt; +Cc: Jeff King, Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 10:14 AM, Kevin Daudt <me@ikke.info> wrote: >> A transparent optimization that tries execve() then falls back to the >> user's shell sounds like a good idea. > > It does not really work that way. Git runs in a separate process that > does not have access to your current shell. That's why you need to do > 'export -f foo'. > > If you want git to be able to ecute the foo shell function, git needs to > start a _new_ shell process, which reads the environment, recognize the > exported function and run that. > > This is not the same as git executing the command in your shell. Not > exported variables would not be available in this function (as it would > be in your equivalent). I'm sorry, I didn't mean (or say) "my shell process". Indeed, it doesn't work that way. And to be clear, there is no problem with having to "export -f foo". The only question is how should git run the <cmd> passed to --exec: should it run directly or using a shell? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:29 ` Eric Rannaud @ 2017-05-16 17:41 ` Jeff King 0 siblings, 0 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 17:41 UTC (permalink / raw) To: Eric Rannaud; +Cc: Kevin Daudt, Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 10:29:31AM -0700, Eric Rannaud wrote: > > It does not really work that way. Git runs in a separate process that > > does not have access to your current shell. That's why you need to do > > 'export -f foo'. > > > > If you want git to be able to ecute the foo shell function, git needs to > > start a _new_ shell process, which reads the environment, recognize the > > exported function and run that. > > > > This is not the same as git executing the command in your shell. Not > > exported variables would not be available in this function (as it would > > be in your equivalent). > > I'm sorry, I didn't mean (or say) "my shell process". Indeed, it > doesn't work that way. And to be clear, there is no problem with > having to "export -f foo". The only question is how should git run the > <cmd> passed to --exec: should it run directly or using a shell? It's definitely "using a shell". Most things Git runs on your behalf behave the same, but there are some exceptions due to historical warts (that would break backwards compatibility if we switched them). E.g., $GIT_SSH does not use the shell, but we introduced GIT_SSH_COMMAND as an alternative which does use the shell. But note that "a shell" may not be necessarily be your login shell. We always execute "/bin/sh -c" (and you can override the shell path at build time). So your "export -f" trick only works because your /bin/sh is a symlink to bash (or because you specifically built with the SHELL_PATH knob pointed at bash). -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 16:59 ` Eric Rannaud 2017-05-16 17:14 ` Kevin Daudt @ 2017-05-16 17:21 ` Eric Rannaud 2017-05-16 17:37 ` Jeff King 1 sibling, 1 reply; 29+ messages in thread From: Eric Rannaud @ 2017-05-16 17:21 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 9:59 AM, Eric Rannaud <eric.rannaud@gmail.com> wrote: > When I use "git rebase --exec <cmd>" I'm basically writing a "foreach > commit in range { <cmd> }" in my shell. Same idea with git bisect run. > > A transparent optimization that tries execve() then falls back to the > user's shell sounds like a good idea. One issue with the execve-else-shell optimization is that sometimes a binary exists that will shadow an exported function or a shell builtin: git rebase --exec true master^^ # OK but in fact this runs /usr/bin/true In this case it doesn't really matter. If the optimization is only applied to "simple commands" (i.e. no arguments), then it's probably OK. I can't think of problematic cases. Except weird things like: $ git rebase --exec time master^ Executing: time Usage: time [-apvV] [-f format] [-o file] [--append] [--verbose] [--portability] [--format=format] [--output=file] [--version] [--help] command [arg...] warning: execution failed: time /usr/bin/time requires an argument. Even though the bash builtin time runs fine without argument. $ time real 0m0.000s user 0m0.000s sys 0m0.000s But if the optimization is applied to more complex commands, then we will have problems. For instance, the builtin echo supports \E, but /usr/bin/echo doesn't support it. In any case, the manpage says --exec <cmd> and "<cmd> will be interpreted as one or more shell commands.", it doesn't say "--exec <executable>". ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: git rebase regression: cannot pass a shell expression directly to --exec 2017-05-16 17:21 ` Eric Rannaud @ 2017-05-16 17:37 ` Jeff King 0 siblings, 0 replies; 29+ messages in thread From: Jeff King @ 2017-05-16 17:37 UTC (permalink / raw) To: Eric Rannaud; +Cc: Johannes Schindelin, git, Jeremy Serror On Tue, May 16, 2017 at 10:21:51AM -0700, Eric Rannaud wrote: > On Tue, May 16, 2017 at 9:59 AM, Eric Rannaud <eric.rannaud@gmail.com> wrote: > > When I use "git rebase --exec <cmd>" I'm basically writing a "foreach > > commit in range { <cmd> }" in my shell. Same idea with git bisect run. > > > > A transparent optimization that tries execve() then falls back to the > > user's shell sounds like a good idea. > > One issue with the execve-else-shell optimization is that sometimes a > binary exists that will shadow an exported function or a shell > builtin: > > git rebase --exec true master^^ # OK but in fact this runs /usr/bin/true Yeah, this is the builtin thing I mentioned elsewhere. I think it's pretty rare to run a builtin with no arguments and care about the behavior differences. > /usr/bin/time requires an argument. Even though the bash builtin time > runs fine without argument. > > $ time > > real 0m0.000s > user 0m0.000s > sys 0m0.000s I've run into the "time" distinction even when running things from the shell, because the time builtin is special. E.g.: $ time true real 0m0.000s user 0m0.000s sys 0m0.000s $ (echo foo | time true) 2>&1 0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 1136maxresident)k 0inputs+0outputs (0major+61minor)pagefaults 0swaps So to some degree, depending on builtins versus external commands (especially when you're round-tripping through another program running a second shell) is going to have some surprises. > But if the optimization is applied to more complex commands, then we > will have problems. For instance, the builtin echo supports \E, but > /usr/bin/echo doesn't support it. No, it shouldn't. If any of |&;<>()$`\\\"' \t\n*?[#~=% appear in the string, we always run the shell. > In any case, the manpage says --exec <cmd> and "<cmd> will be > interpreted as one or more shell commands.", it doesn't say "--exec > <executable>". Right, it's clearly supposed to use the shell, or behave as if the shell were invoked (within reason). -Peff ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2017-05-16 21:11 UTC | newest] Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-05-15 18:08 git rebase regression: cannot pass a shell expression directly to --exec Eric Rannaud 2017-05-16 3:25 ` Jeff King 2017-05-16 3:37 ` Jeff King 2017-05-16 16:41 ` Jonathan Nieder 2017-05-16 16:47 ` Jeff King 2017-05-16 17:11 ` Eric Rannaud 2017-05-16 17:15 ` Brandon Williams 2017-05-16 17:23 ` Jeff King 2017-05-16 17:30 ` Brandon Williams 2017-05-16 19:14 ` Linus Torvalds 2017-05-16 19:35 ` [TANGENT] run-command: use vfork instead of fork Eric Wong 2017-05-16 20:47 ` Linus Torvalds 2017-05-16 21:11 ` Brandon Williams 2017-05-16 20:12 ` git rebase regression: cannot pass a shell expression directly to --exec Johannes Schindelin 2017-05-16 20:27 ` Linus Torvalds 2017-05-16 17:37 ` Jonathan Nieder 2017-05-16 3:40 ` Junio C Hamano 2017-05-16 3:53 ` Jeff King 2017-05-16 4:08 ` Jeff King 2017-05-16 16:45 ` Eric Rannaud 2017-05-16 10:46 ` Johannes Schindelin 2017-05-16 10:23 ` Johannes Schindelin 2017-05-16 16:18 ` Jeff King 2017-05-16 16:59 ` Eric Rannaud 2017-05-16 17:14 ` Kevin Daudt 2017-05-16 17:29 ` Eric Rannaud 2017-05-16 17:41 ` Jeff King 2017-05-16 17:21 ` Eric Rannaud 2017-05-16 17:37 ` Jeff King
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).