git@vger.kernel.org mailing list mirror (one of many)
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: Alexander Pyhalov <apyhalov@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: t7005-editor.sh failure
Date: Wed, 26 Sep 2018 14:11:07 +0200
Message-ID: <20180926121107.GH27036@localhost>
In-Reply-To: <CAN0heSpUhzbTjceVhBxk_jjE=vOAVTzXGFQ=UL9Y+muJHe0S6w@mail.gmail.com>

On Wed, Sep 26, 2018 at 11:52:42AM +0200, Martin Ågren wrote:
> On Wed, 26 Sep 2018 at 11:00, Alexander Pyhalov <apyhalov@gmail.com> wrote:
> > As for sign-off, do I understand correctly that you just want to know
> > that I'm the original author of the code? Yes, it's so.
> 
> Right. Plus that you agree that the code (the commit) may be
> redistributed basically forever.
> 
> > I see this on OpenIndiana in
> > https://github.com/OpenIndiana/oi-userland/pull/4456 , when running
> > test suite.
> > Not sure why it wasn't noticed earlier, as 'trash directory' is used in path.
> 
> My first theory was that my shell and that of other developers was
> "modern" or "clever" enough to realize that the space belongs to the
> filename, so it just takes everything to the end of line.

(Note that redirections can occur anywhere in the command, i.e. these
are all equivalent: 'echo foo >out', 'echo >out foo', '>out echo foo')
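
A quick way to check this is the sketch below; the scratch directory and
the file names out1, out2 and out3 are made up for this example:

```shell
# Work in a scratch directory so nothing outside it is touched.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

echo foo >out1    # redirection after the argument
echo >out2 foo    # redirection between the command name and the argument
>out3 echo foo    # redirection before the command name

# cmp -s is silent and succeeds only when the files are byte-identical.
cmp -s out1 out2 && cmp -s out1 out3 && echo "all three files are identical"
```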

> Whereas your
> shell would be "dumber". I see now that you have a newer bash than I
> do... Maybe this cleverness can be configured (at compile-time?), or
> maybe something else is happening.

Let me put on my POSIX-lawyer hat for a moment to explain this :)

Long story short: Bash doesn't conform to POSIX in this respect.

So, the Shell Command Language specification section 2.6 Word
Expansions [1] says among many other things the following:

  Tilde expansions, parameter expansions, command substitutions,
  arithmetic expansions, and quote removals that occur within a single
  word expand to a single field. It is only field splitting or
  pathname expansion that can create multiple fields from a single
  word.

Later, in section 2.7 Redirection [2]:

  [...] the word that follows the redirection operator shall be
  subjected to tilde expansion, parameter expansion, command
  substitution, arithmetic expansion, and quote removal. Pathname
  expansion shall not be performed on the word by a non-interactive
  shell; an interactive shell may perform it, but shall do so only
  when the expansion would result in one word.

Note that this "word" is _not_ subject to field splitting, i.e. in a
redirection like

  echo foo >$file

it's not necessary to quote $file, because it will remain a single
field even if it contains spaces.
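
The contrast with ordinary arguments can be sketched as follows.  This
assumes that 'sh' on your system is a conforming shell (e.g. dash, or
Bash invoked as sh); the variable name and the file name "two words"
are made up for this example:

```shell
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# The single quotes keep the outer shell from expanding anything,
# so the inner sh performs all expansions itself.
sh -c '
	file="two words"
	set -- $file       # unquoted in argument position: field splitting applies
	echo "fields as arguments: $#"
	echo foo >$file    # unquoted redirection target: no field splitting
'

# List what was created in the scratch directory.
ls -1
```

In a conforming shell the unquoted argument should be split into two
fields, while the redirection should create a single file whose name
contains the space.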

Most shells I have at hand follow the specs:

  $ for shell in dash mksh ksh ksh93 zsh ; do $shell ./e\ space.sh "output from $shell" ; done
  $ ls -1 output\ from*
  output from dash
  output from ksh
  output from ksh93
  output from mksh
  output from zsh

Bash doesn't:

  $ bash ./e\ space.sh "output from bash"
  ./e space.sh: line 1: $1: ambiguous redirect

And this behaviour is documented in its man page (though the text
calls "word splitting" what the specs call "field splitting"):

  The word following the redirection operator in the following
  descriptions, unless otherwise noted, is subjected to brace
  expansion, tilde expansion, parameter and variable expansion,
  command substitution, arithmetic expansion, quote removal, pathname
  expansion, and word splitting.

When run in posix mode, however, even Bash follows the specs:

  $ ls -l /bin/sh
  lrwxrwxrwx 1 root root 4 Sep 26 12:04 /bin/sh -> bash
  $ sh ./e\ space.sh "output from bash as sh"
  $ bash --posix ./e\ space.sh "output from bash --posix"
  $ ls -1 output\ from\ bash*
  output from bash as sh
  output from bash --posix
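
For anyone who wants to reproduce the two Bash behaviours side by side,
here is a minimal sketch.  It assumes that 'bash' is in PATH and that
the editor script consists of the single line quoted in Martin's commit
message; the scratch directory, the *.err files and the two status
variables are made up for this example:

```shell
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Recreate the editor script with the unquoted $1 as redirection target.
printf '%s\n' 'echo space >$1' >'e space.sh'

# Default mode: Bash word-splits the target and refuses the redirection.
if bash './e space.sh' 'output from bash' 2>default.err
then default_mode=ok
else default_mode=failed
fi

# POSIX mode: the target stays a single word, so the file gets created.
if bash --posix './e space.sh' 'output from bash --posix' 2>posix.err
then posix_mode=ok
else posix_mode=failed
fi

echo "default mode: $default_mode, posix mode: $posix_mode"
```

The "ambiguous redirect" message of the default-mode run ends up in
default.err.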

That's why we haven't noticed it yet, not even on macOS, which uses
Bash as /bin/sh.  You have to build Git with 'SHELL_PATH=/bin/bash' to
make t7005 fail because of this issue, and based on the trace that
Alexander showed us it seems that OpenIndiana folks do build Git that
way.

Good.  *throws the POSIX-lawyer hat into the farthest corner*

Having said all that, I didn't omit the quotes in 4362da078e with the
above in mind; in fact I tend to use quotes even when they are
unnecessary (e.g. in variable assignments: var="$1"), because unquoted
variables and command substitutions freak me out before I can think
through whether it's safe to omit the quotes or not :)
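
For the record, the variable assignment case can be checked with a
small sketch like this one (the sample value and the variable names are
made up for this example):

```shell
# Simulate a script argument containing runs of spaces and a glob character.
set -- 'a  b   *'

unquoted=$1    # assignment: neither field splitting nor pathname expansion
quoted="$1"

[ "$unquoted" = "$quoted" ] && echo "both assignments hold the same value"
```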


Sidenote: this test should use the write_script helper to create this
editor script.
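
In the test suite, write_script reads the script body from stdin,
prepends a "#!" line and makes the file executable.  The function below
is only a hypothetical stand-in so that the sketch runs outside of the
test suite: the name write_script_sketch and the fallback to /bin/sh
when SHELL_PATH is unset are assumptions of this example, not the real
helper.  The script body uses the quoted "$1" from Martin's proposed
fix:

```shell
# Hypothetical stand-in modeled on the test suite's write_script helper.
write_script_sketch () {
	{
		echo "#!${SHELL_PATH:-/bin/sh}" &&
		cat
	} >"$1" &&
	chmod +x "$1"
}

tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
mkdir "trash directory"

# The quoted here-doc delimiter keeps "$1" literal in the generated script.
write_script_sketch "e space.sh" <<\EOF
echo space >"$1"
EOF

# Run the generated editor script with a path that contains a space.
"./e space.sh" "trash directory/COMMIT_EDITMSG"
cat "trash directory/COMMIT_EDITMSG"
```

With the "#!" line in place the kernel can execute the script directly,
so the ENOEXEC fallback visible in Alexander's trace should not be
needed.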



[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06
[2] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07


> > execve("/bin/bash", 0x007EA898, 0x007EA960)  argc = 5
> > 2655:    argv: /bin/bash -c ./e\ space.sh "$@" ./e\ space.sh
> > 2655:     /export/home/alp/srcs/oi-userland/components/developer/git/build/amd64/t/trash
> > directory.t7005-editor/.git/COMMIT_EDITMSG
> > 2655:   execve("./e space.sh", 0x005655C8, 0x00564008)  Err#8 ENOEXEC
> > ./e space.sh: line 1: $1: ambiguous redirect
> 
> > Shell is bash, as you can see (GNU bash, version 4.4.23(1)-release
> > (i386-pc-solaris2.11))
> 
> I came up with the following commit message. What do you think about it?
> 
>     t7005-editor: quote filename to fix whitespace-issue
> 
>     Commit 4362da078e (t7005-editor: get rid of the SPACES_IN_FILENAMES
>     prereq, 2018-05-14) removed code for detecting whether spaces in
>     filenames work. Since we rely on spaces throughout the test suite
>     ("trash directory.t1234-foo"), testing whether we can use the filename
>     "e space.sh" was redundant and unnecessary.
> 
>     In simplifying the code, though, the commit introduced a regression around
>     how spaces are handled, not in the /name/ of the script, but /in/ the
>     script itself. The editor-script created looks like this:
> 
>       echo space >$1
> 
>     We will try to execute something like
> 
>       echo space >/foo/t/trash directory.t7005-editor/.git/COMMIT_EDITMSG
> 
>     Most shells seem to be able to figure out that the filename doesn't end
>     with "trash" but continues all the way to "COMMIT_EDITMSG", but at least
>     one shell chokes on this.
> 
>     Make sure that the editor-script quotes "$1".
> 
> Martin
