git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Stephon Harris <theonestep4@gmail.com>,
	Git mailing list <git@vger.kernel.org>,
	Duy Nguyen <pclouds@gmail.com>
Subject: Re: [PATCH] specify encoding for sed command
Date: Thu, 5 Apr 2018 11:43:05 +0200	[thread overview]
Message-ID: <CAM0VKj=45wL7PyC-9-KE0drE=0UozoPKOmKbf9-caubfC2p_TA@mail.gmail.com> (raw)
In-Reply-To: <87605616vr.fsf@evledraar.gmail.com>

On Thu, Apr 5, 2018 at 8:53 AM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Apr 05 2018, Stephon Harris wrote:
>
>> Fixes issue with seeing `sed: RE error: illegal byte sequence` when running git-completion.bash

Please:
  - prefix the subject line with "completion:".
  - nit: replace "running" with "sourcing".
  - wrap the body of the commit message around 70 characters.
  - sign off your patch; see Documentation/SubmittingPatches and 'git
    commit -s'.

>> ---
>>  contrib/completion/git-completion.bash | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
>> index b09c8a23626b4..52a4ab5e2165a 100644
>> --- a/contrib/completion/git-completion.bash
>> +++ b/contrib/completion/git-completion.bash
>> @@ -282,7 +282,7 @@ __gitcomp ()
>>
>>  # Clear the variables caching builtins' options when (re-)sourcing
>>  # the completion script.
>> -unset $(set |sed -ne 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
>> +unset $(set |LANG=C sed -ne 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null

I was wondering how could you see an error message from 'sed', when the
stderr of the whole thing is redirected to /dev/null.  It turns out that
such a redirection doesn't affect the stderr of the commands executed
inside the command substitution:

  $ echo $(echo foo ; echo >&2 error) 2>/dev/null
  error
  foo

Interesting, I didn't know that.

> This is getting closer to the issue than your previous patch, but
> there's still some open questions:
>
> 1) What platform OS / version / sed version is this on?
>
> 2) What's the output from "set" that's causing this error? Do we have an
>    isolated test case for that?

A quick internet search suggests that this error message tends to appear
on Macs, when its 'sed' encounters an invalid UTF-8 character in its
input.

> 3) There's other invocations of "sed" in the file, aren't those affected
>    as well?

The two 'sed' invocations in _git_stash() are suspect, as they process
the output of 'git stash list', which includes the stashes'
descriptions, which can contain whatever the users wished to store there
(though they would probably get a "Warning: commit message did not
conform to UTF-8" message from 9e83266525 (commit-tree: encourage UTF-8
commit messages., 2006-12-22) when doing so).

The 'sed' invocation in __git_complete_revlist_file() is suspect as
well, as it processes the output of 'git ls-tree'.  Pathnames can
contain anything, and while 'git ls-tree' quotes pathnames with
"unusual" characters, I don't know whether it quotes all characters that
can possibly upset Stephon's 'sed'.

What about 'awk' on Stephon's platform?  We already have one 'awk'
invocation in __git_match_ctag(), which we use for 'git grep <TAB>' and
'git log -L:<TAB>'.  That 'awk' script uses a regexp to match the symbol
name in a ctags file; does any programming language allow such unusual
characters in symbol names?

What about 'sed' and 'awk' scripts that don't use regexps at all?  (I
intend to add such an 'awk' script in a day or two...)

> 4) Any reason we wouldn't just set LC_AlL=C for the whole file? I see we
>    already do it for our invocation to "git merge".

That would perhaps be the safest thing to do...
But how can we set it for a whole file, when the file is not executed as
a script but sourced into the user's shell environment?

  parent reply	other threads:[~2018-04-05  9:43 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-05  3:10 [PATCH] specify encoding for sed command Stephon Harris
2018-04-05  6:53 ` Ævar Arnfjörð Bjarmason
2018-04-05  9:42   ` Eric Sunshine
2018-04-05  9:43   ` SZEDER Gábor [this message]
2018-04-10  7:18   ` Matt Coleman
2018-04-11 20:42     ` Matt Coleman
2018-04-12 22:12       ` Matthew Coleman
2018-04-13  0:01         ` SZEDER Gábor
2018-04-13  3:00           ` Matthew Coleman
2018-04-13 10:30             ` [PATCH] completion: reduce overhead of clearing cached --options SZEDER Gábor
2018-04-13 21:44               ` Jakub Narebski
2018-04-13 22:23                 ` SZEDER Gábor
2018-04-14 13:27                   ` Jakub Narebski
2018-04-16 18:23                     ` Jacob Keller
2018-04-16 20:35                       ` Matthew Coleman
2018-04-16  5:10                   ` Junio C Hamano
2018-04-16 13:15                     ` SZEDER Gábor
2018-04-16 13:29                       ` Jakub Narębski
2018-04-16 22:43                         ` Junio C Hamano
2018-04-17 22:02                           ` [PATCH v2] " SZEDER Gábor
2018-04-17 23:45                             ` Junio C Hamano
2018-05-07 15:05                               ` Matthew Coleman
2018-05-08  2:28                                 ` Todd Zullinger
2018-05-08  3:28                                   ` Junio C Hamano
2018-06-07  5:48                             ` Jonathan Nieder
2018-06-07 12:07                               ` Dave Borowitz
2018-06-07 12:41                               ` Rick van Hattem
2018-06-08 21:16                               ` SZEDER Gábor
2018-06-08 21:23                                 ` Jonathan Nieder
2018-06-08 21:41                                   ` SZEDER Gábor
2018-06-08 21:52                                     ` Jonathan Nieder
2018-06-08 21:58                                       ` SZEDER Gábor
2018-06-11 17:53                                   ` Junio C Hamano
2018-06-11 18:20                                 ` [PATCH] completion: correct zsh detection when run from git-completion.zsh (Re: [PATCH v2] completion: reduce overhead of clearing cached --options) Jonathan Nieder
2018-06-12  8:46                                   ` SZEDER Gábor
2018-06-12  9:48                                   ` SZEDER Gábor
2018-06-12  9:51                                   ` Rick van Hattem
2018-04-08 23:17 ` [PATCH] specify encoding for sed command Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM0VKj=45wL7PyC-9-KE0drE=0UozoPKOmKbf9-caubfC2p_TA@mail.gmail.com' \
    --to=szeder.dev@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=theonestep4@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).