git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Philippe Blain <levraiphilippeblain@gmail.com>
Cc: "Randall S. Becker" <rsbecker@nexbridge.com>,
	'Daniel Troger' <random_n0body@icloud.com>,
	git@vger.kernel.org,
	Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: Re: git-bugreport-2021-01-06-1209.txt (git can't deal with special characters)
Date: Thu, 7 Jan 2021 16:49:26 +0100	[thread overview]
Message-ID: <20210107154926.6tb7ukgemn7kmpnl@tb-raspi4> (raw)
In-Reply-To: <ffe0a3ae-780d-95ae-524d-7b029eda21ee@gmail.com>

On Thu, Jan 07, 2021 at 09:34:35AM -0500, Philippe Blain wrote:
> Hi everyone,
>
> Le 2021-01-06 à 18:07, Randall S. Becker a écrit :
> > On January 6, 2021 5:22 PM, Daniel Troger wrote:
> > > Hi, maybe this helps you reproduce. I think I should have committed before
> > > doing the second changes but I still got the error message and the two
> > > names for one folder:
> > >
> > > me@iMac:/tmp$ mkdir git_bug
> > > me@iMac:/tmp$ cd git_bug
> > > me@iMac:/tmp/git_bug$ git init
> > > hint: Using 'master' as the name for the initial branch. This default branch
> > > name
> > > hint: is subject to change. To configure the initial branch name to use in all
> > > hint: of your new repositories, which will suppress this warning, call:
> > > hint:
> > > hint: 	git config --global init.defaultBranch <name>
> > > hint:
> > > hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
> > > hint: 'development'. The just-created branch can be renamed via this
> > > command:
> > > hint:
> > > hint: 	git branch -m <name>
> > > Initialized empty Git repository in /private/tmp/git_bug/.git/
> > > me@iMac:/tmp/git_bug$ ls -la total 8
> > > drwxr-xr-x   4 daniel  wheel   128 Jan  6 23:13 .
> > > drwxrwxrwt  27 root    wheel   864 Jan  6 23:13 ..
> > > drwxr-xr-x   9 daniel  wheel   288 Jan  6 23:12 .git
> > > -rw-r--r--@  1 daniel  staff  1283 Jan  6 23:13 paulbrunngård-springyard.zip
> > > me@iMac:/tmp/git_bug$ unzip paulbrunngård-springyard.zip
> > > Archive:  paulbrunngård-springyard.zip
> > >     creating: paulbrunnga??rd-springyard/
> > >    inflating: paulbrunnga??rd-springyard/.DS_Store
> > >     creating: __MACOSX/
> > >     creating: __MACOSX/paulbrunnga??rd-springyard/
> > >    inflating: __MACOSX/paulbrunnga??rd-springyard/._.DS_Store
> > >   extracting: paulbrunnga??rd-springyard/empty me@iMac:/tmp/git_bug$ rm
> > > -rf __MACOSX/ *.zip me@iMac:/tmp/git_bug$ ls -la total 0
> > > drwxr-xr-x   4 daniel  wheel  128 Jan  6 23:15 .
> > > drwxrwxrwt  27 root    wheel  864 Jan  6 23:13 ..
> > > drwxr-xr-x   9 daniel  wheel  288 Jan  6 23:15 .git
> > > drwxr-xr-x@  4 daniel  wheel  128 Jan  6 12:20 paulbrunngård-springyard
> > > me@iMac:/tmp/git_bug$ cd paulbrunngård-springyard/
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ nano empty
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ cat empty Initial
> > > content me@iMac:/tmp/git_bug/paulbrunngård-springyard$ git add empty
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ nano empty
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ cat empty Initial
> > > content
> > >
> > >
> > > Line I want to keep
> > >
> > > Line I want gone
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ git restore -p .
> > > BUG: pathspec.c:495: error initializing pathspec_item Cannot close git diff-
> > > index --cached --numstat --summary
> > > 4b825dc642cb6eb9a060e54bf8d69288fbee4904 --
> > > :(,prefix:27)paulbrunngård-springyard/ () at
> > > /usr/local/Cellar/git/2.30.0/libexec/git-core/git-add--interactive line 183.
> > > me@iMac:/tmp/git_bug/paulbrunngård-springyard$ cd ..
> > > me@iMac:/tmp/git_bug$ git status
> > > On branch master
> > >
> > > No commits yet
> > >
> > > Changes to be committed:
> > >    (use "git rm --cached <file>..." to unstage)
> > > 	new file:   "paulbrunnga\314\212rd-springyard/empty"
> > >
> > > Changes not staged for commit:
> > >    (use "git add <file>..." to update what will be committed)
> > >    (use "git restore <file>..." to discard changes in working directory)
> > > 	modified:   "paulbrunnga\314\212rd-springyard/empty"
> > >
> > > Untracked files:
> > >    (use "git add <file>..." to include in what will be committed)
> > > 	.DS_Store
> > > 	"paulbrunng\303\245rd-springyard/"
> > >
> > > me@iMac:/tmp/git_bug$
> >
> > Is it possible that the å character is coming from a UTF-16 encoding and is not representable in UTF-8? I'm wondering whether the name has a double-byte representation where one of the bytes is null, resulting in a truncated file name coming from readdir(). The file name would not be representable on some platforms that do not support UTF-16 path names.
>
> I don't think that's the case (the angstrom is present in UTF-8 [1]).
> I think it's another UTF-8 precomposed/decomposed bug. As far as I
> was able to test it happens as soon as you have a precomposed character
> in the folder name. I observed the same behaviour with a folder named
> "folderü", for example. I also tried 'git -c add.interactive.usebuiltin restore -p .'
> to see if the new experimental builtin§ add-interactive has the same problem,
> and it does (though the error is less verbose).
>
> Anyway as you show with 'git status', it's not just 'git add -p' that is
> faulty, it's deeper than that, I would say.
>
> Cheers,
>
> Philippe.
>
> [1] https://en.wikipedia.org/wiki/%C3%85#On_computers

Folks, I can not reproduce anything here.
- The zip file mentioned earlier does not include a decomposed "å"
  Neither when running unzip under Linux nor under Mac
- Trying to write a test script does not show anything special

My attempt looks like this:
cat test.sh
#!/bin/sh
DIR=git-test-restore-p
GIT=/usr/local/bin/git

Adiarnfc=$(printf '\303\204')
Adiarnfd=$(printf 'A\314\210')
DIRNAME=xx${Adiarnfd}yy
FILENAME=$DIRNAME/file

rm -rf $DIR &&
    mkdir $DIR &&
    cd $DIR &&
    $GIT init &&
    mkdir $DIRNAME &&
    echo "Initial" >$FILENAME &&
    $GIT add $FILENAME &&
    echo  >>$FILENAME &&
    echo  >>$FILENAME &&
    echo "One more line"  >>$FILENAME &&
    echo  >>$FILENAME &&
    echo  >>$FILENAME &&
    echo "Last line"  >$FILENAME &&
    $GIT restore -p . &&
    $GIT restore -p . &&
    echo git=$GIT &&
    $GIT --version &&
    echo "OK"

==================================
It points out Git from brew, I think.
And running it (using a decomposed Ä instead of å or ü)
shows nothing strange, please see below.

Is anybody able to provide a shell-script, that does reproduce ?
Excuse my bad shell-programming style, this is for demo only.

hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint:   git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint:   git branch -m <name>
Initialized empty Git repository in ..../2021-01-07-git_decompose_test/git-test-restore-p/.git/
diff --git a/xxÄyy/file b/xxÄyy/file
index a77fa51..8ea8e21 100644
--- a/xxÄyy/file
+++ b/xxÄyy/file
@@ -1 +1 @@
-Initial
+Last line
(1/1) Discard this hunk from worktree [y,n,q,a,d,e,?]? y

No changes.
git=/usr/local/bin/git
git version 2.30.0
OK


  reply	other threads:[~2021-01-07 15:54 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-06 11:35 git-bugreport-2021-01-06-1209.txt (git can't deal with special characters) Daniel Troger
2021-01-06 14:21 ` Torsten Bögershausen
2021-01-06 16:49   ` Daniel Troger
2021-01-06 21:47     ` Torsten Bögershausen
2021-01-06 22:21       ` Daniel Troger
2021-01-06 23:07         ` Randall S. Becker
2021-01-07 14:34           ` Philippe Blain
2021-01-07 15:49             ` Torsten Bögershausen [this message]
2021-01-07 16:21               ` Philippe Blain
2021-01-08 19:07                 ` Torsten Bögershausen
2021-01-24 15:13 ` [PATCH/RFC v1 1/1] git restore -p . and precomposed unicode tboegi
2021-01-24 19:51   ` Junio C Hamano
2021-01-25 16:53     ` Torsten Bögershausen
2021-01-29 17:15 ` [PATCH v2 1/1] MacOS: precompose_argv_prefix() tboegi
2021-01-29 23:19   ` Junio C Hamano
2021-01-31  0:43   ` Junio C Hamano
2021-01-31  9:50     ` Torsten =?unknown-8bit?Q?B=C3=B6gershausen?=
2021-02-02 15:11 ` [PATCH v3 " tboegi
2021-02-02 17:43   ` Junio C Hamano
2021-02-03 16:28 ` [PATCH v4 " tboegi
2021-02-03 19:33   ` Junio C Hamano
2021-02-03 22:13     ` Junio C Hamano
2021-02-05 17:31       ` Torsten Bögershausen
  -- strict thread matches above, loose matches on Subject: below --
2021-01-08 19:56 git-bugreport-2021-01-06-1209.txt (git can't deal with special characters) Daniel Troger
2021-01-09 17:23 ` Torsten Bögershausen
2021-01-13 14:57   ` Daniel Troger
2021-01-16 17:24     ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210107154926.6tb7ukgemn7kmpnl@tb-raspi4 \
    --to=tboegi@web.de \
    --cc=git@vger.kernel.org \
    --cc=johannes.schindelin@gmx.de \
    --cc=levraiphilippeblain@gmail.com \
    --cc=random_n0body@icloud.com \
    --cc=rsbecker@nexbridge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).