git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename
@ 2008-06-01  8:46 Jakub Narebski
  2008-06-02  7:39 ` Jakub Narebski
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jakub Narebski @ 2008-06-01  8:46 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git

StGIT errors out on rebasing patch which deletes file with Unicode
characters in filename (with characters outside US-ASCII in filename).
The patch in question is patch deleting gitweb/test/* in git directory,
and is present already on the 'origin' branch (the branch we rebase
onto), so stg-rebase should result in an empty patch (as first patch).

 "gitweb/test/M\303\244rchen" |    2 --
 gitweb/test/file with spaces |    4 ----
 gitweb/test/file+plus+sign   |    6 ------
 3 files changed, 0 insertions(+), 12 deletions(-)
 delete mode 100644 gitweb/test/Märchen
 delete mode 100644 gitweb/test/file with spaces
 delete mode 100644 gitweb/test/file+plus+sign

I guess the error is caused by using unescaped (quoted) filename.

Below is StGIT's error message:

1248:[gitweb/web@git]# stg rebase origin
Checking for changes in the working directory ... done
Popping all applied patches ... done
Rebasing to "origin" ... Traceback (most recent call last):
  File "/usr/bin/stg", line 43, in ?
    main()
  File "/usr/lib/python2.4/site-packages/stgit/main.py", line 281, in main
    command.func(parser, options, args)
  File "/usr/lib/python2.4/site-packages/stgit/commands/rebase.py", line 70, in func
    rebase(crt_series, args[0])
  File "/usr/lib/python2.4/site-packages/stgit/commands/common.py", line 356, in rebase
    git.rebase(tree_id = tree_id)
  File "/usr/lib/python2.4/site-packages/stgit/git.py", line 927, in rebase
    reset(tree_id = tree_id)
  File "/usr/lib/python2.4/site-packages/stgit/git.py", line 872, in reset
    map(os.remove, rm_files)
OSError: [Errno 2] No such file or directory: '"gitweb/test/M\\303\\244rchen"'
1249:[gitweb/web@git]# stg version
Stacked GIT 0.14.2
git version 1.5.5.3
Python version 2.4.3 (#1, Jun 13 2006, 16:41:18) 
[GCC 4.0.2 20051125 (Red Hat 4.0.2-8)]

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename
  2008-06-01  8:46 [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename Jakub Narebski
@ 2008-06-02  7:39 ` Jakub Narebski
  2008-06-02 13:26   ` Catalin Marinas
  2008-06-02 20:23 ` [PATCH] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
  2 siblings, 1 reply; 18+ messages in thread
From: Jakub Narebski @ 2008-06-02  7:39 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git

On Sun, 1 June 2008, Jakub Narebski wrote:

> StGIT errors out on rebasing patch which deletes file with Unicode
> characters in filename (with characters outside US-ASCII in filename).
> The patch in question is patch deleting gitweb/test/* in git directory,
> and is present already on the 'origin' branch (the branch we rebase
> onto), so stg-rebase should result in an empty patch (as first patch).
> 
>  "gitweb/test/M\303\244rchen" |    2 --
>  gitweb/test/file with spaces |    4 ----
>  gitweb/test/file+plus+sign   |    6 ------
>  3 files changed, 0 insertions(+), 12 deletions(-)
>  delete mode 100644 gitweb/test/Märchen
>  delete mode 100644 gitweb/test/file with spaces
>  delete mode 100644 gitweb/test/file+plus+sign
> 
> I guess the error is caused by using unescaped (quoted) filename.

You can WORKAROUND this bug by setting core.quotepath to false.  This
allowed me to make stg-rebase.
 
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename
  2008-06-02  7:39 ` Jakub Narebski
@ 2008-06-02 13:26   ` Catalin Marinas
  2008-06-02 15:47     ` Jakub Narebski
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2008-06-02 13:26 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

2008/6/2 Jakub Narebski <jnareb@gmail.com>:
> On Sun, 1 June 2008, Jakub Narebski wrote:
>
>> StGIT errors out on rebasing patch which deletes file with Unicode
>> characters in filename (with characters outside US-ASCII in filename).
>> The patch in question is patch deleting gitweb/test/* in git directory,
>> and is present already on the 'origin' branch (the branch we rebase
>> onto), so stg-rebase should result in an empty patch (as first patch).
>>
>>  "gitweb/test/M\303\244rchen" |    2 --
>>  gitweb/test/file with spaces |    4 ----
>>  gitweb/test/file+plus+sign   |    6 ------
>>  3 files changed, 0 insertions(+), 12 deletions(-)
>>  delete mode 100644 gitweb/test/Märchen
>>  delete mode 100644 gitweb/test/file with spaces
>>  delete mode 100644 gitweb/test/file+plus+sign
>>
>> I guess the error is caused by using unescaped (quoted) filename.
>
> You can WORKAROUND this bug by setting core.quotepath to false.  This
> allowed me to make stg-rebase.

I can add a workaround in StGIT to actually ignore the exception
raised by os.remove() but I don't know how to convert the quoted file
name back to its unicode value in Python.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename
  2008-06-02 13:26   ` Catalin Marinas
@ 2008-06-02 15:47     ` Jakub Narebski
  2008-06-02 16:11       ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Narebski @ 2008-06-02 15:47 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git

Dnia poniedziałek 2. czerwca 2008 15:26, Catalin Marinas napisał:
> 2008/6/2 Jakub Narebski <jnareb@gmail.com>:
>> On Sun, 1 June 2008, Jakub Narebski wrote:
>>
>>> StGIT errors out on rebasing patch which deletes file with Unicode
>>> characters in filename (with characters outside US-ASCII in filename).
>>> The patch in question is patch deleting gitweb/test/* in git directory,
>>> and is present already on the 'origin' branch (the branch we rebase
>>> onto), so stg-rebase should result in an empty patch (as first patch).
>>>
>>>  "gitweb/test/M\303\244rchen" |    2 --
>>>  gitweb/test/file with spaces |    4 ----
>>>  gitweb/test/file+plus+sign   |    6 ------
>>>  3 files changed, 0 insertions(+), 12 deletions(-)
>>>  delete mode 100644 gitweb/test/Märchen
>>>  delete mode 100644 gitweb/test/file with spaces
>>>  delete mode 100644 gitweb/test/file+plus+sign
>>>
>>> I guess the error is caused by using unescaped (quoted) filename.
>>
>> You can WORKAROUND this bug by setting core.quotepath to false.  This
>> allowed me to make stg-rebase.
> 
> I can add a workaround in StGIT to actually ignore the exception
> raised by os.remove() but I don't know how to convert the quoted file
> name back to its unicode value in Python.

In Perl it is as simple as (see unquote() in gitweb/gitweb.perl)

	if ($str =~ m/^"(.*)"$/) {
		# needs unquoting
		$str = $1;
		$str =~ s/\\([^0-7]|[0-7]{1,3})/unq($1)/eg;
	}

where

	sub unq {

		#...

		if ($seq =~ m/^[0-7]{1,3}$/) {
			# octal char sequence
			return chr(oct($seq));
		} elsif (exists $es{$seq}) {
			# C escape sequence, aka character escape code
			return $es{$seq};
		}
		# quoted ordinary character
		return $seq;

Although I haven't tested this extensively.

Or you can just use '-z' switch to git-diff-tree and/or git-ls-files,
to get filename without quoting (but then you use '\0' as record
delimiter, not "\n").

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename
  2008-06-02 15:47     ` Jakub Narebski
@ 2008-06-02 16:11       ` Catalin Marinas
  0 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2008-06-02 16:11 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

2008/6/2 Jakub Narebski <jnareb@gmail.com>:
> Dnia poniedziałek 2. czerwca 2008 15:26, Catalin Marinas napisał:
>> 2008/6/2 Jakub Narebski <jnareb@gmail.com>:
>>> On Sun, 1 June 2008, Jakub Narebski wrote:
>>>
>>>> StGIT errors out on rebasing patch which deletes file with Unicode
>>>> characters in filename (with characters outside US-ASCII in filename).
>>>> The patch in question is patch deleting gitweb/test/* in git directory,
>>>> and is present already on the 'origin' branch (the branch we rebase
>>>> onto), so stg-rebase should result in an empty patch (as first patch).
>>>>
>>>>  "gitweb/test/M\303\244rchen" |    2 --
>>>>  gitweb/test/file with spaces |    4 ----
>>>>  gitweb/test/file+plus+sign   |    6 ------
>>>>  3 files changed, 0 insertions(+), 12 deletions(-)
>>>>  delete mode 100644 gitweb/test/Märchen
>>>>  delete mode 100644 gitweb/test/file with spaces
>>>>  delete mode 100644 gitweb/test/file+plus+sign
>>>>
>>>> I guess the error is caused by using unescaped (quoted) filename.
>>>
>>> You can WORKAROUND this bug by setting core.quotepath to false.  This
>>> allowed me to make stg-rebase.
>>
>> I can add a workaround in StGIT to actually ignore the exception
>> raised by os.remove() but I don't know how to convert the quoted file
>> name back to its unicode value in Python.
>
> In Perl it is as simple as (see unquote() in gitweb/gitweb.perl)

Actually, Python seems OK at handling quoted characters, the only
confusion is that it doubles the backslash in the string since it
doesn't know that the text received from Git is quoted or not. I would
probably have to check for the "core.quotepath" value and act
accordingly.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] Add rebase test for when upstream has deleted a non-ASCII file
  2008-06-01  8:46 [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename Jakub Narebski
  2008-06-02  7:39 ` Jakub Narebski
@ 2008-06-02 20:23 ` Karl Hasselström
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
  2 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 20:23 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Test that stg rebase can handle upstream deleting a file with a
non-ASCII name. It currently can't.

Bug spotted by Jakub Narebski <jnareb@gmail.com>.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

Here's a little something for the test suite. Note that it's enough
that upstream deletes the file -- no patch in the StGit stack needs to
touch it.

 t/t2201-rebase-nonascii.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)
 create mode 100755 t/t2201-rebase-nonascii.sh


diff --git a/t/t2201-rebase-nonascii.sh b/t/t2201-rebase-nonascii.sh
new file mode 100755
index 0000000..42c062f
--- /dev/null
+++ b/t/t2201-rebase-nonascii.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+test_description='Rebase onto changed non-ASCII file'
+
+. ./test-lib.sh
+
+test_expect_success 'Setup' '
+    echo "Fjäderholmarna" > skärgårdsö.txt &&
+    git add skärgårdsö.txt &&
+    git commit -m "Create island" &&
+    stg init &&
+    echo foo > unrelated.txt &&
+    git add unrelated.txt &&
+    stg new -m "Unrelated file" &&
+    stg refresh &&
+    stg pop &&
+    rm skärgårdsö.txt &&
+    git commit -a -m "Remove island" &&
+    git tag upstream &&
+    git reset --hard HEAD^ &&
+    stg push
+'
+
+test_expect_failure 'Rebase onto changed non-ASCII file' '
+    stg rebase upstream
+'
+
+test_done

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [StGit PATCH 0/4] Handle non-ASCII filenames
  2008-06-01  8:46 [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename Jakub Narebski
  2008-06-02  7:39 ` Jakub Narebski
  2008-06-02 20:23 ` [PATCH] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
@ 2008-06-02 21:46 ` Karl Hasselström
  2008-06-02 21:46   ` [PATCH 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
                     ` (5 more replies)
  2 siblings, 6 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 21:46 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

I fixed the first problem, and while doing so noticed that a nearby
block of code had exactly the same bug. So I fixed that as well.

Catalin, this should go on the stable branch, I believe. It probably
warrants a new release too, since anyone rebasing patches past the
point where the "Märchen" file was removed from git.git is going to
hit the same bug Jakub did.

---

Karl Hasselström (4):
      Handle refresh of changed files with non-ASCII names
      Test for another filename quoting issue in tree_status()
      Handle changed files with non-ASCII names
      Add rebase test for when upstream has deleted a non-ASCII file


 stgit/git.py                   |   31 +++++++++++++++++++++----------
 t/t3200-non-ascii-filenames.sh |   38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 10 deletions(-)
 create mode 100755 t/t3200-non-ascii-filenames.sh

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] Add rebase test for when upstream has deleted a non-ASCII file
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
@ 2008-06-02 21:46   ` Karl Hasselström
  2008-06-02 21:46   ` [PATCH 2/4] Handle changed files with non-ASCII names Karl Hasselström
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 21:46 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Test that stg rebase can handle upstream deleting a file with a
non-ASCII name. It currently can't.

Bug spotted by Jakub Narebski <jnareb@gmail.com>.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 t/t3200-non-ascii-filenames.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)
 create mode 100755 t/t3200-non-ascii-filenames.sh


diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
new file mode 100755
index 0000000..1d82a9f
--- /dev/null
+++ b/t/t3200-non-ascii-filenames.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+test_description='Handle files with non-ASCII characters in their names'
+
+. ./test-lib.sh
+
+test_expect_success 'Setup' '
+    echo "Fjäderholmarna" > skärgårdsö.txt &&
+    git add skärgårdsö.txt &&
+    git commit -m "Create island" &&
+    stg init &&
+    echo foo > unrelated.txt &&
+    git add unrelated.txt &&
+    stg new -m "Unrelated file" &&
+    stg refresh &&
+    stg pop &&
+    rm skärgårdsö.txt &&
+    git commit -a -m "Remove island" &&
+    git tag upstream &&
+    git reset --hard HEAD^ &&
+    stg push
+'
+
+test_expect_failure 'Rebase onto changed non-ASCII file' '
+    stg rebase upstream
+'
+
+test_done

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] Handle changed files with non-ASCII names
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
  2008-06-02 21:46   ` [PATCH 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
@ 2008-06-02 21:46   ` Karl Hasselström
  2008-06-02 21:46   ` [PATCH 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 21:46 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Git was quoting them for us, which was not what we wanted. So call
diff-index with the -z flag, so that it doesn't.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 stgit/git.py                   |   18 +++++++++++++-----
 t/t3200-non-ascii-filenames.sh |    2 +-
 2 files changed, 14 insertions(+), 6 deletions(-)


diff --git a/stgit/git.py b/stgit/git.py
index deb5efc..d4cd946 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -236,11 +236,19 @@ def tree_status(files = None, tree_id = 'HEAD', unknown = False,
     args = diff_flags + [tree_id]
     if files:
         args += ['--'] + files
-    for line in GRun('diff-index', *args).output_lines():
-        fs = tuple(line.rstrip().split(' ',4)[-1].split('\t',1))
-        if fs[1] not in reported_files:
-            cache_files.append(fs)
-            reported_files.add(fs[1])
+    t = None
+    for line in GRun('diff-index', '-z', *args).raw_output().split('\0'):
+        if not line:
+            # There's a zero byte at the end of the output, which
+            # gives us an empty string as the last "line".
+            continue
+        if t == None:
+            mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
+        else:
+            if not line in reported_files:
+                cache_files.append((t, line))
+                reported_files.add(line)
+            t = None
 
     # files in the index but changed on (or removed from) disk
     args = list(diff_flags)
diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index 1d82a9f..a04ead8 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -20,7 +20,7 @@ test_expect_success 'Setup' '
     stg push
 '
 
-test_expect_failure 'Rebase onto changed non-ASCII file' '
+test_expect_success 'Rebase onto changed non-ASCII file' '
     stg rebase upstream
 '
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] Test for another filename quoting issue in tree_status()
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
  2008-06-02 21:46   ` [PATCH 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
  2008-06-02 21:46   ` [PATCH 2/4] Handle changed files with non-ASCII names Karl Hasselström
@ 2008-06-02 21:46   ` Karl Hasselström
  2008-06-02 21:46   ` [PATCH 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 21:46 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

stgit.git.tree_status() had another filename quoting issue, similar to
the one just fixed. Test for that one too.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 t/t3200-non-ascii-filenames.sh |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)


diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index a04ead8..055b152 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -10,7 +10,7 @@ test_expect_success 'Setup' '
     stg init &&
     echo foo > unrelated.txt &&
     git add unrelated.txt &&
-    stg new -m "Unrelated file" &&
+    stg new p0 -m "Unrelated file" &&
     stg refresh &&
     stg pop &&
     rm skärgårdsö.txt &&
@@ -24,4 +24,15 @@ test_expect_success 'Rebase onto changed non-ASCII file' '
     stg rebase upstream
 '
 
+test_expect_success 'Setup' '
+    stg delete p0 &&
+    git reset --hard HEAD^ &&
+    echo "-- ett liv mitt ute i vattnet" >> skärgårdsö.txt &&
+    stg new p1 -m "Describe island"
+'
+
+test_expect_failure 'Refresh changes to non-ASCII file' '
+    stg refresh
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] Handle refresh of changed files with non-ASCII names
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
                     ` (2 preceding siblings ...)
  2008-06-02 21:46   ` [PATCH 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
@ 2008-06-02 21:46   ` Karl Hasselström
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
  2008-06-03  7:56   ` [StGit PATCH 0/4] Handle non-ASCII filenames Catalin Marinas
  5 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-02 21:46 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Without -z, git diff-files was quoting them for us.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 stgit/git.py                   |   39 +++++++++++++++++++++------------------
 t/t3200-non-ascii-filenames.sh |    2 +-
 2 files changed, 22 insertions(+), 19 deletions(-)


diff --git a/stgit/git.py b/stgit/git.py
index d4cd946..9032ddd 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -191,6 +191,19 @@ def ls_files(files, tree = None, full_name = True):
         raise GitException, \
             'Some of the given paths are either missing or not known to GIT'
 
+def parse_git_ls(output):
+    t = None
+    for line in output.split('\0'):
+        if not line:
+            # There's a zero byte at the end of the output, which
+            # gives us an empty string as the last "line".
+            continue
+        if t == None:
+            mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
+        else:
+            yield (t, line)
+            t = None
+
 def tree_status(files = None, tree_id = 'HEAD', unknown = False,
                   noexclude = True, verbose = False, diff_flags = []):
     """Get the status of all changed files, or of a selected set of
@@ -236,29 +249,19 @@ def tree_status(files = None, tree_id = 'HEAD', unknown = False,
     args = diff_flags + [tree_id]
     if files:
         args += ['--'] + files
-    t = None
-    for line in GRun('diff-index', '-z', *args).raw_output().split('\0'):
-        if not line:
-            # There's a zero byte at the end of the output, which
-            # gives us an empty string as the last "line".
-            continue
-        if t == None:
-            mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
-        else:
-            if not line in reported_files:
-                cache_files.append((t, line))
-                reported_files.add(line)
-            t = None
+    for t, fn in parse_git_ls(GRun('diff-index', '-z', *args).raw_output()):
+        if not fn in reported_files:
+            cache_files.append((t, fn))
+            reported_files.add(fn)
 
     # files in the index but changed on (or removed from) disk
     args = list(diff_flags)
     if files:
         args += ['--'] + files
-    for line in GRun('diff-files', *args).output_lines():
-        fs = tuple(line.rstrip().split(' ',4)[-1].split('\t',1))
-        if fs[1] not in reported_files:
-            cache_files.append(fs)
-            reported_files.add(fs[1])
+    for t, fn in parse_git_ls(GRun('diff-files', '-z', *args).raw_output()):
+        if not fn in reported_files:
+            cache_files.append((t, fn))
+            reported_files.add(fn)
 
     if verbose:
         out.done()
diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index 055b152..892ebc9 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -31,7 +31,7 @@ test_expect_success 'Setup' '
     stg new p1 -m "Describe island"
 '
 
-test_expect_failure 'Refresh changes to non-ASCII file' '
+test_expect_success 'Refresh changes to non-ASCII file' '
     stg refresh
 '
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [StGit PATCH v2 0/4] Handle non-ASCII filenames
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
                     ` (3 preceding siblings ...)
  2008-06-02 21:46   ` [PATCH 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
@ 2008-06-03  0:41   ` Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
                       ` (3 more replies)
  2008-06-03  7:56   ` [StGit PATCH 0/4] Handle non-ASCII filenames Catalin Marinas
  5 siblings, 4 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  0:41 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

I fixed the first problem, and while doing so noticed that a nearby
block of code had exactly the same bug. So I fixed that as well.

Catalin, this should go on the stable branch, I believe. It probably
warrants a new release too, since anyone rebasing patches past the
point where the "Märchen" file was removed from git.git is going to
hit the same bug Jakub did.

( This is a second version of the series, this time based on Catalin's
  stable and not stable~1. As luck would have it, that last commit was
  touching the precise same part of stgit/git.py that my series did.
  Argh. )

---

Karl Hasselström (4):
      Handle refresh of changed files with non-ASCII names
      Test for another filename quoting issue in tree_status()
      Handle changed files with non-ASCII names
      Add rebase test for when upstream has deleted a non-ASCII file


 stgit/git.py                   |   31 ++++++++++++++-------
 t/t3200-non-ascii-filenames.sh |   59 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 10 deletions(-)
 create mode 100755 t/t3200-non-ascii-filenames.sh

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [StGit PATCH v2 1/4] Add rebase test for when upstream has deleted a non-ASCII file
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
@ 2008-06-03  0:41     ` Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 2/4] Handle changed files with non-ASCII names Karl Hasselström
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  0:41 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Test that stg rebase can handle upstream deleting a file with a
non-ASCII name. It currently can't.

Bug spotted by Jakub Narebski <jnareb@gmail.com>.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 t/t3200-non-ascii-filenames.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)
 create mode 100755 t/t3200-non-ascii-filenames.sh


diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
new file mode 100755
index 0000000..1d82a9f
--- /dev/null
+++ b/t/t3200-non-ascii-filenames.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+test_description='Handle files with non-ASCII characters in their names'
+
+. ./test-lib.sh
+
+test_expect_success 'Setup' '
+    echo "Fjäderholmarna" > skärgårdsö.txt &&
+    git add skärgårdsö.txt &&
+    git commit -m "Create island" &&
+    stg init &&
+    echo foo > unrelated.txt &&
+    git add unrelated.txt &&
+    stg new -m "Unrelated file" &&
+    stg refresh &&
+    stg pop &&
+    rm skärgårdsö.txt &&
+    git commit -a -m "Remove island" &&
+    git tag upstream &&
+    git reset --hard HEAD^ &&
+    stg push
+'
+
+test_expect_failure 'Rebase onto changed non-ASCII file' '
+    stg rebase upstream
+'
+
+test_done

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [StGit PATCH v2 2/4] Handle changed files with non-ASCII names
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
@ 2008-06-03  0:41     ` Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
  3 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  0:41 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Git was quoting them for us, which was not what we wanted. So call
diff-index with the -z flag, so that it doesn't.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 stgit/git.py                   |   22 +++++++++++++++-------
 t/t3200-non-ascii-filenames.sh |    2 +-
 2 files changed, 16 insertions(+), 8 deletions(-)


diff --git a/stgit/git.py b/stgit/git.py
index 6140fd9..8c637d5 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -242,13 +242,21 @@ def tree_status(files = None, tree_id = 'HEAD', unknown = False,
         args = diff_flags + [tree_id]
         if files_left:
             args += ['--'] + files_left
-        for line in GRun('diff-index', *args).output_lines():
-            fs = tuple(line.rstrip().split(' ',4)[-1].split('\t',1))
-            # the condition is needed in case files is emtpy and
-            # diff-index lists those already reported
-            if fs[1] not in reported_files:
-                cache_files.append(fs)
-                reported_files.add(fs[1])
+        t = None
+        for line in GRun('diff-index', '-z', *args).raw_output().split('\0'):
+            if not line:
+                # There's a zero byte at the end of the output, which
+                # gives us an empty string as the last "line".
+                continue
+            if t == None:
+                mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
+            else:
+                # the condition is needed in case files is emtpy and
+                # diff-index lists those already reported
+                if not line in reported_files:
+                    cache_files.append((t, line))
+                    reported_files.add(line)
+                t = None
         files_left = [f for f in files if f not in reported_files]
 
     # files in the index but changed on (or removed from) disk. Only
diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index 1d82a9f..a04ead8 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -20,7 +20,7 @@ test_expect_success 'Setup' '
     stg push
 '
 
-test_expect_failure 'Rebase onto changed non-ASCII file' '
+test_expect_success 'Rebase onto changed non-ASCII file' '
     stg rebase upstream
 '
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [StGit PATCH v2 3/4] Test for another filename quoting issue in tree_status()
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 2/4] Handle changed files with non-ASCII names Karl Hasselström
@ 2008-06-03  0:41     ` Karl Hasselström
  2008-06-03  0:41     ` [StGit PATCH v2 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
  3 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  0:41 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

stgit.git.tree_status() had another filename quoting issue, similar to
the one just fixed. Test for that one too.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 t/t3200-non-ascii-filenames.sh |   34 +++++++++++++++++++++++++++++++++-
 1 files changed, 33 insertions(+), 1 deletions(-)


diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index a04ead8..3146b8d 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -3,6 +3,12 @@ test_description='Handle files with non-ASCII characters in their names'
 
 . ./test-lib.sh
 
+# Ignore our own output files.
+cat > .git/info/exclude <<EOF
+/expected.txt
+/output.txt
+EOF
+
 test_expect_success 'Setup' '
     echo "Fjäderholmarna" > skärgårdsö.txt &&
     git add skärgårdsö.txt &&
@@ -10,7 +16,7 @@ test_expect_success 'Setup' '
     stg init &&
     echo foo > unrelated.txt &&
     git add unrelated.txt &&
-    stg new -m "Unrelated file" &&
+    stg new p0 -m "Unrelated file" &&
     stg refresh &&
     stg pop &&
     rm skärgårdsö.txt &&
@@ -24,4 +30,30 @@ test_expect_success 'Rebase onto changed non-ASCII file' '
     stg rebase upstream
 '
 
+test_expect_success 'Setup' '
+    stg delete p0 &&
+    git reset --hard HEAD^ &&
+    echo "-- ett liv mitt ute i vattnet" >> skärgårdsö.txt &&
+    stg new p1 -m "Describe island"
+'
+
+cat > expected.txt <<EOF
+M skärgårdsö.txt
+EOF
+test_expect_failure 'Status of modified non-ASCII file' '
+    stg status > output.txt &&
+    diff -u expected.txt output.txt
+'
+
+test_expect_success 'Refresh changes to non-ASCII file' '
+    stg refresh
+'
+
+cat > expected.txt <<EOF
+EOF
+test_expect_success 'Status after refresh' '
+    stg status > output.txt &&
+    diff -u expected.txt output.txt
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [StGit PATCH v2 4/4] Handle refresh of changed files with non-ASCII names
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
                       ` (2 preceding siblings ...)
  2008-06-03  0:41     ` [StGit PATCH v2 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
@ 2008-06-03  0:41     ` Karl Hasselström
  3 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  0:41 UTC (permalink / raw)
  To: Catalin Marinas, Jakub Narebski; +Cc: git

Without -z, git diff-files was quoting them for us.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 stgit/git.py                   |   43 +++++++++++++++++++++-------------------
 t/t3200-non-ascii-filenames.sh |    2 +-
 2 files changed, 24 insertions(+), 21 deletions(-)


diff --git a/stgit/git.py b/stgit/git.py
index 8c637d5..8e6bdf4 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -191,6 +191,19 @@ def ls_files(files, tree = 'HEAD', full_name = True):
         raise GitException, \
             'Some of the given paths are either missing or not known to GIT'
 
+def parse_git_ls(output):
+    t = None
+    for line in output.split('\0'):
+        if not line:
+            # There's a zero byte at the end of the output, which
+            # gives us an empty string as the last "line".
+            continue
+        if t == None:
+            mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
+        else:
+            yield (t, line)
+            t = None
+
 def tree_status(files = None, tree_id = 'HEAD', unknown = False,
                   noexclude = True, verbose = False, diff_flags = []):
     """Get the status of all changed files, or of a selected set of
@@ -242,21 +255,12 @@ def tree_status(files = None, tree_id = 'HEAD', unknown = False,
         args = diff_flags + [tree_id]
         if files_left:
             args += ['--'] + files_left
-        t = None
-        for line in GRun('diff-index', '-z', *args).raw_output().split('\0'):
-            if not line:
-                # There's a zero byte at the end of the output, which
-                # gives us an empty string as the last "line".
-                continue
-            if t == None:
-                mode_a, mode_b, sha1_a, sha1_b, t = line.split(' ')
-            else:
-                # the condition is needed in case files is emtpy and
-                # diff-index lists those already reported
-                if not line in reported_files:
-                    cache_files.append((t, line))
-                    reported_files.add(line)
-                t = None
+        for t, fn in parse_git_ls(GRun('diff-index', '-z', *args).raw_output()):
+            # the condition is needed in case files is emtpy and
+            # diff-index lists those already reported
+            if not fn in reported_files:
+                cache_files.append((t, fn))
+                reported_files.add(fn)
         files_left = [f for f in files if f not in reported_files]
 
     # files in the index but changed on (or removed from) disk. Only
@@ -267,13 +271,12 @@ def tree_status(files = None, tree_id = 'HEAD', unknown = False,
         args = list(diff_flags)
         if files_left:
             args += ['--'] + files_left
-        for line in GRun('diff-files', *args).output_lines():
-            fs = tuple(line.rstrip().split(' ',4)[-1].split('\t',1))
+        for t, fn in parse_git_ls(GRun('diff-files', '-z', *args).raw_output()):
             # the condition is needed in case files is empty and
             # diff-files lists those already reported
-            if fs[1] not in reported_files:
-                cache_files.append(fs)
-                reported_files.add(fs[1])
+            if not fn in reported_files:
+                cache_files.append((t, fn))
+                reported_files.add(fn)
 
     if verbose:
         out.done()
diff --git a/t/t3200-non-ascii-filenames.sh b/t/t3200-non-ascii-filenames.sh
index 3146b8d..1aa78ed 100755
--- a/t/t3200-non-ascii-filenames.sh
+++ b/t/t3200-non-ascii-filenames.sh
@@ -40,7 +40,7 @@ test_expect_success 'Setup' '
 cat > expected.txt <<EOF
 M skärgårdsö.txt
 EOF
-test_expect_failure 'Status of modified non-ASCII file' '
+test_expect_success 'Status of modified non-ASCII file' '
     stg status > output.txt &&
     diff -u expected.txt output.txt
 '

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [StGit PATCH 0/4] Handle non-ASCII filenames
  2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
                     ` (4 preceding siblings ...)
  2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
@ 2008-06-03  7:56   ` Catalin Marinas
  2008-06-03  9:27     ` Karl Hasselström
  5 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2008-06-03  7:56 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Jakub Narebski, git

On 02/06/2008, Karl Hasselström <kha@treskal.com> wrote:
> I fixed the first problem, and while doing so noticed that a nearby
>  block of code had exactly the same bug. So I fixed that as well.

That was fast :-). thanks.

>  Catalin, this should go on the stable branch, I believe. It probably
>  warrants a new release too, since anyone rebasing patches past the
>  point where the "Märchen" file was removed from git.git is going to
>  hit the same bug Jakub did.

Yes, it will go in both stable and master. There are some more patches
on stable already, I'll try to release 0.14.3 this weekend.

Is the new infrastructure affected? We haven't got to the point of
converting "rebase" yet (doesn't have all the functionality it needs
in the new infrastructure). BTW, I created a new lib.git.Branch class
as the parent of lib.git.Stack and it takes care the extra things used
when creating a new stack like setting parent and remote branches.
I'll post the patches in the next few days.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [StGit PATCH 0/4] Handle non-ASCII filenames
  2008-06-03  7:56   ` [StGit PATCH 0/4] Handle non-ASCII filenames Catalin Marinas
@ 2008-06-03  9:27     ` Karl Hasselström
  0 siblings, 0 replies; 18+ messages in thread
From: Karl Hasselström @ 2008-06-03  9:27 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Jakub Narebski, git

On 2008-06-03 08:56:38 +0100, Catalin Marinas wrote:

> On 02/06/2008, Karl Hasselström <kha@treskal.com> wrote:
>
> > I fixed the first problem, and while doing so noticed that a
> > nearby block of code had exactly the same bug. So I fixed that as
> > well.
>
> That was fast :-). thanks.

Jakub reported the bug right before I was about to start hacking
anyway. :-)

> >  Catalin, this should go on the stable branch, I believe. It
> >  probably warrants a new release too, since anyone rebasing
> >  patches past the point where the "Märchen" file was removed from
> >  git.git is going to hit the same bug Jakub did.
>
> Yes, it will go in both stable and master. There are some more
> patches on stable already, I'll try to release 0.14.3 this weekend.

Splendid. You can pull from my "stable" branch if you like.

> Is the new infrastructure affected? We haven't got to the point of
> converting "rebase" yet (doesn't have all the functionality it needs
> in the new infrastructure).

The problem wasn't actually about rebase. It was about mangling of
diff-index and diff-files output in tree_status(). I think if one were
to look closely, a largish number of commands would be found to be
affected.

The new infrastructure won't have this problem, by virtue of not
having any such calls yet. Plus, I try to use the -z flag whenever I
can from the outset.

> BTW, I created a new lib.git.Branch class as the parent of
> lib.git.Stack and it takes care the extra things used when creating
> a new stack like setting parent and remote branches. I'll post the
> patches in the next few days.

Looking forward to it.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2008-06-03  9:28 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-06-01  8:46 [StGIT BUG] StGIT errors out on rebasing patch deleting file with Unicode filename Jakub Narebski
2008-06-02  7:39 ` Jakub Narebski
2008-06-02 13:26   ` Catalin Marinas
2008-06-02 15:47     ` Jakub Narebski
2008-06-02 16:11       ` Catalin Marinas
2008-06-02 20:23 ` [PATCH] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
2008-06-02 21:46 ` [StGit PATCH 0/4] Handle non-ASCII filenames Karl Hasselström
2008-06-02 21:46   ` [PATCH 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
2008-06-02 21:46   ` [PATCH 2/4] Handle changed files with non-ASCII names Karl Hasselström
2008-06-02 21:46   ` [PATCH 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
2008-06-02 21:46   ` [PATCH 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
2008-06-03  0:41   ` [StGit PATCH v2 0/4] Handle non-ASCII filenames Karl Hasselström
2008-06-03  0:41     ` [StGit PATCH v2 1/4] Add rebase test for when upstream has deleted a non-ASCII file Karl Hasselström
2008-06-03  0:41     ` [StGit PATCH v2 2/4] Handle changed files with non-ASCII names Karl Hasselström
2008-06-03  0:41     ` [StGit PATCH v2 3/4] Test for another filename quoting issue in tree_status() Karl Hasselström
2008-06-03  0:41     ` [StGit PATCH v2 4/4] Handle refresh of changed files with non-ASCII names Karl Hasselström
2008-06-03  7:56   ` [StGit PATCH 0/4] Handle non-ASCII filenames Catalin Marinas
2008-06-03  9:27     ` Karl Hasselström

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).