git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/8] Initial support for Python 3
@ 2013-01-12 19:23 John Keeping
  2013-01-12 19:23 ` [PATCH 1/8] git_remote_helpers: Allow building with " John Keeping
                   ` (17 more replies)
  0 siblings, 18 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

I started having a look to see how much work would be needed to make Git
work with Python 3 and the answer is mostly not much.  The exception is
git-p4.py which is hit hard by the distinction between byte strings and
unicode strings, particularly because the Python output mode of p4
targets Python 2.

I don't know if it's worthwhile to actually apply these but here they
are in case anyone's interested.

Having said that, the changes are minimal and involve either wrapping
parentheses around arguments to print or being a bit more explicit about
how we expect byte strings to be decoded to unicode.

With these patches all tests pass with python3 except t98* (git-p4), but
there are a couple of topics in-flight which will affect that
(fc/remote-testgit-feature-done and er/replace-cvsimport).

John Keeping (8):
  git_remote_helpers: Allow building with Python 3
  git_remote_helpers: fix input when running under Python 3
  git_remote_helpers: Force rebuild if python version changes
  git_remote_helpers: Use 2to3 if building with Python 3
  svn-fe: allow svnrdump_sim.py to run with Python 3
  git-remote-testpy: hash bytes explicitly
  git-remote-testpy: don't do unbuffered text I/O
  git-remote-testpy: call print as a function

 contrib/svn-fe/svnrdump_sim.py     |  4 ++--
 git-remote-testpy.py               | 40 +++++++++++++++++++-------------------
 git_remote_helpers/.gitignore      |  1 +
 git_remote_helpers/Makefile        | 10 ++++++++--
 git_remote_helpers/git/importer.py |  2 +-
 git_remote_helpers/setup.py        | 10 ++++++++++
 6 files changed, 42 insertions(+), 25 deletions(-)

-- 
1.8.1

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 1/8] git_remote_helpers: Allow building with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 19:23 ` [PATCH 2/8] git_remote_helpers: fix input when running under " John Keeping
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Change inline Python to call "print" as a function not a statement.

This is harmless because Python 2 will see the parentheses as redundant
grouping but they are necessary to run this code with Python 3.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
index 74b05dc..f65f064 100644
--- a/git_remote_helpers/Makefile
+++ b/git_remote_helpers/Makefile
@@ -23,7 +23,7 @@ endif
 
 PYLIBDIR=$(shell $(PYTHON_PATH) -c \
 	 "import sys; \
-	 print 'lib/python%i.%i/site-packages' % sys.version_info[:2]")
+	 print('lib/python%i.%i/site-packages' % sys.version_info[:2])")
 
 all: $(pysetupfile)
 	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
  2013-01-12 19:23 ` [PATCH 1/8] git_remote_helpers: Allow building with " John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-13  3:26   ` Michael Haggerty
  2013-01-12 19:23 ` [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes John Keeping
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Although 2to3 will fix most issues in Python 2 code to make it run under
Python 3, it does not handle the new strict separation between byte
strings and unicode strings.  There is one instance in
git_remote_helpers where we are caught by this.

Fix it by explicitly decoding the incoming byte string into a unicode
string.  In this instance, use the locale under which the application is
running.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/git/importer.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
index e28cc8f..6814003 100644
--- a/git_remote_helpers/git/importer.py
+++ b/git_remote_helpers/git/importer.py
@@ -20,7 +20,7 @@ class GitImporter(object):
         """Returns a dictionary with refs.
         """
         args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
-        lines = check_output(args).strip().split('\n')
+        lines = check_output(args).decode().strip().split('\n')
         refs = {}
         for line in lines:
             value, name = line.split(' ')
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
  2013-01-12 19:23 ` [PATCH 1/8] git_remote_helpers: Allow building with " John Keeping
  2013-01-12 19:23 ` [PATCH 2/8] git_remote_helpers: fix input when running under " John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 23:30   ` Pete Wyckoff
  2013-01-12 19:23 ` [PATCH 4/8] git_remote_helpers: Use 2to3 if building with Python 3 John Keeping
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

When different version of python are used to build via distutils, the
behaviour can change.  Detect changes in version and pass --force in
this case.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/.gitignore | 1 +
 git_remote_helpers/Makefile   | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/git_remote_helpers/.gitignore b/git_remote_helpers/.gitignore
index 2247d5f..06c664f 100644
--- a/git_remote_helpers/.gitignore
+++ b/git_remote_helpers/.gitignore
@@ -1,2 +1,3 @@
+/GIT-PYTHON_VERSION
 /build
 /dist
diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
index f65f064..91f458f 100644
--- a/git_remote_helpers/Makefile
+++ b/git_remote_helpers/Makefile
@@ -25,8 +25,14 @@ PYLIBDIR=$(shell $(PYTHON_PATH) -c \
 	 "import sys; \
 	 print('lib/python%i.%i/site-packages' % sys.version_info[:2])")
 
+py_version=$(shell $(PYTHON_PATH) -c \
+	'import sys; print("%i.%i" % sys.version_info[:2])')
+
 all: $(pysetupfile)
-	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
+	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
+	flags=--force; \
+	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
+	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
 
 install: $(pysetupfile)
 	$(PYTHON_PATH) $(pysetupfile) install --prefix $(DESTDIR_SQ)$(prefix)
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 4/8] git_remote_helpers: Use 2to3 if building with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (2 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 19:23 ` [PATCH 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Using the approach detailed on the Python wiki[1], run 2to3 on the code
as part of the build if building with Python 3.

The code itself requires no changes to convert cleanly.

[1] http://wiki.python.org/moin/PortingPythonToPy3k

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/setup.py | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/git_remote_helpers/setup.py b/git_remote_helpers/setup.py
index 4d434b6..6de41de 100644
--- a/git_remote_helpers/setup.py
+++ b/git_remote_helpers/setup.py
@@ -4,6 +4,15 @@
 
 from distutils.core import setup
 
+# If building under Python3 we need to run 2to3 on the code, do this by
+# trying to import distutils' 2to3 builder, which is only available in
+# Python3.
+try:
+    from distutils.command.build_py import build_py_2to3 as build_py
+except ImportError:
+    # 2.x
+    from distutils.command.build_py import build_py
+
 setup(
     name = 'git_remote_helpers',
     version = '0.1.0',
@@ -14,4 +23,5 @@ setup(
     url = 'http://www.git-scm.com/',
     package_dir = {'git_remote_helpers': ''},
     packages = ['git_remote_helpers', 'git_remote_helpers.git'],
+    cmdclass = {'build_py': build_py},
 )
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 5/8] svn-fe: allow svnrdump_sim.py to run with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (3 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 4/8] git_remote_helpers: Use 2to3 if building with Python 3 John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 19:23 ` [PATCH 6/8] git-remote-testpy: hash bytes explicitly John Keeping
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

The changes to allow this script to run with Python 3 are minimal and do
not affect its functionality on the versions of Python 2 that are
already supported (2.4 onwards).

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 contrib/svn-fe/svnrdump_sim.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
index 17cf6f9..4e78a1c 100755
--- a/contrib/svn-fe/svnrdump_sim.py
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -14,7 +14,7 @@ if sys.hexversion < 0x02040000:
 
 def getrevlimit():
         var = 'SVNRMAX'
-        if os.environ.has_key(var):
+        if var in os.environ:
                 return os.environ[var]
         return None
 
@@ -44,7 +44,7 @@ def writedump(url, lower, upper):
 
 if __name__ == "__main__":
         if not (len(sys.argv) in (3, 4, 5)):
-                print "usage: %s dump URL -rLOWER:UPPER"
+                print("usage: %s dump URL -rLOWER:UPPER")
                 sys.exit(1)
         if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is suppported.')
         url = sys.argv[2]
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (4 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 19:23 ` [PATCH 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Under Python 3 'hasher.update(...)' must take a byte string and not a
unicode string.  Explicitly encode the argument to this method as UTF-8
so that this code works under Python 3.

This moves the required Python version forward to 2.0.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index e4533b1..58aa1ae 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
 from git_remote_helpers.git.importer import GitImporter
 from git_remote_helpers.git.non_local import NonLocalGit
 
-if sys.hexversion < 0x01050200:
-    # os.makedirs() is the limiter
-    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
+if sys.hexversion < 0x02000000:
+    # string.encode() is the limiter
+    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
     sys.exit(1)
 
 def get_repo(alias, url):
@@ -45,7 +45,7 @@ def get_repo(alias, url):
     repo.get_head()
 
     hasher = _digest()
-    hasher.update(repo.path)
+    hasher.update(repo.path.encode('utf-8'))
     repo.hash = hasher.hexdigest()
 
     repo.get_base_path = lambda base: os.path.join(
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 7/8] git-remote-testpy: don't do unbuffered text I/O
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (5 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 6/8] git-remote-testpy: hash bytes explicitly John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 19:23 ` [PATCH 8/8] git-remote-testpy: call print as a function John Keeping
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Python 3 forbids unbuffered I/O in text mode.  Change the reading of
stdin in git-remote-testpy so that we read the lines as bytes and then
decode them a line at a time.

This allows us to keep the I/O unbuffered in order to avoid
reintroducing the bug fixed by commit 7fb8e16 (git-remote-testgit: fix
race when spawning fast-import).

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index 58aa1ae..815222f 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -154,7 +154,7 @@ def do_import(repo, args):
     refs = [ref]
 
     while True:
-        line = sys.stdin.readline()
+        line = sys.stdin.readline().decode()
         if line == '\n':
             break
         if not line.startswith('import '):
@@ -217,7 +217,7 @@ def read_one_line(repo):
 
     line = sys.stdin.readline()
 
-    cmdline = line
+    cmdline = line.decode()
 
     if not cmdline:
         warn("Unexpected EOF")
@@ -269,7 +269,7 @@ def main(args):
 
     more = True
 
-    sys.stdin = os.fdopen(sys.stdin.fileno(), 'r', 0)
+    sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)
     while (more):
         more = read_one_line(repo)
 
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 8/8] git-remote-testpy: call print as a function
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (6 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
@ 2013-01-12 19:23 ` John Keeping
  2013-01-12 23:43 ` [PATCH 0/8] Initial support for Python 3 Pete Wyckoff
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-12 19:23 UTC (permalink / raw)
  To: git; +Cc: John Keeping, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

This is harmless in Python 2, which sees the parentheses as redundant
grouping, but is required for Python 3.  Since this is the only change
required to make this script just run under Python 3 without needing
2to3 it seems worthwhile.

The case of an empty print must be handled specially because in that
case Python 2 will interpret '()' as an empty tuple and print it as
'()'; inserting an empty string fixes this.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index 815222f..8ba5d28 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -87,9 +87,9 @@ def do_capabilities(repo, args):
     """Prints the supported capabilities.
     """
 
-    print "import"
-    print "export"
-    print "refspec refs/heads/*:%s*" % repo.prefix
+    print("import")
+    print("export")
+    print("refspec refs/heads/*:%s*" % repo.prefix)
 
     dirname = repo.get_base_path(repo.gitdir)
 
@@ -98,11 +98,11 @@ def do_capabilities(repo, args):
 
     path = os.path.join(dirname, 'git.marks')
 
-    print "*export-marks %s" % path
+    print("*export-marks %s" % path)
     if os.path.exists(path):
-        print "*import-marks %s" % path
+        print("*import-marks %s" % path)
 
-    print # end capabilities
+    print('') # end capabilities
 
 
 def do_list(repo, args):
@@ -115,16 +115,16 @@ def do_list(repo, args):
 
     for ref in repo.revs:
         debug("? refs/heads/%s", ref)
-        print "? refs/heads/%s" % ref
+        print("? refs/heads/%s" % ref)
 
     if repo.head:
         debug("@refs/heads/%s HEAD" % repo.head)
-        print "@refs/heads/%s HEAD" % repo.head
+        print("@refs/heads/%s HEAD" % repo.head)
     else:
         debug("@refs/heads/master HEAD")
-        print "@refs/heads/master HEAD"
+        print("@refs/heads/master HEAD")
 
-    print # end list
+    print('') # end list
 
 
 def update_local_repo(repo):
@@ -167,7 +167,7 @@ def do_import(repo, args):
     repo = update_local_repo(repo)
     repo.exporter.export_repo(repo.gitdir, refs)
 
-    print "done"
+    print("done")
 
 
 def do_export(repo, args):
@@ -184,8 +184,8 @@ def do_export(repo, args):
         repo.non_local.push(repo.gitdir)
 
     for ref in changed:
-        print "ok %s" % ref
-    print
+        print("ok %s" % ref)
+    print('')
 
 
 COMMANDS = {
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-12 19:23 ` [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes John Keeping
@ 2013-01-12 23:30   ` Pete Wyckoff
  2013-01-13 16:26     ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-12 23:30 UTC (permalink / raw)
  To: John Keeping; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
> When different version of python are used to build via distutils, the
> behaviour can change.  Detect changes in version and pass --force in
> this case.
[..]
> diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
[..]
> +py_version=$(shell $(PYTHON_PATH) -c \
> +	'import sys; print("%i.%i" % sys.version_info[:2])')
> +
>  all: $(pysetupfile)
> -	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
> +	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
> +	flags=--force; \
> +	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
> +	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION

Can you depend on ../GIT-PYTHON-VARS instead?  It comes from
96a4647 (Makefile: detect when PYTHON_PATH changes, 2012-12-18).
It doesn't check version, just path, but hopefully that's good
enough.  I'm imagining a rule that would do "clean" if
../GIT-PYTHON-VARS changed, then build without --force.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/8] Initial support for Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (7 preceding siblings ...)
  2013-01-12 19:23 ` [PATCH 8/8] git-remote-testpy: call print as a function John Keeping
@ 2013-01-12 23:43 ` Pete Wyckoff
  2013-01-13  0:41   ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 0/8] Initial Python 3 support John Keeping
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-12 23:43 UTC (permalink / raw)
  To: John Keeping
  Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier,
	Sebastian Morr

john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
> I started having a look to see how much work would be needed to make Git
> work with Python 3 and the answer is mostly not much.  The exception is
> git-p4.py which is hit hard by the distinction between byte strings and
> unicode strings, particularly because the Python output mode of p4
> targets Python 2.
> 
> I don't know if it's worthwhile to actually apply these but here they
> are in case anyone's interested.
> 
> Having said that, the changes are minimal and involve either wrapping
> parentheses around arguments to print or being a bit more explicit about
> how we expect byte strings to be decoded to unicode.
> 
> With these patches all tests pass with python3 except t98* (git-p4), but
> there are a couple of topics in-flight which will affect that
> (fc/remote-testgit-feature-done and er/replace-cvsimport).
> 
> John Keeping (8):
>   git_remote_helpers: Allow building with Python 3
>   git_remote_helpers: fix input when running under Python 3
>   git_remote_helpers: Force rebuild if python version changes
>   git_remote_helpers: Use 2to3 if building with Python 3
>   svn-fe: allow svnrdump_sim.py to run with Python 3
>   git-remote-testpy: hash bytes explicitly
>   git-remote-testpy: don't do unbuffered text I/O
>   git-remote-testpy: call print as a function
> 
>  contrib/svn-fe/svnrdump_sim.py     |  4 ++--
>  git-remote-testpy.py               | 40 +++++++++++++++++++-------------------
>  git_remote_helpers/.gitignore      |  1 +
>  git_remote_helpers/Makefile        | 10 ++++++++--
>  git_remote_helpers/git/importer.py |  2 +-
>  git_remote_helpers/setup.py        | 10 ++++++++++
>  6 files changed, 42 insertions(+), 25 deletions(-)

These look good, in that there are relatively few changed needed.

Sebastian Morr tried a similar patch a year ago, in

    http://thread.gmane.org/gmane.comp.version-control.git/187545

He made changes beyond yours, in particular "print >>" lines,
that you seem to handle with 2to3 during the build.  I'm not sure
which approach is better in the long run.  He worked on the
other .py in contrib/ too.

Can you give me some hints about the byte/unicode string issues
in git-p4.py?  There's really only one place that does:

    p4 = subprocess.Popen("p4 -G ...")
    marshal.load(p4.stdout)

If that's the only issue, this might not be too paniful.

I hesitated to take Sebastian's changes due to the huge number of
print() lines, but maybe a 2to3 approach would make that aspect
of python3 support not too onerous.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/8] Initial support for Python 3
  2013-01-12 23:43 ` [PATCH 0/8] Initial support for Python 3 Pete Wyckoff
@ 2013-01-13  0:41   ` John Keeping
  2013-01-13 12:34     ` John Keeping
  2013-01-13 16:40     ` Pete Wyckoff
  0 siblings, 2 replies; 53+ messages in thread
From: John Keeping @ 2013-01-13  0:41 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier,
	Sebastian Morr

On Sat, Jan 12, 2013 at 06:43:04PM -0500, Pete Wyckoff wrote:
> john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
>> I started having a look to see how much work would be needed to make Git
>> work with Python 3 and the answer is mostly not much.  The exception is
>> git-p4.py which is hit hard by the distinction between byte strings and
>> unicode strings, particularly because the Python output mode of p4
>> targets Python 2.
>> 
>> I don't know if it's worthwhile to actually apply these but here they
>> are in case anyone's interested.
>> 
>> Having said that, the changes are minimal and involve either wrapping
>> parentheses around arguments to print or being a bit more explicit about
>> how we expect byte strings to be decoded to unicode.
>> 
>> With these patches all tests pass with python3 except t98* (git-p4), but
>> there are a couple of topics in-flight which will affect that
>> (fc/remote-testgit-feature-done and er/replace-cvsimport).
>> 
>> John Keeping (8):
>>   git_remote_helpers: Allow building with Python 3
>>   git_remote_helpers: fix input when running under Python 3
>>   git_remote_helpers: Force rebuild if python version changes
>>   git_remote_helpers: Use 2to3 if building with Python 3
>>   svn-fe: allow svnrdump_sim.py to run with Python 3
>>   git-remote-testpy: hash bytes explicitly
>>   git-remote-testpy: don't do unbuffered text I/O
>>   git-remote-testpy: call print as a function
>> 
>>  contrib/svn-fe/svnrdump_sim.py     |  4 ++--
>>  git-remote-testpy.py               | 40 +++++++++++++++++++-------------------
>>  git_remote_helpers/.gitignore      |  1 +
>>  git_remote_helpers/Makefile        | 10 ++++++++--
>>  git_remote_helpers/git/importer.py |  2 +-
>>  git_remote_helpers/setup.py        | 10 ++++++++++
>>  6 files changed, 42 insertions(+), 25 deletions(-)
> 
> These look good, in that there are relatively few changed needed.
> 
> Sebastian Morr tried a similar patch a year ago, in
> 
>     http://thread.gmane.org/gmane.comp.version-control.git/187545
> 
> He made changes beyond yours, in particular "print >>" lines,
> that you seem to handle with 2to3 during the build.  I'm not sure
> which approach is better in the long run.  He worked on the
> other .py in contrib/ too.

In the long run I'd want to move away from "print >>" to use
"print(file=..., ...)" but that's only available from Python 2.6 onwards
(via a __future__ import) and I think we probably don't want to rule out
Python 2.5 yet.

Without 2to3 the only way to do this for both Python 2 and 3 is as
"file.write('...\n')".

> Can you give me some hints about the byte/unicode string issues
> in git-p4.py?  There's really only one place that does:
> 
>     p4 = subprocess.Popen("p4 -G ...")
>     marshal.load(p4.stdout)
> 
> If that's the only issue, this might not be too paniful.

The problem is that what gets loaded there is a dictionary (encoded by
p4) that maps byte strings to byte strings, so all of the accesses to
that dictionary need to either:

   1) explicitly call encode() on a string constant
or 2) use a byte string constant with a "b" prefix

Or we could re-write the dictionary once, which handles the keys... but
some of the values are also used as strings and we can't handle that as
a one-off conversion since in other places we really do want the byte
string (think content of binary files).

Basically a thorough audit of all access to variables that come from p4
would be needed, with explicit decode()s for authors, dates, etc.

> I hesitated to take Sebastian's changes due to the huge number of
> print() lines, but maybe a 2to3 approach would make that aspect
> of python3 support not too onerous.

I think we'd want to change to print() eventually and having a single
codebase for 2 and 3 would be nicer for development, but I think we need
to be able to say "no one is using Python 2.5 or earlier" before we can
do that and I'm not sure we're there yet.  From where we are at the
moment I think 2to3 is a good answer, particularly where we're already
using distutils to generate a release image.


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-12 19:23 ` [PATCH 2/8] git_remote_helpers: fix input when running under " John Keeping
@ 2013-01-13  3:26   ` Michael Haggerty
  2013-01-13 16:17     ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Haggerty @ 2013-01-13  3:26 UTC (permalink / raw)
  To: John Keeping; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On 01/12/2013 08:23 PM, John Keeping wrote:
> Although 2to3 will fix most issues in Python 2 code to make it run under
> Python 3, it does not handle the new strict separation between byte
> strings and unicode strings.  There is one instance in
> git_remote_helpers where we are caught by this.
> 
> Fix it by explicitly decoding the incoming byte string into a unicode
> string.  In this instance, use the locale under which the application is
> running.
> 
> Signed-off-by: John Keeping <john@keeping.me.uk>
> ---
>  git_remote_helpers/git/importer.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
> index e28cc8f..6814003 100644
> --- a/git_remote_helpers/git/importer.py
> +++ b/git_remote_helpers/git/importer.py
> @@ -20,7 +20,7 @@ class GitImporter(object):
>          """Returns a dictionary with refs.
>          """
>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
> -        lines = check_output(args).strip().split('\n')
> +        lines = check_output(args).decode().strip().split('\n')
>          refs = {}
>          for line in lines:
>              value, name = line.split(' ')
> 

Won't this change cause an exception if the branch names are not all
valid strings in the current locale's encoding?  I don't see how this
assumption is justified (e.g., see git-check-ref-format(1) for the rules
governing reference names).

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/8] Initial support for Python 3
  2013-01-13  0:41   ` John Keeping
@ 2013-01-13 12:34     ` John Keeping
  2013-01-13 16:40     ` Pete Wyckoff
  1 sibling, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-13 12:34 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier,
	Sebastian Morr

On Sun, Jan 13, 2013 at 12:41:30AM +0000, John Keeping wrote:
> On Sat, Jan 12, 2013 at 06:43:04PM -0500, Pete Wyckoff wrote:
>> Can you give me some hints about the byte/unicode string issues
>> in git-p4.py?  There's really only one place that does:
>> 
>>     p4 = subprocess.Popen("p4 -G ...")
>>     marshal.load(p4.stdout)
>> 
>> If that's the only issue, this might not be too paniful.
> 
> The problem is that what gets loaded there is a dictionary (encoded by
> p4) that maps byte strings to byte strings, so all of the accesses to
> that dictionary need to either:
> 
>    1) explicitly call encode() on a string constant
> or 2) use a byte string constant with a "b" prefix
> 
> Or we could re-write the dictionary once, which handles the keys... but
> some of the values are also used as strings and we can't handle that as
> a one-off conversion since in other places we really do want the byte
> string (think content of binary files).
> 
> Basically a thorough audit of all access to variables that come from p4
> would be needed, with explicit decode()s for authors, dates, etc.

Having thought about this a bit more, another possibility would be to
apply this transformation once using something like this (completely
untested, I haven't looked up the keys of interest):

-- >8 --

def _noop(s):
    return s

def _decode(s):
    return s.decode('utf-8')

CONVERSION_MAP = {
    'user': _decode,
    'data': _decode
}

d = marshal.load(p4.stdout)
retval = {}
for k, v in d.items():
    key = k.decode('utf-8')
    retval[key] = CONVERSION_MAP.get(key, _noop)(v)
return retval

-- 8< --

Obviously this isn't ideal but without p4 gaining a Python 3 output mode
I suspect this would be the best we could do.


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-13  3:26   ` Michael Haggerty
@ 2013-01-13 16:17     ` John Keeping
  2013-01-14  4:48       ` Michael Haggerty
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-13 16:17 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On Sun, Jan 13, 2013 at 04:26:39AM +0100, Michael Haggerty wrote:
> On 01/12/2013 08:23 PM, John Keeping wrote:
>> Although 2to3 will fix most issues in Python 2 code to make it run under
>> Python 3, it does not handle the new strict separation between byte
>> strings and unicode strings.  There is one instance in
>> git_remote_helpers where we are caught by this.
>> 
>> Fix it by explicitly decoding the incoming byte string into a unicode
>> string.  In this instance, use the locale under which the application is
>> running.
>> 
>> Signed-off-by: John Keeping <john@keeping.me.uk>
>> ---
>>  git_remote_helpers/git/importer.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
>> index e28cc8f..6814003 100644
>> --- a/git_remote_helpers/git/importer.py
>> +++ b/git_remote_helpers/git/importer.py
>> @@ -20,7 +20,7 @@ class GitImporter(object):
>>          """Returns a dictionary with refs.
>>          """
>>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
>> -        lines = check_output(args).strip().split('\n')
>> +        lines = check_output(args).decode().strip().split('\n')
>>          refs = {}
>>          for line in lines:
>>              value, name = line.split(' ')
>> 
> 
> Won't this change cause an exception if the branch names are not all
> valid strings in the current locale's encoding?  I don't see how this
> assumption is justified (e.g., see git-check-ref-format(1) for the rules
> governing reference names).

Yes it will.  The problem is that for Python 3 we need to decode the
byte string into a unicode string, which means we need to know what
encoding it is.

I don't think we can just say "git-for-each-ref will print refs in
UTF-8" since AFAIK git doesn't care what encoding the refs are in - I
suspect that's determined by the filesystem which in the end probably
maps to whatever bytes the shell fed git when the ref was created.

That's why I chose the current locale in this case.  I'm hoping someone
here will correct me if we can do better, but I don't see any way of
avoiding choosing some encoding here if we want to support Python 3
(which I think we will, even if we don't right now).


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-12 23:30   ` Pete Wyckoff
@ 2013-01-13 16:26     ` John Keeping
  2013-01-13 17:14       ` Pete Wyckoff
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-13 16:26 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On Sat, Jan 12, 2013 at 06:30:44PM -0500, Pete Wyckoff wrote:
> john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
>> When different version of python are used to build via distutils, the
>> behaviour can change.  Detect changes in version and pass --force in
>> this case.
>[..]
>> diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
>[..]
>> +py_version=$(shell $(PYTHON_PATH) -c \
>> +	'import sys; print("%i.%i" % sys.version_info[:2])')
>> +
>>  all: $(pysetupfile)
>> -	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
>> +	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
>> +	flags=--force; \
>> +	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
>> +	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
> 
> Can you depend on ../GIT-PYTHON-VARS instead?  It comes from
> 96a4647 (Makefile: detect when PYTHON_PATH changes, 2012-12-18).
> It doesn't check version, just path, but hopefully that's good
> enough.  I'm imagining a rule that would do "clean" if
> ../GIT-PYTHON-VARS changed, then build without --force.

I was trying to keep the git_remote_helpers directory self contained.  I
can't see how to depend on ../GIT-PYTHON-VARS in a way that is as simple
as this and keeps "make -C git_remote_helpers" working in a clean tree.

Am I missing something obvious here?


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/8] Initial support for Python 3
  2013-01-13  0:41   ` John Keeping
  2013-01-13 12:34     ` John Keeping
@ 2013-01-13 16:40     ` Pete Wyckoff
  2013-01-13 17:35       ` John Keeping
  1 sibling, 1 reply; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-13 16:40 UTC (permalink / raw)
  To: John Keeping
  Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier,
	Sebastian Morr

john@keeping.me.uk wrote on Sun, 13 Jan 2013 00:41 +0000:
> On Sat, Jan 12, 2013 at 06:43:04PM -0500, Pete Wyckoff wrote:
> > Can you give me some hints about the byte/unicode string issues
> > in git-p4.py?  There's really only one place that does:
> > 
> >     p4 = subprocess.Popen("p4 -G ...")
> >     marshal.load(p4.stdout)
> > 
> > If that's the only issue, this might not be too paniful.
> 
> The problem is that what gets loaded there is a dictionary (encoded by
> p4) that maps byte strings to byte strings, so all of the accesses to
> that dictionary need to either:
> 
>    1) explicitly call encode() on a string constant
> or 2) use a byte string constant with a "b" prefix
> 
> Or we could re-write the dictionary once, which handles the keys... but
> some of the values are also used as strings and we can't handle that as
> a one-off conversion since in other places we really do want the byte
> string (think content of binary files).
> 
> Basically a thorough audit of all access to variables that come from p4
> would be needed, with explicit decode()s for authors, dates, etc.

Your auto-conversion snippet in the follow-up mail would work
fine for most keys and values.  A few perforce docs and some
playing around convince me that it is mostly utf-8, except for
file data for particular types.

I'd still rather handle each command separately, and think about
the conversions, to do it right in the long run.

> > I hesitated to take Sebastian's changes due to the huge number of
> > print() lines, but maybe a 2to3 approach would make that aspect
> > of python3 support not too onerous.
> 
> I think we'd want to change to print() eventually and having a single
> codebase for 2 and 3 would be nicer for development, but I think we need
> to be able to say "no one is using Python 2.5 or earlier" before we can
> do that and I'm not sure we're there yet.  From where we are at the
> moment I think 2to3 is a good answer, particularly where we're already
> using distutils to generate a release image.

Agreed.  The 2to3 diff is large but straightforward.  But these
p4 -G interface errors require a lot of thought and work.  I'm
not too eager to work on this yet.

Thanks.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-13 16:26     ` John Keeping
@ 2013-01-13 17:14       ` Pete Wyckoff
  2013-01-13 17:52         ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-13 17:14 UTC (permalink / raw)
  To: John Keeping; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

john@keeping.me.uk wrote on Sun, 13 Jan 2013 16:26 +0000:
> On Sat, Jan 12, 2013 at 06:30:44PM -0500, Pete Wyckoff wrote:
> > john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
> >> When different version of python are used to build via distutils, the
> >> behaviour can change.  Detect changes in version and pass --force in
> >> this case.
> >[..]
> >> diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
> >[..]
> >> +py_version=$(shell $(PYTHON_PATH) -c \
> >> +	'import sys; print("%i.%i" % sys.version_info[:2])')
> >> +
> >>  all: $(pysetupfile)
> >> -	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
> >> +	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
> >> +	flags=--force; \
> >> +	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
> >> +	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
> > 
> > Can you depend on ../GIT-PYTHON-VARS instead?  It comes from
> > 96a4647 (Makefile: detect when PYTHON_PATH changes, 2012-12-18).
> > It doesn't check version, just path, but hopefully that's good
> > enough.  I'm imagining a rule that would do "clean" if
> > ../GIT-PYTHON-VARS changed, then build without --force.
> 
> I was trying to keep the git_remote_helpers directory self contained.  I
> can't see how to depend on ../GIT-PYTHON-VARS in a way that is as simple
> as this and keeps "make -C git_remote_helpers" working in a clean tree.
> 
> Am I missing something obvious here?

Not if it wants to stay self-contained; you're right.

I'm not thrilled with how git_remote_helpers/Makefile always
runs setup.py, and always generates PYLIBDIR, and now always
invokes python a third time to see if its version changed.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/8] Initial support for Python 3
  2013-01-13 16:40     ` Pete Wyckoff
@ 2013-01-13 17:35       ` John Keeping
  0 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-13 17:35 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier,
	Sebastian Morr

On Sun, Jan 13, 2013 at 11:40:45AM -0500, Pete Wyckoff wrote:
> john@keeping.me.uk wrote on Sun, 13 Jan 2013 00:41 +0000:
>> On Sat, Jan 12, 2013 at 06:43:04PM -0500, Pete Wyckoff wrote:
>> > Can you give me some hints about the byte/unicode string issues
>> > in git-p4.py?  There's really only one place that does:
>> > 
>> >     p4 = subprocess.Popen("p4 -G ...")
>> >     marshal.load(p4.stdout)
>> > 
>> > If that's the only issue, this might not be too paniful.
>> 
>> The problem is that what gets loaded there is a dictionary (encoded by
>> p4) that maps byte strings to byte strings, so all of the accesses to
>> that dictionary need to either:
>> 
>>    1) explicitly call encode() on a string constant
>> or 2) use a byte string constant with a "b" prefix
>> 
>> Or we could re-write the dictionary once, which handles the keys... but
>> some of the values are also used as strings and we can't handle that as
>> a one-off conversion since in other places we really do want the byte
>> string (think content of binary files).
>> 
>> Basically a thorough audit of all access to variables that come from p4
>> would be needed, with explicit decode()s for authors, dates, etc.
> 
> Your auto-conversion snippet in the follow-up mail would work
> fine for most keys and values.  A few perforce docs and some
> playing around convince me that it is mostly utf-8, except for
> file data for particular types.
> 
> I'd still rather handle each command separately, and think about
> the conversions, to do it right in the long run.

I sent that on the assumption that the same key would have similar
semantics wherever its used, but I don't use git-p4 or know much about
perforce.

It would be interesting to know whether there is any likelihood of p4
gaining a Python 3 output mode (since the documentation currently say
not to use "p4 -G" with Python 3).  If it does then I would assume that
it will make a sensible choice about unicode/bytes such that the
existing git-p4 would Just Work with only a small change to the
invocation of p4 to add the new argument.

>> > I hesitated to take Sebastian's changes due to the huge number of
>> > print() lines, but maybe a 2to3 approach would make that aspect
>> > of python3 support not too onerous.
>> 
>> I think we'd want to change to print() eventually and having a single
>> codebase for 2 and 3 would be nicer for development, but I think we need
>> to be able to say "no one is using Python 2.5 or earlier" before we can
>> do that and I'm not sure we're there yet.  From where we are at the
>> moment I think 2to3 is a good answer, particularly where we're already
>> using distutils to generate a release image.
> 
> Agreed.  The 2to3 diff is large but straightforward.  But these
> p4 -G interface errors require a lot of thought and work.  I'm
> not too eager to work on this yet.

Fair enough.  As I don't use git-p4, it's not something I intend to
tackle either (given the scale of the changes involved).

Given the minimal scope of the changes needed for everything else, I
sent this series wondering whether it's sensible to move forward on the
basis of "Python scripts except git-p4 work with Python 3.  You must use
Python 2 if you want to use git-p4".


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-13 17:14       ` Pete Wyckoff
@ 2013-01-13 17:52         ` John Keeping
  2013-01-15 22:58           ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-13 17:52 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On Sun, Jan 13, 2013 at 12:14:02PM -0500, Pete Wyckoff wrote:
> john@keeping.me.uk wrote on Sun, 13 Jan 2013 16:26 +0000:
>> On Sat, Jan 12, 2013 at 06:30:44PM -0500, Pete Wyckoff wrote:
>> > john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
>> >> When different version of python are used to build via distutils, the
>> >> behaviour can change.  Detect changes in version and pass --force in
>> >> this case.
>> >[..]
>> >> diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
>> >[..]
>> >> +py_version=$(shell $(PYTHON_PATH) -c \
>> >> +	'import sys; print("%i.%i" % sys.version_info[:2])')
>> >> +
>> >>  all: $(pysetupfile)
>> >> -	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
>> >> +	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
>> >> +	flags=--force; \
>> >> +	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
>> >> +	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
>> > 
>> > Can you depend on ../GIT-PYTHON-VARS instead?  It comes from
>> > 96a4647 (Makefile: detect when PYTHON_PATH changes, 2012-12-18).
>> > It doesn't check version, just path, but hopefully that's good
>> > enough.  I'm imagining a rule that would do "clean" if
>> > ../GIT-PYTHON-VARS changed, then build without --force.
>> 
>> I was trying to keep the git_remote_helpers directory self contained.  I
>> can't see how to depend on ../GIT-PYTHON-VARS in a way that is as simple
>> as this and keeps "make -C git_remote_helpers" working in a clean tree.
>> 
>> Am I missing something obvious here?
> 
> Not if it wants to stay self-contained; you're right.
> 
> I'm not thrilled with how git_remote_helpers/Makefile always
> runs setup.py, and always generates PYLIBDIR, and now always
> invokes python a third time to see if its version changed.

I don't think PYLIBDIR will be calculated unless it's used ('=' not
':=' means its a deferred variable).

I wonder if the version check should move into setup.py - it would be
just as easy to check the file there and massage sys.args, although
possibly not as neat.


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-13 16:17     ` John Keeping
@ 2013-01-14  4:48       ` Michael Haggerty
  2013-01-14  9:47         ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Michael Haggerty @ 2013-01-14  4:48 UTC (permalink / raw)
  To: John Keeping; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On 01/13/2013 05:17 PM, John Keeping wrote:
> On Sun, Jan 13, 2013 at 04:26:39AM +0100, Michael Haggerty wrote:
>> On 01/12/2013 08:23 PM, John Keeping wrote:
>>> Although 2to3 will fix most issues in Python 2 code to make it run under
>>> Python 3, it does not handle the new strict separation between byte
>>> strings and unicode strings.  There is one instance in
>>> git_remote_helpers where we are caught by this.
>>>
>>> Fix it by explicitly decoding the incoming byte string into a unicode
>>> string.  In this instance, use the locale under which the application is
>>> running.
>>>
>>> Signed-off-by: John Keeping <john@keeping.me.uk>
>>> ---
>>>  git_remote_helpers/git/importer.py | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
>>> index e28cc8f..6814003 100644
>>> --- a/git_remote_helpers/git/importer.py
>>> +++ b/git_remote_helpers/git/importer.py
>>> @@ -20,7 +20,7 @@ class GitImporter(object):
>>>          """Returns a dictionary with refs.
>>>          """
>>>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
>>> -        lines = check_output(args).strip().split('\n')
>>> +        lines = check_output(args).decode().strip().split('\n')
>>>          refs = {}
>>>          for line in lines:
>>>              value, name = line.split(' ')
>>>
>>
>> Won't this change cause an exception if the branch names are not all
>> valid strings in the current locale's encoding?  I don't see how this
>> assumption is justified (e.g., see git-check-ref-format(1) for the rules
>> governing reference names).
> 
> Yes it will.  The problem is that for Python 3 we need to decode the
> byte string into a unicode string, which means we need to know what
> encoding it is.
> 
> I don't think we can just say "git-for-each-ref will print refs in
> UTF-8" since AFAIK git doesn't care what encoding the refs are in - I
> suspect that's determined by the filesystem which in the end probably
> maps to whatever bytes the shell fed git when the ref was created.
> 
> That's why I chose the current locale in this case.  I'm hoping someone
> here will correct me if we can do better, but I don't see any way of
> avoiding choosing some encoding here if we want to support Python 3
> (which I think we will, even if we don't right now).

I'm not just trying to be a nuisance here; I'm struggling myself to
understand how a program that cares about strings-vs-bytes (e.g., a
Python3 script) should coexist with a program that doesn't (e.g., git
[1]).  I think this will become a big issue if my Python version of the
commit email script ever gets integrated and then made compatible with
Python3.

You claim "for Python 3 we need to decode the byte string into a unicode
string".  I understand that Python 3 strings are Unicode, but why/when
is it necessary to decode data into a Unicode string as opposed to
leaving it as a byte sequence?

In this particular case (from a cursory look over the code) it seems to
me that (1) decoding to Unicode will sometimes fail for data that git
considers valid and (2) there is no obvious reason that the data cannot
be processed as byte sequences.

Michael

[1] And it doesn't just seem that "git doesn't care about Unicode
*yet*".  It seems more likely that "git will adamantly refuse to deal
with Unicode".  For example, Linus is quite clearly in favor of treating
data as byte sequences in most situations:
https://plus.google.com/111049168280159033135/posts/f3fngVm174f

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-14  4:48       ` Michael Haggerty
@ 2013-01-14  9:47         ` John Keeping
  2013-01-15 19:48           ` [RFC/PATCH 2/8 v2] " John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-14  9:47 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On Mon, Jan 14, 2013 at 05:48:18AM +0100, Michael Haggerty wrote:
> On 01/13/2013 05:17 PM, John Keeping wrote:
>> On Sun, Jan 13, 2013 at 04:26:39AM +0100, Michael Haggerty wrote:
>>> On 01/12/2013 08:23 PM, John Keeping wrote:
>>>> Although 2to3 will fix most issues in Python 2 code to make it run under
>>>> Python 3, it does not handle the new strict separation between byte
>>>> strings and unicode strings.  There is one instance in
>>>> git_remote_helpers where we are caught by this.
>>>>
>>>> Fix it by explicitly decoding the incoming byte string into a unicode
>>>> string.  In this instance, use the locale under which the application is
>>>> running.
>>>>
>>>> Signed-off-by: John Keeping <john@keeping.me.uk>
>>>> ---
>>>>  git_remote_helpers/git/importer.py | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
>>>> index e28cc8f..6814003 100644
>>>> --- a/git_remote_helpers/git/importer.py
>>>> +++ b/git_remote_helpers/git/importer.py
>>>> @@ -20,7 +20,7 @@ class GitImporter(object):
>>>>          """Returns a dictionary with refs.
>>>>          """
>>>>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
>>>> -        lines = check_output(args).strip().split('\n')
>>>> +        lines = check_output(args).decode().strip().split('\n')
>>>>          refs = {}
>>>>          for line in lines:
>>>>              value, name = line.split(' ')
>>>>
>>>
>>> Won't this change cause an exception if the branch names are not all
>>> valid strings in the current locale's encoding?  I don't see how this
>>> assumption is justified (e.g., see git-check-ref-format(1) for the rules
>>> governing reference names).
>> 
>> Yes it will.  The problem is that for Python 3 we need to decode the
>> byte string into a unicode string, which means we need to know what
>> encoding it is.
>> 
>> I don't think we can just say "git-for-each-ref will print refs in
>> UTF-8" since AFAIK git doesn't care what encoding the refs are in - I
>> suspect that's determined by the filesystem which in the end probably
>> maps to whatever bytes the shell fed git when the ref was created.
>> 
>> That's why I chose the current locale in this case.  I'm hoping someone
>> here will correct me if we can do better, but I don't see any way of
>> avoiding choosing some encoding here if we want to support Python 3
>> (which I think we will, even if we don't right now).
> 
> I'm not just trying to be a nuisance here;

You're not being - I think this is a difficult issue and I don't know
myself what the right answer is.

>                                            I'm struggling myself to
> understand how a program that cares about strings-vs-bytes (e.g., a
> Python3 script) should coexist with a program that doesn't (e.g., git
> [1]).  I think this will become a big issue if my Python version of the
> commit email script ever gets integrated and then made compatible with
> Python3.
> 
> You claim "for Python 3 we need to decode the byte string into a unicode
> string".  I understand that Python 3 strings are Unicode, but why/when
> is it necessary to decode data into a Unicode string as opposed to
> leaving it as a byte sequence?
> 
> In this particular case (from a cursory look over the code) it seems to
> me that (1) decoding to Unicode will sometimes fail for data that git
> considers valid and (2) there is no obvious reason that the data cannot
> be processed as byte sequences.

I've been thinking about this overnight and I think you're right that
treating them as byte strings in Python is most correct.  Having said
that, when I'm programming in Python I would find it quite surprising
that I had to treat ref strings specially - and as soon as I want to use
one in a string context (e.g. printing it as part of a message to the
user) I'm back to the same problem.

So I think we should try to solve the problem once rather than forcing
everyone who wants to use the library to solve it individually.  I just
wish it was obvious what we should do!


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC/PATCH 2/8 v2] git_remote_helpers: fix input when running under Python 3
  2013-01-14  9:47         ` John Keeping
@ 2013-01-15 19:48           ` John Keeping
  2013-01-15 20:51             ` Junio C Hamano
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-15 19:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

Although 2to3 will fix most issues in Python 2 code to make it run under
Python 3, it does not handle the new strict separation between byte
strings and unicode strings.  There is one instance in
git_remote_helpers where we are caught by this, which is when reading
refs from "git for-each-ref".

While we could fix this by explicitly handling refs as byte strings,
this is merely punting the problem to users of the library since the
same problem will be encountered as soon you want to display the ref
name to a user.

Instead of doing this, explicit decode the incoming byte string into a
unicode string.  Following the lead of pygit2 (the Python bindings for
libgit2 - see [1] and [2]), use the filesystem encoding by default,
providing a way for callers to override this if necessary.

[1] https://github.com/libgit2/pygit2/blob/e34911b63e5d2266f9f72a4e3f32e27b13190feb/src/pygit2/reference.c#L261
[2] https://github.com/libgit2/pygit2/blob/e34911b63e5d2266f9f72a4e3f32e27b13190feb/include/pygit2/utils.h#L55

Signed-off-by: John Keeping <john@keeping.me.uk>
---

I think this is in fact the best way to handle this, and I hope the
above description clarified why I don't think we want to treat refs as
byte strings in Python 3.

My only remaining question is whether it would be better to set the
error mode when decoding to "replace" instead of "strict" (the default).
"strict" will cause a UnicodeError if the string cannot be decoded
whereas "replace" will use U+FFFD (the replacement character). [3]

I think it's better to use "strict" and let the user know that
something has gone wrong rather than silently change the string, but I'd
welcome other opinions.

[3] http://docs.python.org/2/library/codecs.html#codec-base-classes

 git_remote_helpers/git/importer.py | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
index e28cc8f..5bc16a4 100644
--- a/git_remote_helpers/git/importer.py
+++ b/git_remote_helpers/git/importer.py
@@ -1,5 +1,6 @@
 import os
 import subprocess
+import sys
 
 from git_remote_helpers.util import check_call, check_output
 
@@ -10,17 +11,26 @@ class GitImporter(object):
     This importer simply delegates to git fast-import.
     """
 
-    def __init__(self, repo):
+    def __init__(self, repo, ref_encoding=None):
         """Creates a new importer for the specified repo.
+
+        If ref_encoding is specified that refs are decoded using that
+        encoding.  Otherwise the system filesystem encoding is used.
         """
 
         self.repo = repo
+        self.ref_encoding = ref_encoding
 
     def get_refs(self, gitdir):
         """Returns a dictionary with refs.
         """
         args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
-        lines = check_output(args).strip().split('\n')
+        encoding = self.ref_encoding
+        if encoding is None:
+            encoding = sys.getfilesystemencoding()
+            if encoding is None:
+                encoding = sys.getdefaultencoding()
+        lines = check_output(args).decode(encoding).strip().split('\n')
         refs = {}
         for line in lines:
             value, name = line.split(' ')
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v2] git_remote_helpers: fix input when running under Python 3
  2013-01-15 19:48           ` [RFC/PATCH 2/8 v2] " John Keeping
@ 2013-01-15 20:51             ` Junio C Hamano
  2013-01-15 21:54               ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Junio C Hamano @ 2013-01-15 20:51 UTC (permalink / raw)
  To: John Keeping
  Cc: Michael Haggerty, git, Eric S. Raymond, Felipe Contreras,
	Sverre Rabbelier

John Keeping <john@keeping.me.uk> writes:

> Although 2to3 will fix most issues in Python 2 code to make it run under
> Python 3, it does not handle the new strict separation between byte
> strings and unicode strings.  There is one instance in
> git_remote_helpers where we are caught by this, which is when reading
> refs from "git for-each-ref".
>
> While we could fix this by explicitly handling refs as byte strings,
> this is merely punting the problem to users of the library since the
> same problem will be encountered as soon you want to display the ref
> name to a user.
>
> Instead of doing this, explicit decode the incoming byte string into a
> unicode string.

That really feels wrong.  Displaying is a separate issue and it is
the _right_ thing to punt the problem at the lower-level machinery
level.

> Following the lead of pygit2 (the Python bindings for
> libgit2 - see [1] and [2]),...

I do not think other people getting it wrong is not an excuse to
repeat the same mistake.

Is it really so cumbersome to handle byte strings as byte strings in
Python?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v2] git_remote_helpers: fix input when running under Python 3
  2013-01-15 20:51             ` Junio C Hamano
@ 2013-01-15 21:54               ` John Keeping
  2013-01-15 22:04                 ` Junio C Hamano
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-15 21:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, git, Eric S. Raymond, Felipe Contreras,
	Sverre Rabbelier

On Tue, Jan 15, 2013 at 12:51:13PM -0800, Junio C Hamano wrote:
> John Keeping <john@keeping.me.uk> writes:
>> Although 2to3 will fix most issues in Python 2 code to make it run under
>> Python 3, it does not handle the new strict separation between byte
>> strings and unicode strings.  There is one instance in
>> git_remote_helpers where we are caught by this, which is when reading
>> refs from "git for-each-ref".
>>
>> While we could fix this by explicitly handling refs as byte strings,
>> this is merely punting the problem to users of the library since the
>> same problem will be encountered as soon you want to display the ref
>> name to a user.
>>
>> Instead of doing this, explicit decode the incoming byte string into a
>> unicode string.
> 
> That really feels wrong.  Displaying is a separate issue and it is
> the _right_ thing to punt the problem at the lower-level machinery
> level.

But the display will require decoding the ref name to a Unicode string,
which depends on the encoding of the underlying ref name, so it feels
like it should be decoded where it's read (see [1]).

>> Following the lead of pygit2 (the Python bindings for
>> libgit2 - see [1] and [2]),...
> 
> I do not think other people getting it wrong is not an excuse to
> repeat the same mistake.
>
> Is it really so cumbersome to handle byte strings as byte strings in
> Python?

As [1] says, there is a potential for bugs whenever people attempt to
combine Unicode and byte strings.  I think it also violates the
principle of least surprise if a ref name (a string) doesn't behave like
a normal string.

[1] http://docs.python.org/3.3/howto/unicode.html#tips-for-writing-unicode-aware-programs


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v2] git_remote_helpers: fix input when running under Python 3
  2013-01-15 21:54               ` John Keeping
@ 2013-01-15 22:04                 ` Junio C Hamano
  2013-01-15 22:40                   ` [RFC/PATCH 2/8 v3] " John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Junio C Hamano @ 2013-01-15 22:04 UTC (permalink / raw)
  To: John Keeping
  Cc: Michael Haggerty, git, Eric S. Raymond, Felipe Contreras,
	Sverre Rabbelier

John Keeping <john@keeping.me.uk> writes:

>> That really feels wrong.  Displaying is a separate issue and it is
>> the _right_ thing to punt the problem at the lower-level machinery
>> level.
>
> But the display will require decoding the ref name to a Unicode string,
> which depends on the encoding of the underlying ref name, so it feels
> like it should be decoded where it's read (see [1]).

If you botch the decoding in a way you cannot recover the original
byte string, you cannot create a ref whose name is the original byte
string, no?  Keeping the original byte string internally (this
includes where you use it to create new refs or update existing
refs), and attempting to convert it to Unicode when you choose to
show that string as a part of a message to the user (and falling
back to replacing some bytes to '?' if you cannot, but do so only in
the message), you won't have that problem.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC/PATCH 2/8 v3] git_remote_helpers: fix input when running under Python 3
  2013-01-15 22:04                 ` Junio C Hamano
@ 2013-01-15 22:40                   ` John Keeping
  2013-01-16  0:03                     ` Pete Wyckoff
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-15 22:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, git, Eric S. Raymond, Felipe Contreras,
	Sverre Rabbelier

Although 2to3 will fix most issues in Python 2 code to make it run under
Python 3, it does not handle the new strict separation between byte
strings and unicode strings.  There is one instance in
git_remote_helpers where we are caught by this, which is when reading
refs from "git for-each-ref".

Fix this by operating on the returned string as a byte string rather
than a unicode string.  As this method is currently only used internally
by the class this does not affect code anywhere else.

Note that we cannot use byte strings in the source as the 'b' prefix is
not supported before Python 2.7 so in order to maintain compatibility
with the maximum range of Python versions we use an explicit call to
encode().

Signed-off-by: John Keeping <john@keeping.me.uk>
---

On Tue, Jan 15, 2013 at 02:04:29PM -0800, Junio C Hamano wrote:
> John Keeping <john@keeping.me.uk> writes:
>>> That really feels wrong.  Displaying is a separate issue and it is
>>> the _right_ thing to punt the problem at the lower-level machinery
>>> level.
>>
>> But the display will require decoding the ref name to a Unicode string,
>> which depends on the encoding of the underlying ref name, so it feels
>> like it should be decoded where it's read (see [1]).
> 
> If you botch the decoding in a way you cannot recover the original
> byte string, you cannot create a ref whose name is the original byte
> string, no?  Keeping the original byte string internally (this
> includes where you use it to create new refs or update existing
> refs), and attempting to convert it to Unicode when you choose to
> show that string as a part of a message to the user (and falling
> back to replacing some bytes to '?' if you cannot, but do so only in
> the message), you won't have that problem.

Actually, this method is currently only used internally so I don't think
my argument holds.

This is what keeping the refs as byte strings looks like.


 git_remote_helpers/git/importer.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
index e28cc8f..c54846c 100644
--- a/git_remote_helpers/git/importer.py
+++ b/git_remote_helpers/git/importer.py
@@ -18,13 +18,16 @@ class GitImporter(object):
 
     def get_refs(self, gitdir):
         """Returns a dictionary with refs.
+
+        Note that the keys in the returned dictionary are byte strings as
+        read from git.
         """
         args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
-        lines = check_output(args).strip().split('\n')
+        lines = check_output(args).strip().split('\n'.encode('utf-8'))
         refs = {}
         for line in lines:
-            value, name = line.split(' ')
-            name = name.strip('commit\t')
+            value, name = line.split(' '.encode('utf-8'))
+            name = name.strip('commit\t'.encode('utf-8'))
             refs[name] = value
         return refs
 
-- 
1.8.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-13 17:52         ` John Keeping
@ 2013-01-15 22:58           ` John Keeping
  2013-01-17  0:27             ` Pete Wyckoff
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-15 22:58 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

On Sun, Jan 13, 2013 at 05:52:38PM +0000, John Keeping wrote:
> On Sun, Jan 13, 2013 at 12:14:02PM -0500, Pete Wyckoff wrote:
>> john@keeping.me.uk wrote on Sun, 13 Jan 2013 16:26 +0000:
>>> On Sat, Jan 12, 2013 at 06:30:44PM -0500, Pete Wyckoff wrote:
>>> > john@keeping.me.uk wrote on Sat, 12 Jan 2013 19:23 +0000:
>>> >> When different version of python are used to build via distutils, the
>>> >> behaviour can change.  Detect changes in version and pass --force in
>>> >> this case.
>>> >[..]
>>> >> diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
>>> >[..]
>>> >> +py_version=$(shell $(PYTHON_PATH) -c \
>>> >> +	'import sys; print("%i.%i" % sys.version_info[:2])')
>>> >> +
>>> >>  all: $(pysetupfile)
>>> >> -	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
>>> >> +	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
>>> >> +	flags=--force; \
>>> >> +	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
>>> >> +	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
>>> > 
>>> > Can you depend on ../GIT-PYTHON-VARS instead?  It comes from
>>> > 96a4647 (Makefile: detect when PYTHON_PATH changes, 2012-12-18).
>>> > It doesn't check version, just path, but hopefully that's good
>>> > enough.  I'm imagining a rule that would do "clean" if
>>> > ../GIT-PYTHON-VARS changed, then build without --force.
>>> 
>>> I was trying to keep the git_remote_helpers directory self contained.  I
>>> can't see how to depend on ../GIT-PYTHON-VARS in a way that is as simple
>>> as this and keeps "make -C git_remote_helpers" working in a clean tree.
>>> 
>>> Am I missing something obvious here?
>> 
>> Not if it wants to stay self-contained; you're right.
>> 
>> I'm not thrilled with how git_remote_helpers/Makefile always
>> runs setup.py, and always generates PYLIBDIR, and now always
>> invokes python a third time to see if its version changed.
> 
> I don't think PYLIBDIR will be calculated unless it's used ('=' not
> ':=' means its a deferred variable).
> 
> I wonder if the version check should move into setup.py - it would be
> just as easy to check the file there and massage sys.args, although
> possibly not as neat.

For reference, putting the version check in setup.py looks like this:

-- >8 --

diff --git a/git_remote_helpers/setup.py b/git_remote_helpers/setup.py
index 6de41de..2c21eb5 100644
--- a/git_remote_helpers/setup.py
+++ b/git_remote_helpers/setup.py
@@ -3,6 +3,7 @@
 """Distutils build/install script for the git_remote_helpers package."""
 
 from distutils.core import setup
+import sys
 
 # If building under Python3 we need to run 2to3 on the code, do this by
 # trying to import distutils' 2to3 builder, which is only available in
@@ -13,6 +14,24 @@ except ImportError:
     # 2.x
     from distutils.command.build_py import build_py
 
+
+current_version = '%d.%d' % sys.version_info[:2]
+try:
+    f = open('GIT-PYTHON_VERSION', 'r')
+    latest_version = f.read().strip()
+    f.close()
+
+    if latest_version != current_version:
+        if not '--force' in sys.argv:
+            sys.argv.insert(0, '--force')
+except IOError:
+    pass
+
+f = open('GIT-PYTHON_VERSION', 'w')
+f.write(current_version)
+f.close()
+
+
 setup(
     name = 'git_remote_helpers',
     version = '0.1.0',

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v3] git_remote_helpers: fix input when running under Python 3
  2013-01-15 22:40                   ` [RFC/PATCH 2/8 v3] " John Keeping
@ 2013-01-16  0:03                     ` Pete Wyckoff
  2013-01-16  9:45                       ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-16  0:03 UTC (permalink / raw)
  To: John Keeping
  Cc: Junio C Hamano, Michael Haggerty, git, Eric S. Raymond,
	Felipe Contreras, Sverre Rabbelier

john@keeping.me.uk wrote on Tue, 15 Jan 2013 22:40 +0000:
> This is what keeping the refs as byte strings looks like.

As John knows, it is not possible to interpret text from a byte
string without talking about the character encoding.

Git is (largely) a C program and uses the character set defined
in the C standard, which is a subset of ASCII.  But git does
"math" on strings, like this snippet that takes something from
argv[] and prepends "refs/heads/":

    strcpy(refname, "refs/heads/");
    strcpy(refname + strlen("refs/heads/"), ret->name);

The result doesn't talk about what character set it is using,
but because it combines a prefix from ASCII with its input,
git makes the assumption that the input is ASCII-compatible.

If you feed a UTF-16 string in argv, e.g.

    $ echo master | iconv -f ascii -t utf16 | xargs git branch
    xargs: Warning: a NUL character occurred in the input.  It cannot be passed through in the argument list.  Did you mean to use the --null option?
    fatal: Not a valid object name: ''.

you get an error about NUL, and not the branch you hoped for.
Git assumes that the input character set contains roughly ASCII
in byte positions 0..127.

That's one small reason why the useful character encodings put
ASCII in the 0..127 range, including utf-8, big5 and shift-jis.
ASCII is indeed special due to its legacy, and both C and Python
recognize this.

> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
> @@ -18,13 +18,16 @@ class GitImporter(object):
>  
>      def get_refs(self, gitdir):
>          """Returns a dictionary with refs.
> +
> +        Note that the keys in the returned dictionary are byte strings as
> +        read from git.
>          """
>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
> -        lines = check_output(args).strip().split('\n')
> +        lines = check_output(args).strip().split('\n'.encode('utf-8'))
>          refs = {}
>          for line in lines:
> -            value, name = line.split(' ')
> -            name = name.strip('commit\t')
> +            value, name = line.split(' '.encode('utf-8'))
> +            name = name.strip('commit\t'.encode('utf-8'))
>              refs[name] = value
>          return refs

I'd suggest for this Python conundrum using byte-string literals, e.g.:

        lines = check_output(args).strip().split(b'\n')
	value, name = line.split(b' ')
	name = name.strip(b'commit\t')

Essentially identical to what you have, but avoids naming "utf-8" as
the encoding.  It instead relies on Python's interpretation of
ASCII characters in string context, which is exactly what C does.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v3] git_remote_helpers: fix input when running under Python 3
  2013-01-16  0:03                     ` Pete Wyckoff
@ 2013-01-16  9:45                       ` John Keeping
  2013-01-17  0:29                         ` Pete Wyckoff
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-16  9:45 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: Junio C Hamano, Michael Haggerty, git, Eric S. Raymond,
	Felipe Contreras, Sverre Rabbelier

On Tue, Jan 15, 2013 at 07:03:16PM -0500, Pete Wyckoff wrote:
> john@keeping.me.uk wrote on Tue, 15 Jan 2013 22:40 +0000:
>> This is what keeping the refs as byte strings looks like.
> 
> As John knows, it is not possible to interpret text from a byte
> string without talking about the character encoding.
> 
> Git is (largely) a C program and uses the character set defined
> in the C standard, which is a subset of ASCII.  But git does
> "math" on strings, like this snippet that takes something from
> argv[] and prepends "refs/heads/":
> 
>     strcpy(refname, "refs/heads/");
>     strcpy(refname + strlen("refs/heads/"), ret->name);
> 
> The result doesn't talk about what character set it is using,
> but because it combines a prefix from ASCII with its input,
> git makes the assumption that the input is ASCII-compatible.
> 
> If you feed a UTF-16 string in argv, e.g.
> 
>     $ echo master | iconv -f ascii -t utf16 | xargs git branch
>     xargs: Warning: a NUL character occurred in the input.  It cannot be passed through in the argument list.  Did you mean to use the --null option?
>     fatal: Not a valid object name: ''.
> 
> you get an error about NUL, and not the branch you hoped for.
> Git assumes that the input character set contains roughly ASCII
> in byte positions 0..127.
> 
> That's one small reason why the useful character encodings put
> ASCII in the 0..127 range, including utf-8, big5 and shift-jis.
> ASCII is indeed special due to its legacy, and both C and Python
> recognize this.
> 
>> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
>> @@ -18,13 +18,16 @@ class GitImporter(object):
>>  
>>      def get_refs(self, gitdir):
>>          """Returns a dictionary with refs.
>> +
>> +        Note that the keys in the returned dictionary are byte strings as
>> +        read from git.
>>          """
>>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
>> -        lines = check_output(args).strip().split('\n')
>> +        lines = check_output(args).strip().split('\n'.encode('utf-8'))
>>          refs = {}
>>          for line in lines:
>> -            value, name = line.split(' ')
>> -            name = name.strip('commit\t')
>> +            value, name = line.split(' '.encode('utf-8'))
>> +            name = name.strip('commit\t'.encode('utf-8'))
>>              refs[name] = value
>>          return refs
> 
> I'd suggest for this Python conundrum using byte-string literals, e.g.:
> 
>         lines = check_output(args).strip().split(b'\n')
> 	value, name = line.split(b' ')
> 	name = name.strip(b'commit\t')
> 
> Essentially identical to what you have, but avoids naming "utf-8" as
> the encoding.  It instead relies on Python's interpretation of
> ASCII characters in string context, which is exactly what C does.

The problem is that AFAICT the byte-string prefix is only available in
Python 2.7 and later (compare [1] and [2]).  I think we need this more
convoluted code if we want to keep supporting Python 2.6 (although
perhaps 'ascii' would be a better choice than 'utf-8').

[1] http://docs.python.org/2.6/reference/lexical_analysis.html#literals
[2] http://docs.python.org/2.7/reference/lexical_analysis.html#literals


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes
  2013-01-15 22:58           ` John Keeping
@ 2013-01-17  0:27             ` Pete Wyckoff
  0 siblings, 0 replies; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-17  0:27 UTC (permalink / raw)
  To: John Keeping; +Cc: git, Eric S. Raymond, Felipe Contreras, Sverre Rabbelier

john@keeping.me.uk wrote on Tue, 15 Jan 2013 22:58 +0000:
> For reference, putting the version check in setup.py looks like this:
> 
> -- >8 --
> 
> diff --git a/git_remote_helpers/setup.py b/git_remote_helpers/setup.py
> index 6de41de..2c21eb5 100644
> --- a/git_remote_helpers/setup.py
> +++ b/git_remote_helpers/setup.py
> @@ -3,6 +3,7 @@
>  """Distutils build/install script for the git_remote_helpers package."""
>  
>  from distutils.core import setup
> +import sys
>  
>  # If building under Python3 we need to run 2to3 on the code, do this by
>  # trying to import distutils' 2to3 builder, which is only available in
> @@ -13,6 +14,24 @@ except ImportError:
>      # 2.x
>      from distutils.command.build_py import build_py
>  
> +
> +current_version = '%d.%d' % sys.version_info[:2]
> +try:
> +    f = open('GIT-PYTHON_VERSION', 'r')
> +    latest_version = f.read().strip()
> +    f.close()
> +
> +    if latest_version != current_version:
> +        if not '--force' in sys.argv:
> +            sys.argv.insert(0, '--force')
> +except IOError:
> +    pass
> +
> +f = open('GIT-PYTHON_VERSION', 'w')
> +f.write(current_version)
> +f.close()
> +
> +
>  setup(
>      name = 'git_remote_helpers',
>      version = '0.1.0',
> 

That's about the same overhead as doing it in the Makefile,
and a bit more obscure.  I don't mind your initial version
so much anymore.  Thanks for thinking about it.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC/PATCH 2/8 v3] git_remote_helpers: fix input when running under Python 3
  2013-01-16  9:45                       ` John Keeping
@ 2013-01-17  0:29                         ` Pete Wyckoff
  0 siblings, 0 replies; 53+ messages in thread
From: Pete Wyckoff @ 2013-01-17  0:29 UTC (permalink / raw)
  To: John Keeping
  Cc: Junio C Hamano, Michael Haggerty, git, Eric S. Raymond,
	Felipe Contreras, Sverre Rabbelier

john@keeping.me.uk wrote on Wed, 16 Jan 2013 09:45 +0000:
> On Tue, Jan 15, 2013 at 07:03:16PM -0500, Pete Wyckoff wrote:
> > I'd suggest for this Python conundrum using byte-string literals, e.g.:
> > 
> >         lines = check_output(args).strip().split(b'\n')
> > 	value, name = line.split(b' ')
> > 	name = name.strip(b'commit\t')
> > 
> > Essentially identical to what you have, but avoids naming "utf-8" as
> > the encoding.  It instead relies on Python's interpretation of
> > ASCII characters in string context, which is exactly what C does.
> 
> The problem is that AFAICT the byte-string prefix is only available in
> Python 2.7 and later (compare [1] and [2]).  I think we need this more
> convoluted code if we want to keep supporting Python 2.6 (although
> perhaps 'ascii' would be a better choice than 'utf-8').
> 
> [1] http://docs.python.org/2.6/reference/lexical_analysis.html#literals
> [2] http://docs.python.org/2.7/reference/lexical_analysis.html#literals

Drat.  The b'' syntax seems to work on 2.6.8, in spite of
the docs, but certainly isn't in 2.5.

I think you had hit on the best compromise with encoding,
but maybe ascii is a little less presumptuous than utf-8,
and more indicative of the encoding assumption.

		-- Pete

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH v2 0/8] Initial Python 3 support
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (8 preceding siblings ...)
  2013-01-12 23:43 ` [PATCH 0/8] Initial support for Python 3 Pete Wyckoff
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 1/8] git_remote_helpers: allow building with Python 3 John Keeping
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

This series does enough so that everything except git-p4 runs under
Python 3.

As discussed with Pete, it may not make sense to change git-p4 to
support Python 3 until Perforce's Python output mode is changed.  So
does it make sense to merge this now and say "use Python 2 if you want
git-p4"?

Changes since v1:

* rebased on master after fc/remote-testgit-feature-done was merged,
  leading to an extra change in patch 8 (git-remote-testpy: call print
  as a function)
* changed patch 2 (git_remote_helpers: fix input when running under
  Python 3) to treat ref names as byte strings

John Keeping (8):
  git_remote_helpers: Allow building with Python 3
  git_remote_helpers: fix input when running under Python 3
  git_remote_helpers: Force rebuild if python version changes
  git_remote_helpers: Use 2to3 if building with Python 3
  svn-fe: allow svnrdump_sim.py to run with Python 3
  git-remote-testpy: hash bytes explicitly
  git-remote-testpy: don't do unbuffered text I/O
  git-remote-testpy: call print as a function

 contrib/svn-fe/svnrdump_sim.py     |  4 ++--
 git-remote-testpy.py               | 42 +++++++++++++++++++-------------------
 git_remote_helpers/.gitignore      |  1 +
 git_remote_helpers/Makefile        | 10 +++++++--
 git_remote_helpers/git/importer.py |  9 +++++---
 git_remote_helpers/setup.py        | 10 +++++++++
 6 files changed, 48 insertions(+), 28 deletions(-)

-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH v2 1/8] git_remote_helpers: allow building with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (9 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 0/8] Initial Python 3 support John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 2/8] git_remote_helpers: fix input when running under " John Keeping
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

Change inline Python to call "print" as a function not a statement.

This is harmless because Python 2 will see the parentheses as redundant
grouping but they are necessary to run this code with Python 3.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
index 74b05dc..f65f064 100644
--- a/git_remote_helpers/Makefile
+++ b/git_remote_helpers/Makefile
@@ -23,7 +23,7 @@ endif
 
 PYLIBDIR=$(shell $(PYTHON_PATH) -c \
 	 "import sys; \
-	 print 'lib/python%i.%i/site-packages' % sys.version_info[:2]")
+	 print('lib/python%i.%i/site-packages' % sys.version_info[:2])")
 
 all: $(pysetupfile)
 	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 2/8] git_remote_helpers: fix input when running under Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (10 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 1/8] git_remote_helpers: allow building with Python 3 John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 3/8] git_remote_helpers: force rebuild if python version changes John Keeping
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

Although 2to3 will fix most issues in Python 2 code to make it run under
Python 3, it does not handle the new strict separation between byte
strings and unicode strings.  There is one instance in
git_remote_helpers where we are caught by this, which is when reading
refs from "git for-each-ref".

Fix this by operating on the returned string as a byte string rather
than a unicode string.  As this method is currently only used internally
by the class this does not affect code anywhere else.

Note that we cannot use byte strings in the source as the 'b' prefix is
not supported before Python 2.7 so in order to maintain compatibility
with the maximum range of Python versions we use an explicit call to
encode().

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/git/importer.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
index e28cc8f..d3f90e1 100644
--- a/git_remote_helpers/git/importer.py
+++ b/git_remote_helpers/git/importer.py
@@ -18,13 +18,16 @@ class GitImporter(object):
 
     def get_refs(self, gitdir):
         """Returns a dictionary with refs.
+
+        Note that the keys in the returned dictionary are byte strings as
+        read from git.
         """
         args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
-        lines = check_output(args).strip().split('\n')
+        lines = check_output(args).strip().split('\n'.encode('ascii'))
         refs = {}
         for line in lines:
-            value, name = line.split(' ')
-            name = name.strip('commit\t')
+            value, name = line.split(' '.encode('ascii'))
+            name = name.strip('commit\t'.encode('ascii'))
             refs[name] = value
         return refs
 
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 3/8] git_remote_helpers: force rebuild if python version changes
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (11 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 2/8] git_remote_helpers: fix input when running under " John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3 John Keeping
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

When different version of python are used to build via distutils, the
behaviour can change.  Detect changes in version and pass --force in
this case.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/.gitignore | 1 +
 git_remote_helpers/Makefile   | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/git_remote_helpers/.gitignore b/git_remote_helpers/.gitignore
index 2247d5f..06c664f 100644
--- a/git_remote_helpers/.gitignore
+++ b/git_remote_helpers/.gitignore
@@ -1,2 +1,3 @@
+/GIT-PYTHON_VERSION
 /build
 /dist
diff --git a/git_remote_helpers/Makefile b/git_remote_helpers/Makefile
index f65f064..91f458f 100644
--- a/git_remote_helpers/Makefile
+++ b/git_remote_helpers/Makefile
@@ -25,8 +25,14 @@ PYLIBDIR=$(shell $(PYTHON_PATH) -c \
 	 "import sys; \
 	 print('lib/python%i.%i/site-packages' % sys.version_info[:2])")
 
+py_version=$(shell $(PYTHON_PATH) -c \
+	'import sys; print("%i.%i" % sys.version_info[:2])')
+
 all: $(pysetupfile)
-	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
+	$(QUIET)test "$$(cat GIT-PYTHON_VERSION 2>/dev/null)" = "$(py_version)" || \
+	flags=--force; \
+	$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build $$flags
+	$(QUIET)echo "$(py_version)" >GIT-PYTHON_VERSION
 
 install: $(pysetupfile)
 	$(PYTHON_PATH) $(pysetupfile) install --prefix $(DESTDIR_SQ)$(prefix)
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (12 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 3/8] git_remote_helpers: force rebuild if python version changes John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-18  5:15   ` Sverre Rabbelier
  2013-01-17 18:53 ` [PATCH v2 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

Using the approach detailed on the Python wiki[1], run 2to3 on the code
as part of the build if building with Python 3.

The code itself requires no changes to convert cleanly.

[1] http://wiki.python.org/moin/PortingPythonToPy3k

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git_remote_helpers/setup.py | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/git_remote_helpers/setup.py b/git_remote_helpers/setup.py
index 4d434b6..6de41de 100644
--- a/git_remote_helpers/setup.py
+++ b/git_remote_helpers/setup.py
@@ -4,6 +4,15 @@
 
 from distutils.core import setup
 
+# If building under Python3 we need to run 2to3 on the code, do this by
+# trying to import distutils' 2to3 builder, which is only available in
+# Python3.
+try:
+    from distutils.command.build_py import build_py_2to3 as build_py
+except ImportError:
+    # 2.x
+    from distutils.command.build_py import build_py
+
 setup(
     name = 'git_remote_helpers',
     version = '0.1.0',
@@ -14,4 +23,5 @@ setup(
     url = 'http://www.git-scm.com/',
     package_dir = {'git_remote_helpers': ''},
     packages = ['git_remote_helpers', 'git_remote_helpers.git'],
+    cmdclass = {'build_py': build_py},
 )
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 5/8] svn-fe: allow svnrdump_sim.py to run with Python 3
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (13 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3 John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 18:53 ` [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly John Keeping
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

The changes to allow this script to run with Python 3 are minimal and do
not affect its functionality on the versions of Python 2 that are
already supported (2.4 onwards).

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 contrib/svn-fe/svnrdump_sim.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
index 17cf6f9..4e78a1c 100755
--- a/contrib/svn-fe/svnrdump_sim.py
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -14,7 +14,7 @@ if sys.hexversion < 0x02040000:
 
 def getrevlimit():
         var = 'SVNRMAX'
-        if os.environ.has_key(var):
+        if var in os.environ:
                 return os.environ[var]
         return None
 
@@ -44,7 +44,7 @@ def writedump(url, lower, upper):
 
 if __name__ == "__main__":
         if not (len(sys.argv) in (3, 4, 5)):
-                print "usage: %s dump URL -rLOWER:UPPER"
+                print("usage: %s dump URL -rLOWER:UPPER")
                 sys.exit(1)
         if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is suppported.')
         url = sys.argv[2]
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (14 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
@ 2013-01-17 18:53 ` John Keeping
  2013-01-17 20:36   ` Junio C Hamano
  2013-01-17 18:54 ` [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
  2013-01-17 18:54 ` [PATCH v2 8/8] git-remote-testpy: call print as a function John Keeping
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

Under Python 3 'hasher.update(...)' must take a byte string and not a
unicode string.  Explicitly encode the argument to this method as UTF-8
so that this code works under Python 3.

This moves the required Python version forward to 2.0.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index d94a66a..f8dc196 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
 from git_remote_helpers.git.importer import GitImporter
 from git_remote_helpers.git.non_local import NonLocalGit
 
-if sys.hexversion < 0x01050200:
-    # os.makedirs() is the limiter
-    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
+if sys.hexversion < 0x02000000:
+    # string.encode() is the limiter
+    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
     sys.exit(1)
 
 def get_repo(alias, url):
@@ -45,7 +45,7 @@ def get_repo(alias, url):
     repo.get_head()
 
     hasher = _digest()
-    hasher.update(repo.path)
+    hasher.update(repo.path.encode('utf-8'))
     repo.hash = hasher.hexdigest()
 
     repo.get_base_path = lambda base: os.path.join(
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (15 preceding siblings ...)
  2013-01-17 18:53 ` [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly John Keeping
@ 2013-01-17 18:54 ` John Keeping
  2013-01-18  3:50   ` Sverre Rabbelier
  2013-01-17 18:54 ` [PATCH v2 8/8] git-remote-testpy: call print as a function John Keeping
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

Python 3 forbids unbuffered I/O in text mode.  Change the reading of
stdin in git-remote-testpy so that we read the lines as bytes and then
decode them a line at a time.

This allows us to keep the I/O unbuffered in order to avoid
reintroducing the bug fixed by commit 7fb8e16 (git-remote-testgit: fix
race when spawning fast-import).

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index f8dc196..bc5e3cf 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -154,7 +154,7 @@ def do_import(repo, args):
     refs = [ref]
 
     while True:
-        line = sys.stdin.readline()
+        line = sys.stdin.readline().decode()
         if line == '\n':
             break
         if not line.startswith('import '):
@@ -225,7 +225,7 @@ def read_one_line(repo):
 
     line = sys.stdin.readline()
 
-    cmdline = line
+    cmdline = line.decode()
 
     if not cmdline:
         warn("Unexpected EOF")
@@ -277,7 +277,7 @@ def main(args):
 
     more = True
 
-    sys.stdin = os.fdopen(sys.stdin.fileno(), 'r', 0)
+    sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)
     while (more):
         more = read_one_line(repo)
 
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 8/8] git-remote-testpy: call print as a function
  2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
                   ` (16 preceding siblings ...)
  2013-01-17 18:54 ` [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
@ 2013-01-17 18:54 ` John Keeping
  2013-01-18  3:48   ` Sverre Rabbelier
  17 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-17 18:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John Keeping, Michael Haggerty, git, Pete Wyckoff

This is harmless in Python 2, which sees the parentheses as redundant
grouping, but is required for Python 3.  Since this is the only change
required to make this script just run under Python 3 without needing
2to3 it seems worthwhile.

The case of an empty print must be handled specially because in that
case Python 2 will interpret '()' as an empty tuple and print it as
'()'; inserting an empty string fixes this.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 git-remote-testpy.py | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index bc5e3cf..ccdb2dc 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -87,9 +87,9 @@ def do_capabilities(repo, args):
     """Prints the supported capabilities.
     """
 
-    print "import"
-    print "export"
-    print "refspec refs/heads/*:%s*" % repo.prefix
+    print("import")
+    print("export")
+    print("refspec refs/heads/*:%s*" % repo.prefix)
 
     dirname = repo.get_base_path(repo.gitdir)
 
@@ -98,11 +98,11 @@ def do_capabilities(repo, args):
 
     path = os.path.join(dirname, 'git.marks')
 
-    print "*export-marks %s" % path
+    print("*export-marks %s" % path)
     if os.path.exists(path):
-        print "*import-marks %s" % path
+        print("*import-marks %s" % path)
 
-    print # end capabilities
+    print('') # end capabilities
 
 
 def do_list(repo, args):
@@ -115,16 +115,16 @@ def do_list(repo, args):
 
     for ref in repo.revs:
         debug("? refs/heads/%s", ref)
-        print "? refs/heads/%s" % ref
+        print("? refs/heads/%s" % ref)
 
     if repo.head:
         debug("@refs/heads/%s HEAD" % repo.head)
-        print "@refs/heads/%s HEAD" % repo.head
+        print("@refs/heads/%s HEAD" % repo.head)
     else:
         debug("@refs/heads/master HEAD")
-        print "@refs/heads/master HEAD"
+        print("@refs/heads/master HEAD")
 
-    print # end list
+    print('') # end list
 
 
 def update_local_repo(repo):
@@ -164,7 +164,7 @@ def do_import(repo, args):
         ref = line[7:].strip()
         refs.append(ref)
 
-    print "feature done"
+    print("feature done")
 
     if os.environ.get("GIT_REMOTE_TESTGIT_FAILURE"):
         die('Told to fail')
@@ -172,7 +172,7 @@ def do_import(repo, args):
     repo = update_local_repo(repo)
     repo.exporter.export_repo(repo.gitdir, refs)
 
-    print "done"
+    print("done")
 
 
 def do_export(repo, args):
@@ -192,8 +192,8 @@ def do_export(repo, args):
         repo.non_local.push(repo.gitdir)
 
     for ref in changed:
-        print "ok %s" % ref
-    print
+        print("ok %s" % ref)
+    print('')
 
 
 COMMANDS = {
-- 
1.8.1.1.260.g99b33f4.dirty

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 18:53 ` [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly John Keeping
@ 2013-01-17 20:36   ` Junio C Hamano
  2013-01-17 20:43     ` Junio C Hamano
  2013-01-17 21:00     ` John Keeping
  0 siblings, 2 replies; 53+ messages in thread
From: Junio C Hamano @ 2013-01-17 20:36 UTC (permalink / raw)
  To: John Keeping; +Cc: Michael Haggerty, git, Pete Wyckoff

John Keeping <john@keeping.me.uk> writes:

> Under Python 3 'hasher.update(...)' must take a byte string and not a
> unicode string.  Explicitly encode the argument to this method as UTF-8
> so that this code works under Python 3.
>
> This moves the required Python version forward to 2.0.
>
> Signed-off-by: John Keeping <john@keeping.me.uk>
> ---

Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Is the repo.hash (and local.hash that gets a copy of it) something
that needs to stay the same across multiple invocations of this
remote helper, and between the currently shipped Git and the version
of Git after applying this patch?  If that is not the case, and if
this is used only to get a randomly-looking 40-byte hexadecimal
string, then a lossy attempt to .encode('utf-8') and falling back to
replace or ignore bytes in the original that couldn't be interpreted
as part of a UTF-8 string would be OK, but doesn't .encode('utf-8')
throw an exception if not told to 'ignore' or something?

>  git-remote-testpy.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> index d94a66a..f8dc196 100644
> --- a/git-remote-testpy.py
> +++ b/git-remote-testpy.py
> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>  from git_remote_helpers.git.importer import GitImporter
>  from git_remote_helpers.git.non_local import NonLocalGit
>  
> -if sys.hexversion < 0x01050200:
> -    # os.makedirs() is the limiter
> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> +if sys.hexversion < 0x02000000:
> +    # string.encode() is the limiter
> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>      sys.exit(1)
>  
>  def get_repo(alias, url):
> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>      repo.get_head()
>  
>      hasher = _digest()
> -    hasher.update(repo.path)
> +    hasher.update(repo.path.encode('utf-8'))
>      repo.hash = hasher.hexdigest()
>  
>      repo.get_base_path = lambda base: os.path.join(

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 20:36   ` Junio C Hamano
@ 2013-01-17 20:43     ` Junio C Hamano
  2013-01-17 21:00     ` John Keeping
  1 sibling, 0 replies; 53+ messages in thread
From: Junio C Hamano @ 2013-01-17 20:43 UTC (permalink / raw)
  To: John Keeping; +Cc: Michael Haggerty, git, Pete Wyckoff

Junio C Hamano <gitster@pobox.com> writes:

> John Keeping <john@keeping.me.uk> writes:
>
>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>> unicode string.  Explicitly encode the argument to this method as UTF-8
>> so that this code works under Python 3.
>>
>> This moves the required Python version forward to 2.0.
>>
>> Signed-off-by: John Keeping <john@keeping.me.uk>
>> ---
>
> Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Oh, my brain was not working. Forget this part, and sorry for the
noise.  We are not decoding a bytestring to an array of unicode
characters, but going the other way around here.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 20:36   ` Junio C Hamano
  2013-01-17 20:43     ` Junio C Hamano
@ 2013-01-17 21:00     ` John Keeping
  2013-01-17 21:05       ` John Keeping
  2013-01-17 22:24       ` Junio C Hamano
  1 sibling, 2 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 21:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Pete Wyckoff

On Thu, Jan 17, 2013 at 12:36:33PM -0800, Junio C Hamano wrote:
> John Keeping <john@keeping.me.uk> writes:
> 
>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>> unicode string.  Explicitly encode the argument to this method as UTF-8
>> so that this code works under Python 3.
>>
>> This moves the required Python version forward to 2.0.
>>
>> Signed-off-by: John Keeping <john@keeping.me.uk>
>> ---
> 
> Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Do you mean encodable?  As you say below it will currently throw an
exception.

> Is the repo.hash (and local.hash that gets a copy of it) something
> that needs to stay the same across multiple invocations of this
> remote helper, and between the currently shipped Git and the version
> of Git after applying this patch?

It's used to specify the path of the repository for importing or
exporting, so it should stay consistent across invocations.  However,
this is only an example remote helper so I don't think we should worry
if it changes from one Git release to the next.

>                                    If that is not the case, and if
> this is used only to get a randomly-looking 40-byte hexadecimal
> string, then a lossy attempt to .encode('utf-8') and falling back to
> replace or ignore bytes in the original that couldn't be interpreted
> as part of a UTF-8 string would be OK, but doesn't .encode('utf-8')
> throw an exception if not told to 'ignore' or something?

You're right - I think we need to add ", errors='replace'" to the call
to encode.

> >  git-remote-testpy.py | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> > index d94a66a..f8dc196 100644
> > --- a/git-remote-testpy.py
> > +++ b/git-remote-testpy.py
> > @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
> >  from git_remote_helpers.git.importer import GitImporter
> >  from git_remote_helpers.git.non_local import NonLocalGit
> >  
> > -if sys.hexversion < 0x01050200:
> > -    # os.makedirs() is the limiter
> > -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> > +if sys.hexversion < 0x02000000:
> > +    # string.encode() is the limiter
> > +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
> >      sys.exit(1)
> >  
> >  def get_repo(alias, url):
> > @@ -45,7 +45,7 @@ def get_repo(alias, url):
> >      repo.get_head()
> >  
> >      hasher = _digest()
> > -    hasher.update(repo.path)
> > +    hasher.update(repo.path.encode('utf-8'))
> >      repo.hash = hasher.hexdigest()
> >  
> >      repo.get_base_path = lambda base: os.path.join(

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 21:00     ` John Keeping
@ 2013-01-17 21:05       ` John Keeping
  2013-01-17 22:24       ` Junio C Hamano
  1 sibling, 0 replies; 53+ messages in thread
From: John Keeping @ 2013-01-17 21:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Pete Wyckoff

On Thu, Jan 17, 2013 at 09:00:48PM +0000, John Keeping wrote:
> On Thu, Jan 17, 2013 at 12:36:33PM -0800, Junio C Hamano wrote:
>> John Keeping <john@keeping.me.uk> writes:
>> 
>>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>>> unicode string.  Explicitly encode the argument to this method as UTF-8
>>> so that this code works under Python 3.
>>>
>>> This moves the required Python version forward to 2.0.
>>>
>>> Signed-off-by: John Keeping <john@keeping.me.uk>
>>> ---
>> 
>> Hmph.  So what happens when the path is _not_ encoded in UTF-8?
> 
> Do you mean encodable?  As you say below it will currently throw an
> exception.

Now my brain's not working - we shouldn't get an error converting from a
Unicode string to UTF-8, so I think this patch is OK as it is.

> > Is the repo.hash (and local.hash that gets a copy of it) something
> > that needs to stay the same across multiple invocations of this
> > remote helper, and between the currently shipped Git and the version
> > of Git after applying this patch?
> 
> It's used to specify the path of the repository for importing or
> exporting, so it should stay consistent across invocations.  However,
> this is only an example remote helper so I don't think we should worry
> if it changes from one Git release to the next.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 21:00     ` John Keeping
  2013-01-17 21:05       ` John Keeping
@ 2013-01-17 22:24       ` Junio C Hamano
  2013-01-17 22:30         ` John Keeping
  1 sibling, 1 reply; 53+ messages in thread
From: Junio C Hamano @ 2013-01-17 22:24 UTC (permalink / raw)
  To: John Keeping; +Cc: Michael Haggerty, git, Pete Wyckoff

John Keeping <john@keeping.me.uk> writes:

> You're right - I think we need to add ", errors='replace'" to the call
> to encode.

Of if it is used just as a opaque token, you can .encode('hex') or
something to punt on the whole issue, no?

>
>> >  git-remote-testpy.py | 8 ++++----
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/git-remote-testpy.py b/git-remote-testpy.py
>> > index d94a66a..f8dc196 100644
>> > --- a/git-remote-testpy.py
>> > +++ b/git-remote-testpy.py
>> > @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>> >  from git_remote_helpers.git.importer import GitImporter
>> >  from git_remote_helpers.git.non_local import NonLocalGit
>> >  
>> > -if sys.hexversion < 0x01050200:
>> > -    # os.makedirs() is the limiter
>> > -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
>> > +if sys.hexversion < 0x02000000:
>> > +    # string.encode() is the limiter
>> > +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>> >      sys.exit(1)
>> >  
>> >  def get_repo(alias, url):
>> > @@ -45,7 +45,7 @@ def get_repo(alias, url):
>> >      repo.get_head()
>> >  
>> >      hasher = _digest()
>> > -    hasher.update(repo.path)
>> > +    hasher.update(repo.path.encode('utf-8'))
>> >      repo.hash = hasher.hexdigest()
>> >  
>> >      repo.get_base_path = lambda base: os.path.join(

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 22:24       ` Junio C Hamano
@ 2013-01-17 22:30         ` John Keeping
  2013-01-17 22:57           ` Junio C Hamano
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-17 22:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Pete Wyckoff

On Thu, Jan 17, 2013 at 02:24:37PM -0800, Junio C Hamano wrote:
> John Keeping <john@keeping.me.uk> writes:
> 
>> You're right - I think we need to add ", errors='replace'" to the call
>> to encode.
> 
> Of if it is used just as a opaque token, you can .encode('hex') or
> something to punt on the whole issue, no?

Even better.  Are you happy to squash that in (assuming nothing else
comes up) or shall I resend?

>>>>  git-remote-testpy.py | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
>>>> index d94a66a..f8dc196 100644
>>>> --- a/git-remote-testpy.py
>>>> +++ b/git-remote-testpy.py
>>>> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>>>>  from git_remote_helpers.git.importer import GitImporter
>>>>  from git_remote_helpers.git.non_local import NonLocalGit
>>>>  
>>>> -if sys.hexversion < 0x01050200:
>>>> -    # os.makedirs() is the limiter
>>>> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
>>>> +if sys.hexversion < 0x02000000:
>>>> +    # string.encode() is the limiter
>>>> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>>>>      sys.exit(1)
>>>>  
>>>>  def get_repo(alias, url):
>>>> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>>>>      repo.get_head()
>>>>  
>>>>      hasher = _digest()
>>>> -    hasher.update(repo.path)
>>>> +    hasher.update(repo.path.encode('utf-8'))
>>>>      repo.hash = hasher.hexdigest()
>>>>  
>>>>      repo.get_base_path = lambda base: os.path.join(

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly
  2013-01-17 22:30         ` John Keeping
@ 2013-01-17 22:57           ` Junio C Hamano
  0 siblings, 0 replies; 53+ messages in thread
From: Junio C Hamano @ 2013-01-17 22:57 UTC (permalink / raw)
  To: John Keeping; +Cc: Michael Haggerty, git, Pete Wyckoff

John Keeping <john@keeping.me.uk> writes:

> On Thu, Jan 17, 2013 at 02:24:37PM -0800, Junio C Hamano wrote:
>> John Keeping <john@keeping.me.uk> writes:
>> 
>>> You're right - I think we need to add ", errors='replace'" to the call
>>> to encode.
>> 
>> Of if it is used just as a opaque token, you can .encode('hex') or
>> something to punt on the whole issue, no?
>
> Even better.  Are you happy to squash that in (assuming nothing else
> comes up) or shall I resend?

If you go the .encode('hex') route, the log message needs to explain
why the hashed values are now different from the old implementation
and justify why it is safe to do so.  I do not think I want to do
that myself ;-).

Thanks.


>
>>>>>  git-remote-testpy.py | 8 ++++----
>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
>>>>> index d94a66a..f8dc196 100644
>>>>> --- a/git-remote-testpy.py
>>>>> +++ b/git-remote-testpy.py
>>>>> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>>>>>  from git_remote_helpers.git.importer import GitImporter
>>>>>  from git_remote_helpers.git.non_local import NonLocalGit
>>>>>  
>>>>> -if sys.hexversion < 0x01050200:
>>>>> -    # os.makedirs() is the limiter
>>>>> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
>>>>> +if sys.hexversion < 0x02000000:
>>>>> +    # string.encode() is the limiter
>>>>> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>>>>>      sys.exit(1)
>>>>>  
>>>>>  def get_repo(alias, url):
>>>>> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>>>>>      repo.get_head()
>>>>>  
>>>>>      hasher = _digest()
>>>>> -    hasher.update(repo.path)
>>>>> +    hasher.update(repo.path.encode('utf-8'))
>>>>>      repo.hash = hasher.hexdigest()
>>>>>  
>>>>>      repo.get_base_path = lambda base: os.path.join(

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 8/8] git-remote-testpy: call print as a function
  2013-01-17 18:54 ` [PATCH v2 8/8] git-remote-testpy: call print as a function John Keeping
@ 2013-01-18  3:48   ` Sverre Rabbelier
  0 siblings, 0 replies; 53+ messages in thread
From: Sverre Rabbelier @ 2013-01-18  3:48 UTC (permalink / raw)
  To: John Keeping; +Cc: Junio C Hamano, Michael Haggerty, Git List, Pete Wyckoff

Looks harmless enough.

Acked-by: Sverre Rabbelier <srabbelier@gmail.com>

On Thu, Jan 17, 2013 at 10:54 AM, John Keeping <john@keeping.me.uk> wrote:
> This is harmless in Python 2, which sees the parentheses as redundant
> grouping, but is required for Python 3.  Since this is the only change
> required to make this script just run under Python 3 without needing
> 2to3 it seems worthwhile.
>
> The case of an empty print must be handled specially because in that
> case Python 2 will interpret '()' as an empty tuple and print it as
> '()'; inserting an empty string fixes this.
>
> Signed-off-by: John Keeping <john@keeping.me.uk>
> ---
>  git-remote-testpy.py | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> index bc5e3cf..ccdb2dc 100644
> --- a/git-remote-testpy.py
> +++ b/git-remote-testpy.py
> @@ -87,9 +87,9 @@ def do_capabilities(repo, args):
>      """Prints the supported capabilities.
>      """
>
> -    print "import"
> -    print "export"
> -    print "refspec refs/heads/*:%s*" % repo.prefix
> +    print("import")
> +    print("export")
> +    print("refspec refs/heads/*:%s*" % repo.prefix)
>
>      dirname = repo.get_base_path(repo.gitdir)
>
> @@ -98,11 +98,11 @@ def do_capabilities(repo, args):
>
>      path = os.path.join(dirname, 'git.marks')
>
> -    print "*export-marks %s" % path
> +    print("*export-marks %s" % path)
>      if os.path.exists(path):
> -        print "*import-marks %s" % path
> +        print("*import-marks %s" % path)
>
> -    print # end capabilities
> +    print('') # end capabilities
>
>
>  def do_list(repo, args):
> @@ -115,16 +115,16 @@ def do_list(repo, args):
>
>      for ref in repo.revs:
>          debug("? refs/heads/%s", ref)
> -        print "? refs/heads/%s" % ref
> +        print("? refs/heads/%s" % ref)
>
>      if repo.head:
>          debug("@refs/heads/%s HEAD" % repo.head)
> -        print "@refs/heads/%s HEAD" % repo.head
> +        print("@refs/heads/%s HEAD" % repo.head)
>      else:
>          debug("@refs/heads/master HEAD")
> -        print "@refs/heads/master HEAD"
> +        print("@refs/heads/master HEAD")
>
> -    print # end list
> +    print('') # end list
>
>
>  def update_local_repo(repo):
> @@ -164,7 +164,7 @@ def do_import(repo, args):
>          ref = line[7:].strip()
>          refs.append(ref)
>
> -    print "feature done"
> +    print("feature done")
>
>      if os.environ.get("GIT_REMOTE_TESTGIT_FAILURE"):
>          die('Told to fail')
> @@ -172,7 +172,7 @@ def do_import(repo, args):
>      repo = update_local_repo(repo)
>      repo.exporter.export_repo(repo.gitdir, refs)
>
> -    print "done"
> +    print("done")
>
>
>  def do_export(repo, args):
> @@ -192,8 +192,8 @@ def do_export(repo, args):
>          repo.non_local.push(repo.gitdir)
>
>      for ref in changed:
> -        print "ok %s" % ref
> -    print
> +        print("ok %s" % ref)
> +    print('')
>
>
>  COMMANDS = {
> --
> 1.8.1.1.260.g99b33f4.dirty
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O
  2013-01-17 18:54 ` [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
@ 2013-01-18  3:50   ` Sverre Rabbelier
  0 siblings, 0 replies; 53+ messages in thread
From: Sverre Rabbelier @ 2013-01-18  3:50 UTC (permalink / raw)
  To: John Keeping; +Cc: Junio C Hamano, Michael Haggerty, Git List, Pete Wyckoff

On Thu, Jan 17, 2013 at 10:54 AM, John Keeping <john@keeping.me.uk> wrote:
> -    sys.stdin = os.fdopen(sys.stdin.fileno(), 'r', 0)
> +    sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)

It is not immediately obvious why you would open stdin in rb mode,
please add a comment.

--
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3
  2013-01-17 18:53 ` [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3 John Keeping
@ 2013-01-18  5:15   ` Sverre Rabbelier
  2013-01-18 10:32     ` John Keeping
  0 siblings, 1 reply; 53+ messages in thread
From: Sverre Rabbelier @ 2013-01-18  5:15 UTC (permalink / raw)
  To: John Keeping; +Cc: Junio C Hamano, Michael Haggerty, Git List, Pete Wyckoff

On Thu, Jan 17, 2013 at 10:53 AM, John Keeping <john@keeping.me.uk> wrote:
> [1] http://wiki.python.org/moin/PortingPythonToPy3k

This link seems dead.

--
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3
  2013-01-18  5:15   ` Sverre Rabbelier
@ 2013-01-18 10:32     ` John Keeping
  2013-01-19  7:52       ` Sverre Rabbelier
  0 siblings, 1 reply; 53+ messages in thread
From: John Keeping @ 2013-01-18 10:32 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Junio C Hamano, Michael Haggerty, Git List, Pete Wyckoff

On Thu, Jan 17, 2013 at 09:15:08PM -0800, Sverre Rabbelier wrote:
> On Thu, Jan 17, 2013 at 10:53 AM, John Keeping <john@keeping.me.uk> wrote:
> > [1] http://wiki.python.org/moin/PortingPythonToPy3k
> 
> This link seems dead.

Looks like the Python wiki is down [1].

I'll replace it with [2] since the content is similar and it should be
easier to find a mirror of the Python documentation than of the wiki.

[1] http://pyfound.blogspot.co.uk/2013/01/wikipythonorg-compromised.html
[2] http://docs.python.org/3.3/howto/pyporting.html#during-installation


John

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3
  2013-01-18 10:32     ` John Keeping
@ 2013-01-19  7:52       ` Sverre Rabbelier
  0 siblings, 0 replies; 53+ messages in thread
From: Sverre Rabbelier @ 2013-01-19  7:52 UTC (permalink / raw)
  To: John Keeping; +Cc: Junio C Hamano, Michael Haggerty, Git List, Pete Wyckoff

Assuming you tried this out on both 2.x and 3.x:

Acked-by: Sverre Rabbelier <srabbelier@gmail.com>

On Fri, Jan 18, 2013 at 2:32 AM, John Keeping <john@keeping.me.uk> wrote:
> On Thu, Jan 17, 2013 at 09:15:08PM -0800, Sverre Rabbelier wrote:
>> On Thu, Jan 17, 2013 at 10:53 AM, John Keeping <john@keeping.me.uk> wrote:
>> > [1] http://wiki.python.org/moin/PortingPythonToPy3k
>>
>> This link seems dead.
>
> Looks like the Python wiki is down [1].
>
> I'll replace it with [2] since the content is similar and it should be
> easier to find a mirror of the Python documentation than of the wiki.
>
> [1] http://pyfound.blogspot.co.uk/2013/01/wikipythonorg-compromised.html
> [2] http://docs.python.org/3.3/howto/pyporting.html#during-installation
>
>
> John



-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2013-01-19  7:53 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-12 19:23 [PATCH 0/8] Initial support for Python 3 John Keeping
2013-01-12 19:23 ` [PATCH 1/8] git_remote_helpers: Allow building with " John Keeping
2013-01-12 19:23 ` [PATCH 2/8] git_remote_helpers: fix input when running under " John Keeping
2013-01-13  3:26   ` Michael Haggerty
2013-01-13 16:17     ` John Keeping
2013-01-14  4:48       ` Michael Haggerty
2013-01-14  9:47         ` John Keeping
2013-01-15 19:48           ` [RFC/PATCH 2/8 v2] " John Keeping
2013-01-15 20:51             ` Junio C Hamano
2013-01-15 21:54               ` John Keeping
2013-01-15 22:04                 ` Junio C Hamano
2013-01-15 22:40                   ` [RFC/PATCH 2/8 v3] " John Keeping
2013-01-16  0:03                     ` Pete Wyckoff
2013-01-16  9:45                       ` John Keeping
2013-01-17  0:29                         ` Pete Wyckoff
2013-01-12 19:23 ` [PATCH 3/8] git_remote_helpers: Force rebuild if python version changes John Keeping
2013-01-12 23:30   ` Pete Wyckoff
2013-01-13 16:26     ` John Keeping
2013-01-13 17:14       ` Pete Wyckoff
2013-01-13 17:52         ` John Keeping
2013-01-15 22:58           ` John Keeping
2013-01-17  0:27             ` Pete Wyckoff
2013-01-12 19:23 ` [PATCH 4/8] git_remote_helpers: Use 2to3 if building with Python 3 John Keeping
2013-01-12 19:23 ` [PATCH 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
2013-01-12 19:23 ` [PATCH 6/8] git-remote-testpy: hash bytes explicitly John Keeping
2013-01-12 19:23 ` [PATCH 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
2013-01-12 19:23 ` [PATCH 8/8] git-remote-testpy: call print as a function John Keeping
2013-01-12 23:43 ` [PATCH 0/8] Initial support for Python 3 Pete Wyckoff
2013-01-13  0:41   ` John Keeping
2013-01-13 12:34     ` John Keeping
2013-01-13 16:40     ` Pete Wyckoff
2013-01-13 17:35       ` John Keeping
2013-01-17 18:53 ` [PATCH v2 0/8] Initial Python 3 support John Keeping
2013-01-17 18:53 ` [PATCH v2 1/8] git_remote_helpers: allow building with Python 3 John Keeping
2013-01-17 18:53 ` [PATCH v2 2/8] git_remote_helpers: fix input when running under " John Keeping
2013-01-17 18:53 ` [PATCH v2 3/8] git_remote_helpers: force rebuild if python version changes John Keeping
2013-01-17 18:53 ` [PATCH v2 4/8] git_remote_helpers: use 2to3 if building with Python 3 John Keeping
2013-01-18  5:15   ` Sverre Rabbelier
2013-01-18 10:32     ` John Keeping
2013-01-19  7:52       ` Sverre Rabbelier
2013-01-17 18:53 ` [PATCH v2 5/8] svn-fe: allow svnrdump_sim.py to run " John Keeping
2013-01-17 18:53 ` [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly John Keeping
2013-01-17 20:36   ` Junio C Hamano
2013-01-17 20:43     ` Junio C Hamano
2013-01-17 21:00     ` John Keeping
2013-01-17 21:05       ` John Keeping
2013-01-17 22:24       ` Junio C Hamano
2013-01-17 22:30         ` John Keeping
2013-01-17 22:57           ` Junio C Hamano
2013-01-17 18:54 ` [PATCH v2 7/8] git-remote-testpy: don't do unbuffered text I/O John Keeping
2013-01-18  3:50   ` Sverre Rabbelier
2013-01-17 18:54 ` [PATCH v2 8/8] git-remote-testpy: call print as a function John Keeping
2013-01-18  3:48   ` Sverre Rabbelier

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).