git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 00/14] git-p4: python3 compatibility
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
@ 2019-12-13 23:52 ` Yang Zhao
  2020-01-17 22:00   ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 01/14] git-p4: make python2.7 the oldest supported version Yang Zhao
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire


This patchset adds python3 compatibility to git-p4.

While further clean-up refactoring would have been nice, I specifically avoided
making any major changes to the internal API, aiming to have passing tests
with as few changes as possible.

CI results can be seen from this GitHub PR: https://github.com/git/git/pull/673

Changes since v1:
  - incorporate Ben's change on dropping usage of basestring
  - don't alias string types based on python verson
  - use feature detection instead of version checking to detect python3
  - CI patch now at end of series; needs further discussion
  - some general clean-up to commit messages

Ben Keene (1):
  git-p4: change the expansion test from basestring to list

Yang Zhao (13):
  git-p4: make python2.7 the oldest supported version
  git-p4: remove string type aliasing
  git-p4: encode/decode communication with p4 for python3
  git-p4: encode/decode communication with git for python 4
  git-p4: convert path to unicode before processing them
  git-p4: open .gitp4-usercache.txt in text mode
  git-p4: use marshal format version 2 when sending to p4
  git-p4: fix freezing while waiting for fast-import progress
  git-p4: use functools.reduce instead of reduce
  git-p4: use dict.items() iteration for python3 compatibility
  git-p4: simplify regex pattern generation for parsing diff-tree
  git-p4: use python3's input() everywhere
  ci: also run linux-gcc pipeline with python3.5 environment

 azure-pipelines.yml |  11 ++
 git-p4.py           | 237 ++++++++++++++++++++++++++------------------
 2 files changed, 152 insertions(+), 96 deletions(-)

-- 
2.21.0.windows.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 01/14] git-p4: make python2.7 the oldest supported version
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
  2019-12-13 23:52 ` [PATCH v2 00/14] git-p4: python3 compatibility Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 02/14] git-p4: change the expansion test from basestring to list Yang Zhao
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Python2.6 and earlier have been end-of-life'd for many years now, and
we actually already use 2.7-only features in the code. Make the version
check reflect current realities.

This also removes the need to explicitly define CalledProcessError if
it's not available.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 60c73b6a37..37777bb9fd 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -8,9 +8,8 @@
 # License: MIT <http://www.opensource.org/licenses/mit-license.php>
 #
 import sys
-if sys.hexversion < 0x02040000:
-    # The limiter is the subprocess module
-    sys.stderr.write("git-p4: requires Python 2.4 or later.\n")
+if sys.version_info.major < 3 and sys.version_info.minor < 7:
+    sys.stderr.write("git-p4: requires Python 2.7 or later.\n")
     sys.exit(1)
 import os
 import optparse
@@ -43,21 +42,6 @@
     bytes = str
     basestring = basestring
 
-try:
-    from subprocess import CalledProcessError
-except ImportError:
-    # from python2.7:subprocess.py
-    # Exception classes used by this module.
-    class CalledProcessError(Exception):
-        """This exception is raised when a process run by check_call() returns
-        a non-zero exit status.  The exit status will be stored in the
-        returncode attribute."""
-        def __init__(self, returncode, cmd):
-            self.returncode = returncode
-            self.cmd = cmd
-        def __str__(self):
-            return "Command '%s' returned non-zero exit status %d" % (self.cmd, self.returncode)
-
 verbose = False
 
 # Only labels/tags matching this will be imported/exported
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 02/14] git-p4: change the expansion test from basestring to list
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
  2019-12-13 23:52 ` [PATCH v2 00/14] git-p4: python3 compatibility Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 01/14] git-p4: make python2.7 the oldest supported version Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 03/14] git-p4: remove string type aliasing Yang Zhao
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Ben Keene, luke, liu.denton, Yang Zhao

From: Ben Keene <seraphire@gmail.com>

Python 3 handles strings differently than Python 2.7. Since Python 2
is reaching it's end of life, a series of changes are being submitted to
enable python 3.5 and following support. The current code fails basic
tests under python 3.5.

Some codepaths can represent a command line the program
internally prepares to execute either as a single string
(i.e. each token properly quoted, concatenated with $IFS) or
as a list of argv[] elements, and there are 9 places where
we say "if X is isinstance(_, basestring), then do this
thing to handle X as a command line in a single string; if
not, X is a command line in a list form".

This does not work well with Python 3, as there is no
basestring (everything is Unicode now), and even with Python
2, it was not an ideal way to tell the two cases apart,
because an internally formed command line could have been in
a single Unicode string.

Flip the check to say "if X is not a list, then handle X as
a command line in a single string; otherwise treat it as a
command line in a list form".

This will get rid of references to 'basestring', to migrate
the code ready for Python 3.

Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 37777bb9fd..2f177fb43b 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -89,7 +89,7 @@ def p4_build_cmd(cmd):
         # Provide a way to not pass this option by setting git-p4.retries to 0
         real_cmd += ["-r", str(retries)]
 
-    if isinstance(cmd,basestring):
+    if not isinstance(cmd, list):
         real_cmd = ' '.join(real_cmd) + ' ' + cmd
     else:
         real_cmd += cmd
@@ -155,7 +155,7 @@ def write_pipe(c, stdin):
     if verbose:
         sys.stderr.write('Writing pipe: %s\n' % str(c))
 
-    expand = isinstance(c,basestring)
+    expand = not isinstance(c, list)
     p = subprocess.Popen(c, stdin=subprocess.PIPE, shell=expand)
     pipe = p.stdin
     val = pipe.write(stdin)
@@ -177,7 +177,7 @@ def read_pipe_full(c):
     if verbose:
         sys.stderr.write('Reading pipe: %s\n' % str(c))
 
-    expand = isinstance(c,basestring)
+    expand = not isinstance(c, list)
     p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=expand)
     (out, err) = p.communicate()
     return (p.returncode, out, err)
@@ -213,7 +213,7 @@ def read_pipe_lines(c):
     if verbose:
         sys.stderr.write('Reading pipe: %s\n' % str(c))
 
-    expand = isinstance(c, basestring)
+    expand = not isinstance(c, list)
     p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand)
     pipe = p.stdout
     val = pipe.readlines()
@@ -256,7 +256,7 @@ def p4_has_move_command():
     return True
 
 def system(cmd, ignore_error=False):
-    expand = isinstance(cmd,basestring)
+    expand = not isinstance(cmd, list)
     if verbose:
         sys.stderr.write("executing %s\n" % str(cmd))
     retcode = subprocess.call(cmd, shell=expand)
@@ -268,7 +268,7 @@ def system(cmd, ignore_error=False):
 def p4_system(cmd):
     """Specifically invoke p4 as the system command. """
     real_cmd = p4_build_cmd(cmd)
-    expand = isinstance(real_cmd, basestring)
+    expand = not isinstance(real_cmd, list)
     retcode = subprocess.call(real_cmd, shell=expand)
     if retcode:
         raise CalledProcessError(retcode, real_cmd)
@@ -506,7 +506,7 @@ def getP4OpenedType(file):
 # Return the set of all p4 labels
 def getP4Labels(depotPaths):
     labels = set()
-    if isinstance(depotPaths,basestring):
+    if not isinstance(depotPaths, list):
         depotPaths = [depotPaths]
 
     for l in p4CmdList(["labels"] + ["%s..." % p for p in depotPaths]):
@@ -593,7 +593,7 @@ def isModeExecChanged(src_mode, dst_mode):
 def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
         errors_as_exceptions=False):
 
-    if isinstance(cmd,basestring):
+    if not isinstance(cmd, list):
         cmd = "-G " + cmd
         expand = True
     else:
@@ -610,7 +610,7 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
     stdin_file = None
     if stdin is not None:
         stdin_file = tempfile.TemporaryFile(prefix='p4-stdin', mode=stdin_mode)
-        if isinstance(stdin,basestring):
+        if not isinstance(stdin, list):
             stdin_file.write(stdin)
         else:
             for i in stdin:
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 03/14] git-p4: remove string type aliasing
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (2 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 02/14] git-p4: change the expansion test from basestring to list Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3 Yang Zhao
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Now that python2.7 is the minimum required version and we no longer use
the basestring type, it is not necessary to use type aliasing to ensure
python3 compatibility.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 2f177fb43b..153aff16f3 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -26,22 +26,6 @@
 import ctypes
 import errno
 
-# support basestring in python3
-try:
-    unicode = unicode
-except NameError:
-    # 'unicode' is undefined, must be Python 3
-    str = str
-    unicode = str
-    bytes = bytes
-    basestring = (str,bytes)
-else:
-    # 'unicode' exists, must be Python 2
-    str = str
-    unicode = unicode
-    bytes = str
-    basestring = basestring
-
 verbose = False
 
 # Only labels/tags matching this will be imported/exported
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (3 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 03/14] git-p4: remove string type aliasing Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-17 22:51   ` Junio C Hamano
  2019-12-13 23:52 ` [PATCH v2 05/14] git-p4: encode/decode communication with git " Yang Zhao
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

The marshalled dict in the response given on STDOUT by p4 uses `str` for
keys and string values. When run using python3, these values are
deserialized as `bytes`, leading to a whole host of problems as the rest
of the code assumes `str` is used throughout.

This patch changes the deserialization behaviour such that, as much as
possible, text output from p4 is decoded to native unicode strings.
Exceptions are made for the field `data` as it is usually arbitrary
binary data. `depotFile[0-9]*`, `path`, and `clientFile` are also exempt
as they contain path strings not encoded with UTF-8, and must survive
survive round-trip back to p4.

Conversely, text data being piped to p4 must always be encoded when
running under python3.

encode_text_stream() and decode_text_stream() were added to make these
transformations more convenient.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 59 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 13 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 153aff16f3..ca891e3d5d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -135,6 +135,21 @@ def die(msg):
         sys.stderr.write(msg + "\n")
         sys.exit(1)
 
+# We need different encoding/decoding strategies for text data being passed
+# around in pipes depending on python version
+if bytes is not str:
+    # For python3, always encode and decode as appropriate
+    def decode_text_stream(s):
+        return s.decode() if isinstance(s, bytes) else s
+    def encode_text_stream(s):
+        return s.encode() if isinstance(s, str) else s
+else:
+    # For python2.7, pass read strings as-is, but also allow writing unicode
+    def decode_text_stream(s):
+        return s
+    def encode_text_stream(s):
+        return s.encode('utf_8') if isinstance(s, unicode) else s
+
 def write_pipe(c, stdin):
     if verbose:
         sys.stderr.write('Writing pipe: %s\n' % str(c))
@@ -151,6 +166,8 @@ def write_pipe(c, stdin):
 
 def p4_write_pipe(c, stdin):
     real_cmd = p4_build_cmd(c)
+    if bytes is not str and isinstance(stdin, str):
+        stdin = encode_text_stream(stdin)
     return write_pipe(real_cmd, stdin)
 
 def read_pipe_full(c):
@@ -164,7 +181,7 @@ def read_pipe_full(c):
     expand = not isinstance(c, list)
     p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=expand)
     (out, err) = p.communicate()
-    return (p.returncode, out, err)
+    return (p.returncode, out, decode_text_stream(err))
 
 def read_pipe(c, ignore_error=False):
     """ Read output from  command. Returns the output text on
@@ -187,11 +204,11 @@ def read_pipe_text(c):
     if retcode != 0:
         return None
     else:
-        return out.rstrip()
+        return decode_text_stream(out).rstrip()
 
-def p4_read_pipe(c, ignore_error=False):
+def p4_read_pipe(c, ignore_error=False, raw=False):
     real_cmd = p4_build_cmd(c)
-    return read_pipe(real_cmd, ignore_error)
+    return read_pipe(real_cmd, ignore_error, raw=raw)
 
 def read_pipe_lines(c):
     if verbose:
@@ -200,7 +217,7 @@ def read_pipe_lines(c):
     expand = not isinstance(c, list)
     p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand)
     pipe = p.stdout
-    val = pipe.readlines()
+    val = [decode_text_stream(line) for line in pipe.readlines()]
     if pipe.close() or p.wait():
         die('Command failed: %s' % str(c))
 
@@ -231,6 +248,7 @@ def p4_has_move_command():
     cmd = p4_build_cmd(["move", "-k", "@from", "@to"])
     p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     (out, err) = p.communicate()
+    err = decode_text_stream(err)
     # return code will be 1 in either case
     if err.find("Invalid option") >= 0:
         return False
@@ -611,6 +629,20 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
     try:
         while True:
             entry = marshal.load(p4.stdout)
+            if bytes is not str:
+                # Decode unmarshalled dict to use str keys and values, except for:
+                #   - `data` which may contain arbitrary binary data
+                #   - `depotFile[0-9]*`, `path`, or `clientFile` which may contain non-UTF8 encoded text
+                decoded_entry = {}
+                for key, value in entry.items():
+                    key = key.decode()
+                    if isinstance(value, bytes) and not (key in ('data', 'path', 'clientFile') or key.startswith('depotFile')):
+                        value = value.decode()
+                    decoded_entry[key] = value
+                # Parse out data if it's an error response
+                if decoded_entry.get('code') == 'error' and 'data' in decoded_entry:
+                    decoded_entry['data'] = decoded_entry['data'].decode()
+                entry = decoded_entry
             if skip_info:
                 if 'code' in entry and entry['code'] == 'info':
                     continue
@@ -828,6 +860,7 @@ def branch_exists(branch):
     cmd = [ "git", "rev-parse", "--symbolic", "--verify", branch ]
     p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     out, _ = p.communicate()
+    out = decode_text_stream(out)
     if p.returncode:
         return False
     # expect exactly one line of output: the branch name
@@ -1971,7 +2004,7 @@ def applyCommit(self, id):
         tmpFile = os.fdopen(handle, "w+b")
         if self.isWindows:
             submitTemplate = submitTemplate.replace("\n", "\r\n")
-        tmpFile.write(submitTemplate)
+        tmpFile.write(encode_text_stream(submitTemplate))
         tmpFile.close()
 
         if self.prepare_p4_only:
@@ -2018,7 +2051,7 @@ def applyCommit(self, id):
             if self.edit_template(fileName):
                 # read the edited message and submit
                 tmpFile = open(fileName, "rb")
-                message = tmpFile.read()
+                message = decode_text_stream(tmpFile.read())
                 tmpFile.close()
                 if self.isWindows:
                     message = message.replace("\r\n", "\n")
@@ -2707,7 +2740,7 @@ def splitFilesIntoBranches(self, commit):
         return branches
 
     def writeToGitStream(self, gitMode, relPath, contents):
-        self.gitStream.write('M %s inline %s\n' % (gitMode, relPath))
+        self.gitStream.write(encode_text_stream(u'M {} inline {}\n'.format(gitMode, relPath)))
         self.gitStream.write('data %d\n' % sum(len(d) for d in contents))
         for d in contents:
             self.gitStream.write(d)
@@ -2748,7 +2781,7 @@ def streamOneP4File(self, file, contents):
             git_mode = "120000"
             # p4 print on a symlink sometimes contains "target\n";
             # if it does, remove the newline
-            data = ''.join(contents)
+            data = ''.join(decode_text_stream(c) for c in contents)
             if not data:
                 # Some version of p4 allowed creating a symlink that pointed
                 # to nothing.  This causes p4 errors when checking out such
@@ -2802,7 +2835,7 @@ def streamOneP4File(self, file, contents):
         pattern = p4_keywords_regexp_for_type(type_base, type_mods)
         if pattern:
             regexp = re.compile(pattern, re.VERBOSE)
-            text = ''.join(contents)
+            text = ''.join(decode_text_stream(c) for c in contents)
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]
 
@@ -2817,7 +2850,7 @@ def streamOneP4Deletion(self, file):
         if verbose:
             sys.stdout.write("delete %s\n" % relPath)
             sys.stdout.flush()
-        self.gitStream.write("D %s\n" % relPath)
+        self.gitStream.write(encode_text_stream(u'D {}\n'.format(relPath)))
 
         if self.largeFileSystem and self.largeFileSystem.isLargeFile(relPath):
             self.largeFileSystem.removeLargeFile(relPath)
@@ -2917,9 +2950,9 @@ def streamP4FilesCbSelf(entry):
                 if 'shelved_cl' in f:
                     # Handle shelved CLs using the "p4 print file@=N" syntax to print
                     # the contents
-                    fileArg = '%s@=%d' % (f['path'], f['shelved_cl'])
+                    fileArg = f['path'] + encode_text_stream('@={}'.format(f['shelved_cl']))
                 else:
-                    fileArg = '%s#%s' % (f['path'], f['rev'])
+                    fileArg = f['path'] + encode_text_stream('#{}'.format(f['rev']))
 
                 fileArgs.append(fileArg)
 
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 05/14] git-p4: encode/decode communication with git for python3
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (4 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3 Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 06/14] git-p4: convert path to unicode before processing them Yang Zhao
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Under python3, calls to write() on the stream to `git fast-import` must
be encoded.  This patch wraps the IO object such that this encoding is
done transparently.

Conversely, any text data read from subprocesses must also be decoded
before running through the rest of the pipeline.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index ca891e3d5d..d62fb05989 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -183,10 +183,12 @@ def read_pipe_full(c):
     (out, err) = p.communicate()
     return (p.returncode, out, decode_text_stream(err))
 
-def read_pipe(c, ignore_error=False):
+def read_pipe(c, ignore_error=False, raw=False):
     """ Read output from  command. Returns the output text on
         success. On failure, terminates execution, unless
         ignore_error is True, when it returns an empty string.
+
+        If raw is True, do not attempt to decode output text.
     """
     (retcode, out, err) = read_pipe_full(c)
     if retcode != 0:
@@ -194,6 +196,8 @@ def read_pipe(c, ignore_error=False):
             out = ""
         else:
             die('Command failed: %s\nError: %s' % (str(c), err))
+    if not raw:
+        out = decode_text_stream(out)
     return out
 
 def read_pipe_text(c):
@@ -220,7 +224,6 @@ def read_pipe_lines(c):
     val = [decode_text_stream(line) for line in pipe.readlines()]
     if pipe.close() or p.wait():
         die('Command failed: %s' % str(c))
-
     return val
 
 def p4_read_pipe_lines(c):
@@ -616,7 +619,8 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
             stdin_file.write(stdin)
         else:
             for i in stdin:
-                stdin_file.write(i + '\n')
+                stdin_file.write(encode_text_stream(i))
+                stdin_file.write(b'\n')
         stdin_file.flush()
         stdin_file.seek(0)
 
@@ -1245,7 +1249,7 @@ def generatePointer(self, contentFile):
             ['git', 'lfs', 'pointer', '--file=' + contentFile],
             stdout=subprocess.PIPE
         )
-        pointerFile = pointerProcess.stdout.read()
+        pointerFile = decode_text_stream(pointerProcess.stdout.read())
         if pointerProcess.wait():
             os.remove(contentFile)
             die('git-lfs pointer command failed. Did you install the extension?')
@@ -3538,6 +3542,15 @@ def openStreams(self):
         self.gitStream = self.importProcess.stdin
         self.gitError = self.importProcess.stderr
 
+        if bytes is not str:
+            # Wrap gitStream.write() so that it can be called using `str` arguments
+            def make_encoded_write(write):
+                def encoded_write(s):
+                    return write(s.encode() if isinstance(s, str) else s)
+                return encoded_write
+
+            self.gitStream.write = make_encoded_write(self.gitStream.write)
+
     def closeStreams(self):
         self.gitStream.close()
         if self.importProcess.wait() != 0:
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 06/14] git-p4: convert path to unicode before processing them
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (5 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 05/14] git-p4: encode/decode communication with git " Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 07/14] git-p4: open .gitp4-usercache.txt in text mode Yang Zhao
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

P4 allows essentially arbitrary encoding for path data while we would
perfer to be dealing only with unicode strings.  Since path data need to
survive round-trip back to p4, this patch implements the general policy
that we store path data as-is, but decode them to unicode before doing
any non-trivial processing.

A new `decode_path()` method is provided that generally does the correct
conversion, taking into account `git-p4.pathEncoding` configuration.

For python2.7, path strings will be left as-is if it only contains ASCII
characters.

For python3, decoding is always done so that we have str objects.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 69 +++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 25 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index d62fb05989..d43f373b2d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -150,6 +150,21 @@ def decode_text_stream(s):
     def encode_text_stream(s):
         return s.encode('utf_8') if isinstance(s, unicode) else s
 
+def decode_path(path):
+    """Decode a given string (bytes or otherwise) using configured path encoding options
+    """
+    encoding = gitConfig('git-p4.pathEncoding') or 'utf_8'
+    if bytes is not str:
+        return path.decode(encoding, errors='replace') if isinstance(path, bytes) else path
+    else:
+        try:
+            path.decode('ascii')
+        except:
+            path = path.decode(encoding, errors='replace')
+            if verbose:
+                print('Path with non-ASCII characters detected. Used {} to decode: {}'.format(encoding, path))
+        return path
+
 def write_pipe(c, stdin):
     if verbose:
         sys.stderr.write('Writing pipe: %s\n' % str(c))
@@ -697,7 +712,8 @@ def p4Where(depotPath):
         if "depotFile" in entry:
             # Search for the base client side depot path, as long as it starts with the branch's P4 path.
             # The base path always ends with "/...".
-            if entry["depotFile"].find(depotPath) == 0 and entry["depotFile"][-4:] == "/...":
+            entry_path = decode_path(entry['depotFile'])
+            if entry_path.find(depotPath) == 0 and entry_path[-4:] == "/...":
                 output = entry
                 break
         elif "data" in entry:
@@ -712,11 +728,11 @@ def p4Where(depotPath):
         return ""
     clientPath = ""
     if "path" in output:
-        clientPath = output.get("path")
+        clientPath = decode_path(output['path'])
     elif "data" in output:
         data = output.get("data")
-        lastSpace = data.rfind(" ")
-        clientPath = data[lastSpace + 1:]
+        lastSpace = data.rfind(b" ")
+        clientPath = decode_path(data[lastSpace + 1:])
 
     if clientPath.endswith("..."):
         clientPath = clientPath[:-3]
@@ -2484,7 +2500,7 @@ def append(self, view_line):
 
     def convert_client_path(self, clientFile):
         # chop off //client/ part to make it relative
-        if not clientFile.startswith(self.client_prefix):
+        if not decode_path(clientFile).startswith(self.client_prefix):
             die("No prefix '%s' on clientFile '%s'" %
                 (self.client_prefix, clientFile))
         return clientFile[len(self.client_prefix):]
@@ -2493,7 +2509,7 @@ def update_client_spec_path_cache(self, files):
         """ Caching file paths by "p4 where" batch query """
 
         # List depot file paths exclude that already cached
-        fileArgs = [f['path'] for f in files if f['path'] not in self.client_spec_path_cache]
+        fileArgs = [f['path'] for f in files if decode_path(f['path']) not in self.client_spec_path_cache]
 
         if len(fileArgs) == 0:
             return  # All files in cache
@@ -2508,16 +2524,18 @@ def update_client_spec_path_cache(self, files):
             if "unmap" in res:
                 # it will list all of them, but only one not unmap-ped
                 continue
+            depot_path = decode_path(res['depotFile'])
             if gitConfigBool("core.ignorecase"):
-                res['depotFile'] = res['depotFile'].lower()
-            self.client_spec_path_cache[res['depotFile']] = self.convert_client_path(res["clientFile"])
+                depot_path = depot_path.lower()
+            self.client_spec_path_cache[depot_path] = self.convert_client_path(res["clientFile"])
 
         # not found files or unmap files set to ""
         for depotFile in fileArgs:
+            depotFile = decode_path(depotFile)
             if gitConfigBool("core.ignorecase"):
                 depotFile = depotFile.lower()
             if depotFile not in self.client_spec_path_cache:
-                self.client_spec_path_cache[depotFile] = ""
+                self.client_spec_path_cache[depotFile] = b''
 
     def map_in_client(self, depot_path):
         """Return the relative location in the client where this
@@ -2635,7 +2653,7 @@ def isPathWanted(self, path):
             elif path.lower() == p.lower():
                 return False
         for p in self.depotPaths:
-            if p4PathStartsWith(path, p):
+            if p4PathStartsWith(path, decode_path(p)):
                 return True
         return False
 
@@ -2644,7 +2662,7 @@ def extractFilesFromCommit(self, commit, shelved=False, shelved_cl = 0):
         fnum = 0
         while "depotFile%s" % fnum in commit:
             path =  commit["depotFile%s" % fnum]
-            found = self.isPathWanted(path)
+            found = self.isPathWanted(decode_path(path))
             if not found:
                 fnum = fnum + 1
                 continue
@@ -2678,7 +2696,7 @@ def stripRepoPath(self, path, prefixes):
         if self.useClientSpec:
             # branch detection moves files up a level (the branch name)
             # from what client spec interpretation gives
-            path = self.clientSpecDirs.map_in_client(path)
+            path = decode_path(self.clientSpecDirs.map_in_client(path))
             if self.detectBranches:
                 for b in self.knownBranches:
                     if p4PathStartsWith(path, b + "/"):
@@ -2712,14 +2730,15 @@ def splitFilesIntoBranches(self, commit):
         branches = {}
         fnum = 0
         while "depotFile%s" % fnum in commit:
-            path =  commit["depotFile%s" % fnum]
+            raw_path = commit["depotFile%s" % fnum]
+            path = decode_path(raw_path)
             found = self.isPathWanted(path)
             if not found:
                 fnum = fnum + 1
                 continue
 
             file = {}
-            file["path"] = path
+            file["path"] = raw_path
             file["rev"] = commit["rev%s" % fnum]
             file["action"] = commit["action%s" % fnum]
             file["type"] = commit["type%s" % fnum]
@@ -2728,7 +2747,7 @@ def splitFilesIntoBranches(self, commit):
             # start with the full relative path where this file would
             # go in a p4 client
             if self.useClientSpec:
-                relPath = self.clientSpecDirs.map_in_client(path)
+                relPath = decode_path(self.clientSpecDirs.map_in_client(path))
             else:
                 relPath = self.stripRepoPath(path, self.depotPaths)
 
@@ -2766,14 +2785,15 @@ def encodeWithUTF8(self, path):
     # - helper for streamP4Files
 
     def streamOneP4File(self, file, contents):
-        relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes)
-        relPath = self.encodeWithUTF8(relPath)
+        file_path = file['depotFile']
+        relPath = self.stripRepoPath(decode_path(file_path), self.branchPrefixes)
+
         if verbose:
             if 'fileSize' in self.stream_file:
                 size = int(self.stream_file['fileSize'])
             else:
                 size = 0 # deleted files don't get a fileSize apparently
-            sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024))
+            sys.stdout.write('\r%s --> %s (%i MB)\n' % (file_path, relPath, size/1024/1024))
             sys.stdout.flush()
 
         (type_base, type_mods) = split_p4_type(file["type"])
@@ -2791,7 +2811,7 @@ def streamOneP4File(self, file, contents):
                 # to nothing.  This causes p4 errors when checking out such
                 # a change, and errors here too.  Work around it by ignoring
                 # the bad symlink; hopefully a future change fixes it.
-                print("\nIgnoring empty symlink in %s" % file['depotFile'])
+                print("\nIgnoring empty symlink in %s" % file_path)
                 return
             elif data[-1] == '\n':
                 contents = [data[:-1]]
@@ -2810,7 +2830,7 @@ def streamOneP4File(self, file, contents):
             # just the native "NT" type.
             #
             try:
-                text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % (file['depotFile'], file['change'])])
+                text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % (decode_path(file['depotFile']), file['change'])], raw=True)
             except Exception as e:
                 if 'Translation of file content failed' in str(e):
                     type_base = 'binary'
@@ -2818,7 +2838,7 @@ def streamOneP4File(self, file, contents):
                     raise e
             else:
                 if p4_version_string().find('/NT') >= 0:
-                    text = text.replace('\r\n', '\n')
+                    text = text.replace(b'\r\n', b'\n')
                 contents = [ text ]
 
         if type_base == "apple":
@@ -2849,8 +2869,7 @@ def streamOneP4File(self, file, contents):
         self.writeToGitStream(git_mode, relPath, contents)
 
     def streamOneP4Deletion(self, file):
-        relPath = self.stripRepoPath(file['path'], self.branchPrefixes)
-        relPath = self.encodeWithUTF8(relPath)
+        relPath = self.stripRepoPath(decode_path(file['path']), self.branchPrefixes)
         if verbose:
             sys.stdout.write("delete %s\n" % relPath)
             sys.stdout.flush()
@@ -3037,8 +3056,8 @@ def commit(self, details, files, branch, parent = "", allow_empty=False):
         if self.clientSpecDirs:
             self.clientSpecDirs.update_client_spec_path_cache(files)
 
-        files = [f for f in files
-            if self.inClientSpec(f['path']) and self.hasBranchPrefix(f['path'])]
+        files = [f for (f, path) in ((f, decode_path(f['path'])) for f in files)
+            if self.inClientSpec(path) and self.hasBranchPrefix(path)]
 
         if gitConfigBool('git-p4.keepEmptyCommits'):
             allow_empty = True
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 07/14] git-p4: open .gitp4-usercache.txt in text mode
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (6 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 06/14] git-p4: convert path to unicode before processing them Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 08/14] git-p4: use marshal format version 2 when sending to p4 Yang Zhao
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Opening .gitp4-usercache.txt in text mode makes python 3 happy without
explicitly adding encoding and decoding.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index d43f373b2d..abcda60eee 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1395,14 +1395,14 @@ def getUserMapFromPerforceServer(self):
         for (key, val) in self.users.items():
             s += "%s\t%s\n" % (key.expandtabs(1), val.expandtabs(1))
 
-        open(self.getUserCacheFilename(), "wb").write(s)
+        open(self.getUserCacheFilename(), 'w').write(s)
         self.userMapFromPerforceServer = True
 
     def loadUserMapFromCache(self):
         self.users = {}
         self.userMapFromPerforceServer = False
         try:
-            cache = open(self.getUserCacheFilename(), "rb")
+            cache = open(self.getUserCacheFilename(), 'r')
             lines = cache.readlines()
             cache.close()
             for line in lines:
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 08/14] git-p4: use marshal format version 2 when sending to p4
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (7 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 07/14] git-p4: open .gitp4-usercache.txt in text mode Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 09/14] git-p4: fix freezing while waiting for fast-import progress Yang Zhao
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

p4 does not appear to understand marshal format version 3 and above.
Version 2 was the latest supported by python-2.7.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index abcda60eee..c7170c9ae6 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1679,7 +1679,8 @@ def modifyChangelistUser(self, changelist, newUser):
         c = changes[0]
         if c['User'] == newUser: return   # nothing to do
         c['User'] = newUser
-        input = marshal.dumps(c)
+        # p4 does not understand format version 3 and above
+        input = marshal.dumps(c, 2)
 
         result = p4CmdList("change -f -i", stdin=input)
         for r in result:
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 09/14] git-p4: fix freezing while waiting for fast-import progress
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (8 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 08/14] git-p4: use marshal format version 2 when sending to p4 Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 10/14] git-p4: use functools.reduce instead of reduce Yang Zhao
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

As part of its importing process, git-p4 sends a `checkpoint` followed
immediately by `progress` to fast-import to force synchronization.
Due to buffering, it is possible for the `progress` command to not be
flushed before git-p4 begins to wait for the corresponding response.
This causes the script to freeze completely, and is consistently
observable at least on python-3.6.9.

Make sure this command sequence is completely flushed before waiting.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/git-p4.py b/git-p4.py
index c7170c9ae6..68f9f4fdc6 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2641,6 +2641,7 @@ def __init__(self):
     def checkpoint(self):
         self.gitStream.write("checkpoint\n\n")
         self.gitStream.write("progress checkpoint\n\n")
+        self.gitStream.flush()
         out = self.gitOutput.readline()
         if self.verbose:
             print("checkpoint finished: " + out)
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 10/14] git-p4: use functools.reduce instead of reduce
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (9 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 09/14] git-p4: fix freezing while waiting for fast-import progress Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 11/14] git-p4: use dict.items() iteration for python3 compatibility Yang Zhao
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

For python3, reduce() has been moved to functools.reduce().  This is
also available in python2.7.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index 68f9f4fdc6..7deee1b939 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -13,6 +13,7 @@
     sys.exit(1)
 import os
 import optparse
+import functools
 import marshal
 import subprocess
 import tempfile
@@ -1158,7 +1159,7 @@ def pushFile(self, localLargeFile):
         assert False, "Method 'pushFile' required in " + self.__class__.__name__
 
     def hasLargeFileExtension(self, relPath):
-        return reduce(
+        return functools.reduce(
             lambda a, b: a or b,
             [relPath.endswith('.' + e) for e in gitConfigList('git-p4.largeFileExtensions')],
             False
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 11/14] git-p4: use dict.items() iteration for python3 compatibility
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (10 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 10/14] git-p4: use functools.reduce instead of reduce Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 12/14] git-p4: simplify regex pattern generation for parsing diff-tree Yang Zhao
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Python3 uses dict.items() instead of .iteritems() to provide
iteratoration over dict.  Although items() is technically less efficient
for python2.7 (allocates a new list instead of simply iterating), the
amount of data involved is very small and the penalty negligible.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index 7deee1b939..b7e31d4738 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1745,7 +1745,7 @@ def prepareSubmitTemplate(self, changelist=None):
                 break
         if not change_entry:
             die('Failed to decode output of p4 change -o')
-        for key, value in change_entry.iteritems():
+        for key, value in change_entry.items():
             if key.startswith('File'):
                 if 'depot-paths' in settings:
                     if not [p for p in settings['depot-paths']
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 12/14] git-p4: simplify regex pattern generation for parsing diff-tree
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (11 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 11/14] git-p4: use dict.items() iteration for python3 compatibility Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [PATCH v2 13/14] git-p4: use python3's input() everywhere Yang Zhao
  2019-12-13 23:52 ` [RFC PATCH v2 14/14] ci: also run linux-gcc pipeline with python3.5 environment Yang Zhao
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

It is not clear why a generator was used to create the regex used to
parse git-diff-tree output; I assume an early implementation required
it, but is not part of the mainline change.

Simply use a lazily initialized global instead.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index b7e31d4738..3af8df9f83 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -544,12 +544,7 @@ def getGitTags():
         gitTags.add(tag)
     return gitTags
 
-def diffTreePattern():
-    # This is a simple generator for the diff tree regex pattern. This could be
-    # a class variable if this and parseDiffTreeEntry were a part of a class.
-    pattern = re.compile(':(\d+) (\d+) (\w+) (\w+) ([A-Z])(\d+)?\t(.*?)((\t(.*))|$)')
-    while True:
-        yield pattern
+_diff_tree_pattern = None
 
 def parseDiffTreeEntry(entry):
     """Parses a single diff tree entry into its component elements.
@@ -570,7 +565,11 @@ def parseDiffTreeEntry(entry):
 
     If the pattern is not matched, None is returned."""
 
-    match = diffTreePattern().next().match(entry)
+    global _diff_tree_pattern
+    if not _diff_tree_pattern:
+        _diff_tree_pattern = re.compile(':(\d+) (\d+) (\w+) (\w+) ([A-Z])(\d+)?\t(.*?)((\t(.*))|$)')
+
+    match = _diff_tree_pattern.match(entry)
     if match:
         return {
             'src_mode': match.group(1),
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 13/14] git-p4: use python3's input() everywhere
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (12 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 12/14] git-p4: simplify regex pattern generation for parsing diff-tree Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  2019-12-13 23:52 ` [RFC PATCH v2 14/14] ci: also run linux-gcc pipeline with python3.5 environment Yang Zhao
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire

Python3 deprecates raw_input() from 2.7 and replaced it with input().
Since we do not need 2.7's input() semantics, `raw_input()` is aliased
to `input()` for easy forward compatability.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---
 git-p4.py | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 3af8df9f83..17f72f4309 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -27,6 +27,16 @@
 import ctypes
 import errno
 
+# On python2.7 where raw_input() and input() are both availble,
+# we want raw_input's semantics, but aliased to input for python3
+# compatibility
+# support basestring in python3
+try:
+    if raw_input and input:
+        input = raw_input
+except:
+    pass
+
 verbose = False
 
 # Only labels/tags matching this will be imported/exported
@@ -1801,7 +1811,7 @@ def edit_template(self, template_file):
             return True
 
         while True:
-            response = raw_input("Submit template unchanged. Submit anyway? [y]es, [n]o (skip this patch) ")
+            response = input("Submit template unchanged. Submit anyway? [y]es, [n]o (skip this patch) ")
             if response == 'y':
                 return True
             if response == 'n':
@@ -2372,7 +2382,7 @@ def run(self, args):
                         # prompt for what to do, or use the option/variable
                         if self.conflict_behavior == "ask":
                             print("What do you want to do?")
-                            response = raw_input("[s]kip this commit but apply"
+                            response = input("[s]kip this commit but apply"
                                                  " the rest, or [q]uit? ")
                             if not response:
                                 continue
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 14/14] ci: also run linux-gcc pipeline with python3.5 environment
       [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
                   ` (13 preceding siblings ...)
  2019-12-13 23:52 ` [PATCH v2 13/14] git-p4: use python3's input() everywhere Yang Zhao
@ 2019-12-13 23:52 ` Yang Zhao
  14 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2019-12-13 23:52 UTC (permalink / raw)
  To: git; +Cc: Yang Zhao, luke, liu.denton, seraphire, szeder.dev

git-p4.py includes support for python3, but this was not previously
validated in CI. Lets actually do that.

As of writing, python-3.5 has reached end-of-life, but has been updated
recently enough that it's reasonable to attemp to support it. We do not
have a pressing need for features only available in 3.6 and later.

Usage of python3 is limited to the linux-gcc pipeline on Azure. It is
assumed that passing both python2 and python3 tests on one platform
translates to doing the same on others.

Travis-CI is unchanged, as no tests are run in those environments.

Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
---

There has been some desire to make this more generally applicable instead
of being Azure Pipelines specific.  There may be some significant work
required to make that work for all platforms where t98** tests are being run.

I most likely won't have the bandwith to take on that task in the near future.
If this patch is deemed insufficient, I would perfer it be dropped form the
series rather than become a roadblock.

Previous discussion thread here:
http://public-inbox.org/git/20191210103014.GF6527@szeder.dev/

 azure-pipelines.yml | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index af2a5ea484..c473365812 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -331,7 +331,18 @@ jobs:
   displayName: linux-gcc
   condition: succeeded()
   pool: Hosted Ubuntu 1604
+  strategy:
+    matrix:
+      python27:
+        python.version: '2.7'
+      python35:
+        python.version: '3.5'
   steps:
+  - task: UsePythonVersion@0
+    inputs:
+      versionSpec: '$(python.version)'
+  - bash: |
+      echo "##vso[task.setvariable variable=python_path]$(which python)"
   - bash: |
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
-- 
2.21.0.windows.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3
  2019-12-13 23:52 ` [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3 Yang Zhao
@ 2019-12-17 22:51   ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2019-12-17 22:51 UTC (permalink / raw)
  To: Yang Zhao; +Cc: git, luke, liu.denton, seraphire

Yang Zhao <yang.zhao@skyboxlabs.com> writes:

> The marshalled dict in the response given on STDOUT by p4 uses `str` for
> keys and string values. When run using python3, these values are
> deserialized as `bytes`, leading to a whole host of problems as the rest
> of the code assumes `str` is used throughout.
>
> This patch changes the deserialization behaviour such that, as much as
> possible, text output from p4 is decoded to native unicode strings.
> Exceptions are made for the field `data` as it is usually arbitrary
> binary data. `depotFile[0-9]*`, `path`, and `clientFile` are also exempt
> as they contain path strings not encoded with UTF-8, and must survive
> survive round-trip back to p4.

Doubled "survive"; will drop one of them while queuing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2019-12-13 23:52 ` [PATCH v2 00/14] git-p4: python3 compatibility Yang Zhao
@ 2020-01-17 22:00   ` Yang Zhao
  2020-01-24 20:14     ` Luke Diamand
  0 siblings, 1 reply; 22+ messages in thread
From: Yang Zhao @ 2020-01-17 22:00 UTC (permalink / raw)
  To: Git Users

On Fri, Dec 13, 2019 at 3:53 PM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> This patchset adds python3 compatibility to git-p4.

Ping?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2020-01-17 22:00   ` Yang Zhao
@ 2020-01-24 20:14     ` Luke Diamand
  2020-01-30 13:35       ` Luke Diamand
  0 siblings, 1 reply; 22+ messages in thread
From: Luke Diamand @ 2020-01-24 20:14 UTC (permalink / raw)
  To: Yang Zhao; +Cc: Git Users

On Fri, 17 Jan 2020 at 22:00, Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
>
> On Fri, Dec 13, 2019 at 3:53 PM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> > This patchset adds python3 compatibility to git-p4.
>
> Ping?

Hi!

Sorry, I've been a bit busy.

I quickly tried this and it was failing for me with:

> failure accessing depot: could not connect

It works fine with the interpreter set to /usr/bin/python, but
changing it to python3 caused this problem.

I'm at 050387ce43179f1b0da364dd6eec10ce578d6583.

Thanks,
Luke

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2020-01-24 20:14     ` Luke Diamand
@ 2020-01-30 13:35       ` Luke Diamand
  2020-02-03 12:54         ` Luke Diamand
  0 siblings, 1 reply; 22+ messages in thread
From: Luke Diamand @ 2020-01-30 13:35 UTC (permalink / raw)
  To: Yang Zhao; +Cc: Git Users

On Fri, 24 Jan 2020 at 20:14, Luke Diamand <luke@diamand.org> wrote:
>
> On Fri, 17 Jan 2020 at 22:00, Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> >
> > On Fri, Dec 13, 2019 at 3:53 PM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> > > This patchset adds python3 compatibility to git-p4.
> >
> > Ping?
>
> Hi!
>
> Sorry, I've been a bit busy.
>
> I quickly tried this and it was failing for me with:
>
> > failure accessing depot: could not connect
>
> It works fine with the interpreter set to /usr/bin/python, but
> changing it to python3 caused this problem.
>
> I'm at 050387ce43179f1b0da364dd6eec10ce578d6583.

I'm using it for day-to-day work, so far it's working well, thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2020-01-30 13:35       ` Luke Diamand
@ 2020-02-03 12:54         ` Luke Diamand
  2020-02-03 18:11           ` Yang Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Luke Diamand @ 2020-02-03 12:54 UTC (permalink / raw)
  To: Yang Zhao; +Cc: Git Users

On Thu, 30 Jan 2020 at 13:35, Luke Diamand <luke@diamand.org> wrote:
>
> On Fri, 24 Jan 2020 at 20:14, Luke Diamand <luke@diamand.org> wrote:
> >
> > On Fri, 17 Jan 2020 at 22:00, Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> > >
> > > On Fri, Dec 13, 2019 at 3:53 PM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
> > > > This patchset adds python3 compatibility to git-p4.
> > >
> > > Ping?

One very small bug report:

When doing "git p4 sync" it prints out the percent complete. It looks
like it's no longer rounding it sensibly, so where before it would say
77%, now it says 77.7777777%.

It's this line:

   sys.stdout.write("\rImporting revision %s (%s%%)" % (change, cnt *
100 / len(changes)))

I think / just needs replacing with //.


> >
> > Hi!
> >
> > Sorry, I've been a bit busy.
> >
> > I quickly tried this and it was failing for me with:
> >
> > > failure accessing depot: could not connect
> >
> > It works fine with the interpreter set to /usr/bin/python, but
> > changing it to python3 caused this problem.
> >
> > I'm at 050387ce43179f1b0da364dd6eec10ce578d6583.
>
> I'm using it for day-to-day work, so far it's working well, thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2020-02-03 12:54         ` Luke Diamand
@ 2020-02-03 18:11           ` Yang Zhao
  2020-02-04  1:35             ` Yang Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Yang Zhao @ 2020-02-03 18:11 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Git Users

On Mon, Feb 3, 2020 at 4:54 AM Luke Diamand <luke@diamand.org> wrote:
> One very small bug report:
>
> When doing "git p4 sync" it prints out the percent complete. It looks
> like it's no longer rounding it sensibly, so where before it would say
> 77%, now it says 77.7777777%.
>
> It's this line:
>
>    sys.stdout.write("\rImporting revision %s (%s%%)" % (change, cnt *
> 100 / len(changes)))
>
> I think / just needs replacing with //.

Good catch.

The patch below should do the trick, and be more explicit about how
we're formatting things.

I'll roll it into the GitHub branch.

diff --git a/git-p4.py b/git-p4.py
index ca0a874501..183959ec8c 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2960,7 +2960,7 @@ def streamP4FilesCb(self, marshalled):
             size = int(self.stream_file["fileSize"])
             if size > 0:
                 progress = 100*self.stream_file['streamContentSize']/size
-                sys.stdout.write('\r%s %d%% (%i MB)' %
(self.stream_file['depotFile'], progress, int(size/1024/1024)))
+                sys.stdout.write('\r{} {:.0%} ({}
MB)'.format(self.stream_file['depotFile'], progress,
int(size/1024/1024)))
                 sys.stdout.flush()

         self.stream_have_file_info = True
@@ -3435,7 +3435,7 @@ def importChanges(self, changes, origin_revision=0):
             self.updateOptionDict(description)

             if not self.silent:
-                sys.stdout.write("\rImporting revision %s (%s%%)" %
(change, cnt * 100 / len(changes)))
+                sys.stdout.write("\rImporting revision {}
({:.0%})".format(change, cnt / len(changes)))
                 sys.stdout.flush()
             cnt = cnt + 1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/14] git-p4: python3 compatibility
  2020-02-03 18:11           ` Yang Zhao
@ 2020-02-04  1:35             ` Yang Zhao
  0 siblings, 0 replies; 22+ messages in thread
From: Yang Zhao @ 2020-02-04  1:35 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Git Users

On Mon, Feb 3, 2020 at 10:11 AM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
>
> On Mon, Feb 3, 2020 at 4:54 AM Luke Diamand <luke@diamand.org> wrote:
> > One very small bug report:
> >
> > When doing "git p4 sync" it prints out the percent complete. It looks
> > like it's no longer rounding it sensibly, so where before it would say
> > 77%, now it says 77.7777777%.
> >
> > It's this line:
> >
> >    sys.stdout.write("\rImporting revision %s (%s%%)" % (change, cnt *
> > 100 / len(changes)))
> >
> > I think / just needs replacing with //.
>
> Good catch.
>
> The patch below should do the trick, and be more explicit about how
> we're formatting things.

Oops, missed a line. Below is more correct (pardon the webmail patch clobbering)

diff --git a/git-p4.py b/git-p4.py
index ca0a874501..da3a5aa684 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2959,8 +2959,8 @@ def streamP4FilesCb(self, marshalled):
             'depotFile' in self.stream_file):
             size = int(self.stream_file["fileSize"])
             if size > 0:
-                progress = 100*self.stream_file['streamContentSize']/size
-                sys.stdout.write('\r%s %d%% (%i MB)' %
(self.stream_file['depotFile'], progress, int(size/1024/1024)))
+                progress = self.stream_file['streamContentSize']/size
+                sys.stdout.write('\r{} {:.0%} ({}
MB)'.format(self.stream_file['depotFile'], progress,
int(size/1024/1024)))
                 sys.stdout.flush()

         self.stream_have_file_info = True
@@ -3435,7 +3435,7 @@ def importChanges(self, changes, origin_revision=0):
             self.updateOptionDict(description)

             if not self.silent:
-                sys.stdout.write("\rImporting revision %s (%s%%)" %
(change, cnt * 100 / len(changes)))
+                sys.stdout.write("\rImporting revision {}
({:.0%})".format(change, cnt / len(changes)))
                 sys.stdout.flush()
             cnt = cnt + 1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-02-04  1:36 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20191213235247.23660-1-yang.zhao@skyboxlabs.com>
2019-12-13 23:52 ` [PATCH v2 00/14] git-p4: python3 compatibility Yang Zhao
2020-01-17 22:00   ` Yang Zhao
2020-01-24 20:14     ` Luke Diamand
2020-01-30 13:35       ` Luke Diamand
2020-02-03 12:54         ` Luke Diamand
2020-02-03 18:11           ` Yang Zhao
2020-02-04  1:35             ` Yang Zhao
2019-12-13 23:52 ` [PATCH v2 01/14] git-p4: make python2.7 the oldest supported version Yang Zhao
2019-12-13 23:52 ` [PATCH v2 02/14] git-p4: change the expansion test from basestring to list Yang Zhao
2019-12-13 23:52 ` [PATCH v2 03/14] git-p4: remove string type aliasing Yang Zhao
2019-12-13 23:52 ` [PATCH v2 04/14] git-p4: encode/decode communication with p4 for python3 Yang Zhao
2019-12-17 22:51   ` Junio C Hamano
2019-12-13 23:52 ` [PATCH v2 05/14] git-p4: encode/decode communication with git " Yang Zhao
2019-12-13 23:52 ` [PATCH v2 06/14] git-p4: convert path to unicode before processing them Yang Zhao
2019-12-13 23:52 ` [PATCH v2 07/14] git-p4: open .gitp4-usercache.txt in text mode Yang Zhao
2019-12-13 23:52 ` [PATCH v2 08/14] git-p4: use marshal format version 2 when sending to p4 Yang Zhao
2019-12-13 23:52 ` [PATCH v2 09/14] git-p4: fix freezing while waiting for fast-import progress Yang Zhao
2019-12-13 23:52 ` [PATCH v2 10/14] git-p4: use functools.reduce instead of reduce Yang Zhao
2019-12-13 23:52 ` [PATCH v2 11/14] git-p4: use dict.items() iteration for python3 compatibility Yang Zhao
2019-12-13 23:52 ` [PATCH v2 12/14] git-p4: simplify regex pattern generation for parsing diff-tree Yang Zhao
2019-12-13 23:52 ` [PATCH v2 13/14] git-p4: use python3's input() everywhere Yang Zhao
2019-12-13 23:52 ` [RFC PATCH v2 14/14] ci: also run linux-gcc pipeline with python3.5 environment Yang Zhao

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).