From: "Ben Keene via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Ben Keene <seraphire@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Ben Keene <seraphire@gmail.com>
Subject: [PATCH v5 08/15] git-p4: add casting helper functions for python 3 conversion
Date: Sat, 07 Dec 2019 17:47:36 +0000
Message-ID: <1e677781d2cc75371b5362c7e63ea5ddf824d5da.1575740863.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.463.v5.git.1575740863.gitgitgadget@gmail.com>

From: Ben Keene <seraphire@gmail.com>

Python 3 handles strings differently than Python 2.7.  Since Python 2
is reaching its end of life, a series of changes is being submitted to
enable support for Python 3.5 and later. The current code fails basic
tests under Python 3.5.

Change the existing unicode test and add new support functions for
combined Python 2 and Python 3 support.

Define the following variables:
- isunicode - a boolean variable that states if the version of python
              natively supports unicode (true) or not (false). This is
              true for Python 3 and false for Python 2.
- unicode   - a type alias for the datatype that holds a unicode string.
              It is assigned str under Python 3 and the native unicode
              type under Python 2.
- bytes     - a type alias for an array of bytes.  It is assigned the
              native bytes type for Python 3 and str for Python 2.

Add the following new functions:

- as_string(text)  - A new function that will convert a byte array to a
                     unicode (UTF-8) string under Python 3.  Under
                     Python 2, this returns the string unchanged.
- as_bytes(text)   - A new function that will convert a unicode string
                     to a byte array under Python 3.  Under Python 2,
                     this returns the string unchanged.
- to_unicode(text) - Decodes a text string as UTF-8 into a unicode
                     string on both Python 2 and Python 3.

Add a new function alias raw_input:
If raw_input does not exist (it was renamed to input in Python 3) alias
input as raw_input.
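
As a rough illustration of the alias (not part of the patch), the
sketch below shows how calling code can keep using the Python 2 name
on either interpreter. The ask() helper and its reader parameter are
hypothetical and exist only for this example.

```python
# Alias raw_input to input when the Python 2 name is missing.
try:
    raw_input               # defined on Python 2
except NameError:
    raw_input = input       # Python 3 renamed raw_input() to input()


def ask(prompt, reader=None):
    """Hypothetical helper: read one line and strip surrounding whitespace.

    'reader' defaults to raw_input; passing a stand-in callable makes the
    helper usable without a terminal.
    """
    reader = reader or raw_input
    return reader(prompt).strip()
```

With the alias in place, existing calls such as raw_input(prompt) need
no further edits for Python 3.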

The as_string() and as_bytes() functions allow for modifying the code
with a minimal amount of impact on Python 2 support. When a string is
expected, as_string() will be used to "cast" the incoming "bytes"
to a string type.

Conversely, as_bytes() will be used to cast a "string" to a "byte array"
type. Since Python 2 overloads the datatype 'str' to serve both purposes,
the Python 2 versions of these functions do not change the data. This
reduces the regression impact of these code changes.
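
The casting pattern described above can be sketched as follows. This
is a condensed, stand-alone approximation rather than the patch's own
code: it selects the implementation with sys.version_info instead of
probing for the 'unicode' name, and it uses bytes.decode()/str.encode()
in place of the constructor calls. The sample depot path is made up.

```python
import sys

if sys.version_info[0] >= 3:
    def as_string(text):
        """Decode UTF-8 bytes to str; pass str and None through."""
        if isinstance(text, bytes):
            return text.decode("utf-8")
        return text

    def as_bytes(text):
        """Encode str to UTF-8 bytes; pass bytes and None through."""
        if text is None or isinstance(text, bytes):
            return text
        return text.encode("utf-8")
else:
    # Python 2: str already serves as both text and bytes, so no-ops.
    def as_string(text):
        return text

    def as_bytes(text):
        return text

# Hypothetical data as it might arrive from a pipe: always bytes on Python 3.
raw = b"//depot/caf\xc3\xa9/main"
path = as_string(raw)      # text type for internal string handling
wire = as_bytes(path)      # back to bytes before writing to a pipe
```

Because both helpers pass through values that are already of the
target type, they can be applied at call sites without first checking
what the caller handed in.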

'basestring' is removed since its only references were in tests
that were modified in previous commits.

Signed-off-by: Ben Keene <seraphire@gmail.com>
---
 git-p4.py | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index e020958083..e6f7513384 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -32,16 +32,84 @@
     unicode = unicode
 except NameError:
     # 'unicode' is undefined, must be Python 3
-    str = str
+    #
+    # For Python 3 which is natively unicode, we will use
+    # unicode for internal information but all P4 Data
+    # will remain in bytes
+    isunicode = True
     unicode = str
     bytes = bytes
-    basestring = (str,bytes)
+
+    def as_string(text):
+        """ Return a byte array as a unicode string
+        """
+        if text is None:
+            return None
+        if isinstance(text, bytes):
+            return unicode(text, "utf-8")
+        else:
+            return text
+
+    def as_bytes(text):
+        """ Return a Unicode string as a byte array
+        """
+        if text is None:
+            return None
+        if isinstance(text, bytes):
+            return text
+        else:
+            return bytes(text, "utf-8")
+
+    def to_unicode(text):
+        """ Return a byte array as a unicode string
+        """
+        return as_string(text)
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded string
+        """
+        if isinstance(path, unicode):
+            return path
+        return encodeWithUTF8(path).decode('utf-8')
+
 else:
     # 'unicode' exists, must be Python 2
-    str = str
+    #
+    # We will treat the data as:
+    #   str   -> str
+    #   bytes -> str
+    # So for Python 2 these functions are no-ops
+    # and will leave the data in the ambiguous
+    # string/bytes state
+    isunicode = False
     unicode = unicode
     bytes = str
-    basestring = basestring
+
+    def as_string(text):
+        """ Return text unaltered (for Python 3 support)
+        """
+        return text
+
+    def as_bytes(text):
+        """ Return text unaltered (for Python 3 support)
+        """
+        return text
+
+    def to_unicode(text):
+        """ Return a string as a unicode string
+        """
+        return text.decode('utf-8')
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded bytes
+        """
+        return encodeWithUTF8(path)
+
+# Check for raw_input support
+try:
+    raw_input
+except NameError:
+    raw_input = input
 
 try:
     from subprocess import CalledProcessError
@@ -740,7 +808,7 @@ def p4Where(depotPath):
             if data[:space] == depotPath:
                 output = entry
                 break
-    if output == None:
+    if output is None:
         return ""
     if output["code"] == "error":
         return ""
@@ -4175,7 +4243,7 @@ def main():
     global verbose
     verbose = cmd.verbose
     if cmd.needsGit:
-        if cmd.gitdir == None:
+        if cmd.gitdir is None:
             cmd.gitdir = os.path.abspath(".git")
             if not isValidGitDir(cmd.gitdir):
                 # "rev-parse --git-dir" without arguments will try $PWD/.git
-- 
gitgitgadget

