From: "Ben Keene via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Ben Keene <seraphire@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Ben Keene <seraphire@gmail.com>
Subject: [PATCH v4 03/11] git-p4: add new helper functions for python3 conversion
Date: Wed, 04 Dec 2019 22:29:29 +0000
Message-ID: <f0e658b984ca009c575368e661016f785922f970.1575498577.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.463.v4.git.1575498577.gitgitgadget@gmail.com>
From: Ben Keene <seraphire@gmail.com>
Python 3 handles strings differently than Python 2.7. Since Python 2 is reaching its end of life, a series of changes is being submitted to enable Python 3.7+ support. The current code fails basic tests under Python 3.7.
Change the existing unicode test and add new support functions for Python 2/Python 3 compatibility.
Define the following variables:
- isunicode - a boolean variable that indicates whether the running version of Python natively supports unicode strings. It is True for Python 3 and False for Python 2.
- unicode - a type alias for the datatype that holds a unicode string. It is assigned str under Python 3 and the unicode type under Python 2.
- bytes - a type alias for an array of bytes. It is assigned the native bytes type under Python 3 and str under Python 2.
Add the following new functions:
- as_string(text) - A new function that decodes a UTF-8 byte array to a unicode string under Python 3. Under Python 2, this returns the string unchanged.
- as_bytes(text) - A new function that encodes a unicode string to a UTF-8 byte array under Python 3. Under Python 2, this returns the string unchanged.
- to_unicode(text) - Converts a text string to a unicode string (decoding it as UTF-8) on both Python 2 and Python 3.
Add a new function alias raw_input:
If raw_input does not exist (it was renamed to input in Python 3), alias input as raw_input.
The as_string() and as_bytes() functions allow the code to be modified with minimal impact on Python 2 support. Where a string is expected, as_string() is used to "cast" the incoming bytes to a string type. Conversely, as_bytes() is used to convert a string to a byte array type. Since Python 2 overloads the datatype 'str' to serve both purposes, the Python 2 versions of these functions do not change the data: str functions as both a byte array and a string.
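As a rough illustration of the intended call-site pattern (a sketch only, not code from this patch): the helpers below are simplified stand-ins for as_string() and as_bytes(), and the depot path and variable names are made up for the example.

```python
import sys

# Simplified stand-ins for the helpers described above (illustrative only).
if sys.version_info[0] >= 3:
    def as_string(text):
        """Decode UTF-8 bytes to str; pass None and str through."""
        if text is None:
            return None
        return text.decode("utf-8") if isinstance(text, bytes) else text

    def as_bytes(text):
        """Encode str to UTF-8 bytes; pass None and bytes through."""
        if text is None:
            return None
        return text if isinstance(text, bytes) else text.encode("utf-8")
else:
    # Python 2: str already serves as both text and bytes, so both are no-ops.
    def as_string(text):
        return text

    def as_bytes(text):
        return text

# Hypothetical data as it might arrive from a pipe: bytes under Python 3.
raw = b"//depot/main/caf\xc3\xa9.txt"

path = as_string(raw)                # convert once, at the point of use
depot_dir = path.rsplit("/", 1)[0]   # ordinary string handling afterwards
wire = as_bytes(path)                # convert back before writing to a pipe
```

The conversions sit at the boundary where data enters or leaves the script, so the string handling in between stays the same on both Python versions.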
basestring is removed since its only references are found in tests that were changed in the previous change list.
Signed-off-by: Ben Keene <seraphire@gmail.com>
(cherry picked from commit 7921aeb3136b07643c1a503c2d9d8b5ada620356)
---
git-p4.py | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 66 insertions(+), 4 deletions(-)
diff --git a/git-p4.py b/git-p4.py
index 0f27996393..93dfd0920a 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -32,16 +32,78 @@
     unicode = unicode
 except NameError:
     # 'unicode' is undefined, must be Python 3
-    str = str
+    #
+    # For Python3 which is natively unicode, we will use
+    # unicode for internal information but all P4 Data
+    # will remain in bytes
+    isunicode = True
     unicode = str
     bytes = bytes
-    basestring = (str,bytes)
+
+    def as_string(text):
+        """Return a byte array as a unicode string"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return unicode(text, "utf-8")
+        else:
+            return text
+
+    def as_bytes(text):
+        """Return a Unicode string as a byte array"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return text
+        else:
+            return bytes(text, "utf-8")
+
+    def to_unicode(text):
+        """Return a byte array as a unicode string"""
+        return as_string(text)
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded string """
+        if isinstance(path, unicode):
+            return path
+        return encodeWithUTF8(path).decode('utf-8')
+
 else:
     # 'unicode' exists, must be Python 2
-    str = str
+    #
+    # We will treat the data as:
+    #   str   -> str
+    #   bytes -> str
+    # So for Python2 these functions are no-ops
+    # and will leave the data in the ambiguous
+    # string/bytes state
+    isunicode = False
     unicode = unicode
     bytes = str
-    basestring = basestring
+
+    def as_string(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def as_bytes(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def to_unicode(text):
+        """Return a string as a unicode string"""
+        return text.decode('utf-8')
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded bytes """
+        return encodeWithUTF8(path)
+
+
+
+# Check for raw_input support
+try:
+    raw_input
+except NameError:
+    raw_input = input
 try:
     from subprocess import CalledProcessError
--
gitgitgadget