From: "Ben Keene via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Ben Keene <seraphire@gmail.com>, Junio C Hamano <gitster@pobox.com>, Ben Keene <seraphire@gmail.com>
Subject: [PATCH v4 03/11] git-p4: add new helper functions for python3 conversion
Date: Wed, 04 Dec 2019 22:29:29 +0000
Message-ID: <f0e658b984ca009c575368e661016f785922f970.1575498577.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.463.v4.git.1575498577.gitgitgadget@gmail.com>

From: Ben Keene <seraphire@gmail.com>

Python 3+ handles strings differently than Python 2.7. Since Python 2
is reaching its end of life, a series of changes are being submitted to
enable Python 3.7+ support. The current code fails basic tests under
Python 3.7.

Change the existing unicode test and add new support functions for
Python 2 / Python 3 support.

Define the following variables:
- isunicode - a boolean variable that states whether the version of
  Python natively supports unicode (true) or not (false). This is true
  for Python 3 and false for Python 2.
- unicode - a type alias for the datatype that holds a unicode string.
  It is assigned to str under Python 3 and to the unicode type under
  Python 2.
- bytes - a type alias for an array of bytes. It is assigned the native
  bytes type under Python 3 and str under Python 2.

Add the following new functions:

- as_string(text) - A new function that will convert a byte array to a
  unicode (UTF-8) string under Python 3. Under Python 2, this returns
  the string unchanged.
- as_bytes(text) - A new function that will convert a unicode string to
  a byte array under Python 3. Under Python 2, this returns the string
  unchanged.
- to_unicode(text) - Converts a text string to Unicode (UTF-8) on both
  Python 2 and Python 3.

Add a new function alias raw_input:
If raw_input does not exist (it was renamed to input in Python 3),
alias input as raw_input.

The as_string() and as_bytes() functions allow for modifying the code
with a minimal amount of impact on Python 2 support. When a string is
expected, as_string() will be used to convert ("cast") the incoming
"bytes" to a string type. Conversely, as_bytes() will be used to convert
a "string" to a "byte array" type. Since Python 2 overloads the datatype
'str' to serve both purposes, the Python 2 versions of these functions
do not change the data, since str functions as both a byte array and a
string.

basestring is removed since its only references are found in tests that
were changed in the previous change list.

Signed-off-by: Ben Keene <seraphire@gmail.com>
(cherry picked from commit 7921aeb3136b07643c1a503c2d9d8b5ada620356)
---
 git-p4.py | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 0f27996393..93dfd0920a 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -32,16 +32,78 @@
     unicode = unicode
 except NameError:
     # 'unicode' is undefined, must be Python 3
-    str = str
+    #
+    # For Python3 which is natively unicode, we will use
+    # unicode for internal information but all P4 Data
+    # will remain in bytes
+    isunicode = True
     unicode = str
     bytes = bytes
-    basestring = (str,bytes)
+
+    def as_string(text):
+        """Return a byte array as a unicode string"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return unicode(text, "utf-8")
+        else:
+            return text
+
+    def as_bytes(text):
+        """Return a Unicode string as a byte array"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return text
+        else:
+            return bytes(text, "utf-8")
+
+    def to_unicode(text):
+        """Return a byte array as a unicode string"""
+        return as_string(text)
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded string """
+        if isinstance(path, unicode):
+            return path
+        return encodeWithUTF8(path).decode('utf-8')
+
 else:
     # 'unicode' exists, must be Python 2
-    str = str
+    #
+    # We will treat the data as:
+    # str -> str
+    # bytes -> str
+    # So for Python2 these functions are no-ops
+    # and will leave the data in the ambiguous
+    # string/bytes state
+    isunicode = False
     unicode = unicode
     bytes = str
-    basestring = basestring
+
+    def as_string(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def as_bytes(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def to_unicode(text):
+        """Return a string as a unicode string"""
+        return text.decode('utf-8')
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded bytes """
+        return encodeWithUTF8(path)
+
+
+
+# Check for raw_input support
+try:
+    raw_input
+except NameError:
+    raw_input = input
 
 try:
     from subprocess import CalledProcessError
-- 
gitgitgadget
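
For readers who want to see the intended use of the casting helpers at a
bytes/str boundary, here is a minimal Python 3 only sketch. It is a
simplified illustration of the as_string()/as_bytes() behaviour described
in the commit message, not the code from the patch: it uses "is None" and
the decode()/encode() methods, and the depot path is a made-up value
standing in for raw data read from p4.

```python
# Python 3 only sketch of the as_string()/as_bytes() helpers described above.
# Simplified for illustration; this is not the code from the patch.

def as_string(text):
    """Decode UTF-8 bytes to str; pass str and None through unchanged."""
    if text is None:
        return None
    if isinstance(text, bytes):
        return text.decode("utf-8")
    return text


def as_bytes(text):
    """Encode str to UTF-8 bytes; pass bytes and None through unchanged."""
    if text is None:
        return None
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


# Hypothetical value standing in for raw output read from a p4 pipe.
raw_depot_path = b"//depot/caf\xc3\xa9/main.c"

depot_path = as_string(raw_depot_path)   # bytes -> str for internal use
round_tripped = as_bytes(depot_path)     # str -> bytes before writing back

print(type(depot_path).__name__)
print(round_tripped == raw_depot_path)
```

Both helpers are idempotent: calling as_string() on a str or as_bytes() on
bytes returns the argument unchanged, which is what allows call sites to
wrap values without first checking their type.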