From: "Ben Keene via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Ben Keene <seraphire@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Ben Keene <seraphire@gmail.com>
Subject: [PATCH v5 15/15] git-p4: Add depot manipulation functions
Date: Sat, 07 Dec 2019 17:47:43 +0000
Message-ID: <445dbc59f0cb82fabccc380c0346d65b778d8d1e.1575740863.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.463.v5.git.1575740863.gitgitgadget@gmail.com>

From: Ben Keene <seraphire@gmail.com>

Since depot paths and filenames are encoded according to P4, we need
to track them as bytes, but we also have to decode them with different
encodings (either ASCII or the encoding configured in pathEncoding,
which defaults to UTF-8).

Add the following functions to support future code conversion actions.

 * depot_count_depth         - counts the number of directories in the
       path
 * depot_remove_leading_path - removes (n) directories from the front
       of the depot path.
 * depot_remove_p4_wildcard  - removes "/..." from the end of the path
 * depot_encode_utf8         - converts the path from the native
       encoding to utf8 encoding.  Returns (depot_path, did_decode)
 * depot_encode_restore      - restores the original encoding of the
       path.
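
For reviewers, the intended behavior of the three path helpers can be
sketched standalone. The functions below are simplified stand-ins, not
the code in this patch: they assume ASCII-only byte paths and therefore
skip the encode/restore step, and their names are illustrative only.

```python
# Simplified stand-ins for the path helpers described above.
# Assumption: paths are ASCII-only bytes, so no re-encoding is needed.

def count_depth(depot_path):
    """Number of directory components after the leading "//"."""
    if not depot_path.endswith(b"/"):
        depot_path += b"/"
    return depot_path.count(b"/") - 2

def remove_leading_path(depot_path, depth):
    """Strip the leading "//", then drop `depth` leading components."""
    if depot_path.startswith(b"//"):
        depot_path = depot_path[2:]
    return b"/".join(depot_path.split(b"/")[depth:])

def remove_p4_wildcard(depot_path):
    """Drop a trailing "/..." if present."""
    if depot_path.endswith(b"/..."):
        return depot_path[:-4]
    return depot_path

print(count_depth(b"//depot/dir"))                       # 2
print(remove_leading_path(b"//depot/main/file.txt", 1))  # b'main/file.txt'
print(remove_p4_wildcard(b"//depot/main/..."))           # b'//depot/main'
```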

Signed-off-by: Ben Keene <seraphire@gmail.com>

---
This code block could use review for the depot_encode_* functions.

Should this code return a Unicode string or a byte array?
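
To make the question concrete, here is a minimal Python 3 sketch of the
bytes-in, bytes-out round trip that the depot_encode_* functions aim
for. It is a stand-in, not the patch code: the path_encoding parameter
is a hypothetical replacement for the gitConfig('git-p4.pathEncoding')
lookup, and cp1252 is just an example encoding.

```python
# Stand-ins for depot_encode_utf8 / depot_encode_restore.
# Assumption: path_encoding is passed explicitly instead of being read
# from the git-p4.pathEncoding configuration.

def encode_utf8(depot_path, path_encoding="utf8"):
    """Return (utf8_bytes, did_decode); ASCII-only input is unchanged."""
    try:
        depot_path.decode("ascii", "strict")
    except UnicodeDecodeError:
        text = depot_path.decode(path_encoding, "replace")
        return text.encode("utf8", "replace"), True
    return depot_path, False

def encode_restore(utf8_path, path_encoding="utf8"):
    """Re-encode UTF-8 bytes back into path_encoding."""
    return utf8_path.decode("utf8", "replace").encode(path_encoding, "replace")

original = "//depot/caf\u00e9/...".encode("cp1252")  # b'//depot/caf\xe9/...'
as_utf8, did_decode = encode_utf8(original, "cp1252")
print(as_utf8, did_decode)                            # b'//depot/caf\xc3\xa9/...' True
print(encode_restore(as_utf8, "cp1252") == original)  # True
```

The round trip is only lossless when the input bytes are valid in the
configured encoding; with the 'replace' error handler, invalid bytes
are substituted and the original path cannot be recovered.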
---
 git-p4.py | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/git-p4.py b/git-p4.py
index 16f29aae41..f82f05632c 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -724,6 +724,99 @@ def encodeWithUTF8(path, verbose=False):
                 print('Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path))
     return path
 
+
+def depot_count_depth(depot_path):
+    """Counts the number of directories found
+    in the depot_path. Paths will be decoded
+    with encodeWithUTF8 to ensure that depot
+    encoding is respected.
+
+    Example:
+        //depot         = 1
+        //depot/        = 1
+        //depot/dir     = 2
+    """
+    depot_path=encodeWithUTF8(depot_path)
+    if not depot_path.endswith(b"/"):
+        depot_path+=b"/"
+    return depot_path.count(b"/") - 2
+
+def depot_remove_leading_path(depot_path, depth):
+    """Remove depth number of directories from
+    the beginning of the depot_path. The result
+    is returned in the original encoding.
+    The leading "//" does not count as a directory
+    and will be automatically stripped.
+
+    depot_path should be in bytes
+
+    Example:
+    Given a depot_path of: //depot/main/file.txt
+    depth: 0        - depot/main/file.txt
+    depth: 1        - main/file.txt
+    depth: 2        - file.txt
+    depth: 3        - (empty string)
+    """
+
+    # First, decode the path
+    [depot_path, did_decode] = depot_encode_utf8(depot_path)
+
+    #remove leading //
+    if depot_path.startswith(b"//"):
+        depot_path=depot_path[2:]
+    if depth != 0:
+        segments=depot_path.split(b"/")
+        segments=segments[depth:]
+        depot_path=b"/".join(segments)
+
+    if did_decode:
+        depot_path = depot_encode_restore(depot_path)
+
+    return depot_path
+
+def depot_remove_p4_wildcard(depot_path):
+    """Removes the "/..." from the end of depot
+    path.
+
+    depot_path must be bytes. Bytes are returned.
+    """
+    # First, decode the path
+    [path, did_decode] = depot_encode_utf8(depot_path)
+
+    if not path.endswith(b"/..."):
+        return depot_path
+    path=path[:-4]
+
+    if did_decode:
+        path = depot_encode_restore(path)
+
+    return path
+
+def depot_encode_utf8(depot_path):
+    """Conditionally re-encodes depot_path as utf8,
+    decoding it with the configured pathEncoding.
+
+    Returns [depot_path, did_decode]."""
+    did_decode=False
+    encoding = 'utf8'
+    try:
+        depot_path.decode('ascii', 'strict')
+    except UnicodeDecodeError:
+        if gitConfig('git-p4.pathEncoding'):
+            encoding = gitConfig('git-p4.pathEncoding')
+        depot_path = depot_path.decode(encoding, 'replace').encode('utf8', 'replace')
+        did_decode=True
+    return [depot_path, did_decode]
+
+def depot_encode_restore(encoded_depot_path):
+    """Re-encodes an encoded_depot_path
+    from utf8 back to the configured
+    pathEncoding"""
+    encoding = 'utf8'
+    if gitConfig('git-p4.pathEncoding'):
+        encoding = gitConfig('git-p4.pathEncoding')
+    return encoded_depot_path.decode('utf8', 'replace').encode(encoding, 'replace')
+
 class P4Exception(Exception):
     """ Base class for exceptions from the p4 client """
     def __init__(self, exit_code):
-- 
gitgitgadget
