From: Tzadik Vanderhoof <tzadik.vanderhoof@gmail.com>
To: git@vger.kernel.org, tboegi@web.de, tzadik.vanderhoof@gmail.com
Subject: [PATCH] add git-p4.fallbackEncoding config variable, to prevent git-p4 from crashing on non UTF-8 changeset descriptions
Date: Wed, 21 Apr 2021 01:46:04 -0700
Message-ID: <20210421084604.3095-1-tzadik.vanderhoof@gmail.com>
In-Reply-To: <20210412040614.gqiot5qcsfpiae3a@tb-raspi4>
---
Documentation/git-p4.txt | 10 ++++
git-p4.py | 11 +++-
t/t9835-git-p4-config-fallback-encoding.sh | 65 ++++++++++++++++++++++
3 files changed, 85 insertions(+), 1 deletion(-)
create mode 100755 t/t9835-git-p4-config-fallback-encoding.sh
diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index f89e68b..e0131a9 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -638,6 +638,16 @@ git-p4.pathEncoding::
to transcode the paths to UTF-8. As an example, Perforce on Windows
often uses "cp1252" to encode path names.
+git-p4.fallbackEncoding::
+ Perforce changeset descriptions can be stored in a mixture of
+ encodings. Git-p4 first tries to decode each description as UTF-8.
+ If that fails, the encoding named by this variable, for example
+ "cp1252", is tried instead. If you specify "replace" rather than
+ an encoding, the description is decoded as UTF-8 with each invalid
+ byte sequence replaced by the Unicode replacement character
+ (U+FFFD). If you specify "none" (the default), there is no
+ fallback, and any invalid UTF-8 sequence makes git-p4 fail.
+
git-p4.largeFileSystem::
Specify the system that is used for large (binary) files. Please note
that large file systems do not support the 'git p4 submit' command.
diff --git a/git-p4.py b/git-p4.py
index 09c9e93..173f78a 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -771,7 +771,16 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
for key, value in entry.items():
key = key.decode()
if isinstance(value, bytes) and not (key in ('data', 'path', 'clientFile') or key.startswith('depotFile')):
- value = value.decode()
+ try:
+ value = value.decode()
+ except UnicodeDecodeError as ex:
+ fallbackEncoding = gitConfig("git-p4.fallbackEncoding").lower() or 'none'
+ if fallbackEncoding == 'none':
+ raise Exception("UTF8 decoding failed. Consider using git config git-p4.fallbackEncoding") from ex
+ elif fallbackEncoding == 'replace':
+ value = value.decode(errors='replace')
+ else:
+ value = value.decode(encoding=fallbackEncoding)
decoded_entry[key] = value
# Parse out data if it's an error response
if decoded_entry.get('code') == 'error' and 'data' in decoded_entry:
diff --git a/t/t9835-git-p4-config-fallback-encoding.sh b/t/t9835-git-p4-config-fallback-encoding.sh
new file mode 100755
index 0000000..56a245e
--- /dev/null
+++ b/t/t9835-git-p4-config-fallback-encoding.sh
@@ -0,0 +1,65 @@
+#!/bin/sh
+
+test_description='test git-p4.fallbackEncoding config'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./lib-git-p4.sh
+
+if test_have_prereq !MINGW,!CYGWIN; then
+ skip_all='This system is not subject to encoding failures in "git p4 clone"'
+ test_done
+fi
+
+test_expect_success 'start p4d' '
+ start_p4d
+'
+
+test_expect_success 'add cp1252 description' '
+ cd "$cli" &&
+ echo file1 >file1 &&
+ p4 add file1 &&
+ p4 submit -d documentación
+'
+
+test_expect_success 'clone fails with git-p4.fallbackEncoding unset' '
+ test_might_fail git config --global --unset git-p4.fallbackEncoding &&
+ test_when_finished cleanup_git &&
+ (
+ test_must_fail git p4 clone --dest="$git" //depot@all 2>actual &&
+ grep "UTF8 decoding failed. Consider using git config git-p4.fallbackEncoding" actual
+ )
+'
+test_expect_success 'clone fails with git-p4.fallbackEncoding set to "none"' '
+ git config --global git-p4.fallbackEncoding none &&
+ test_when_finished cleanup_git &&
+ (
+ test_must_fail git p4 clone --dest="$git" //depot@all 2>actual &&
+ grep "UTF8 decoding failed. Consider using git config git-p4.fallbackEncoding" actual
+ )
+'
+
+test_expect_success 'clone succeeds with git-p4.fallbackEncoding set to "cp1252"' '
+ git config --global git-p4.fallbackEncoding cp1252 &&
+ test_when_finished cleanup_git &&
+ (
+ git p4 clone --dest="$git" //depot@all &&
+ cd "$git" &&
+ git log --oneline >log &&
+ desc=$(head -1 log | awk '\''{print $2}'\'') && [ "$desc" = "documentación" ]
+ )
+'
+
+test_expect_success 'clone succeeds with git-p4.fallbackEncoding set to "replace"' '
+ git config --global git-p4.fallbackEncoding replace &&
+ test_when_finished cleanup_git &&
+ (
+ git p4 clone --dest="$git" //depot@all &&
+ cd "$git" &&
+ git log --oneline >log &&
+ desc=$(head -1 log | awk '\''{print $2}'\'') && [ "$desc" = "documentaci�n" ]
+ )
+'
+
+test_done
--
2.31.1.windows.1
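
The decoding policy that the git-p4.py hunk adds can be tried in isolation. The following is a minimal sketch, not code from the patch: decode_with_fallback is a hypothetical helper name, and it takes the policy as an argument, whereas the patch inlines the logic in p4CmdList and reads the policy with gitConfig("git-p4.fallbackEncoding"). The sample bytes assume the description "documentación" was submitted as cp1252, as in the test script.

```python
def decode_with_fallback(value: bytes, fallback_encoding: str = "none") -> str:
    """Decode bytes as UTF-8, applying a fallback policy if that fails.

    Hypothetical helper for illustration; the patch inlines this logic.
    """
    try:
        return value.decode()  # bytes.decode() defaults to UTF-8
    except UnicodeDecodeError as ex:
        # An empty or unset value behaves like "none", as in the patch.
        policy = (fallback_encoding or "none").lower()
        if policy == "none":
            raise Exception(
                "UTF8 decoding failed. "
                "Consider using git config git-p4.fallbackEncoding"
            ) from ex
        if policy == "replace":
            # Keep UTF-8, but substitute U+FFFD for each invalid sequence.
            return value.decode(errors="replace")
        # Any other value is treated as the name of a Python codec.
        return value.decode(encoding=policy)


# 0xF3 is "ó" in cp1252, but here it is not part of a valid UTF-8 sequence.
raw = "documentación".encode("cp1252")

print(decode_with_fallback(raw, "cp1252"))
print(decode_with_fallback(raw, "replace"))
```

With "cp1252" the original text is recovered; with "replace" the single invalid byte becomes U+FFFD, which is the string the last test in t9835 compares against. A fallback value that is not a valid codec name would raise LookupError in this sketch.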