git@vger.kernel.org mailing list mirror (one of many)
From: John Keeping <john@keeping.me.uk>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org, "Eric S. Raymond" <esr@thyrsus.com>,
	Felipe Contreras <felipe.contreras@gmail.com>,
	Sverre Rabbelier <srabbelier@gmail.com>
Subject: [RFC/PATCH 2/8 v2] git_remote_helpers: fix input when running under Python 3
Date: Tue, 15 Jan 2013 19:48:09 +0000
Message-ID: <20130115194809.GU4574@serenity.lan>
In-Reply-To: <20130114094721.GQ4574@serenity.lan>

Although 2to3 will fix most issues in Python 2 code to make it run under
Python 3, it does not handle the new strict separation between byte
strings and unicode strings.  There is one instance in
git_remote_helpers where we are caught by this, which is when reading
refs from "git for-each-ref".

While we could fix this by explicitly handling refs as byte strings,
this merely punts the problem to users of the library, since the
same problem will be encountered as soon as you want to display the ref
name to a user.

Instead of doing this, explicitly decode the incoming byte string into a
unicode string.  Following the lead of pygit2 (the Python bindings for
libgit2 - see [1] and [2]), use the filesystem encoding by default,
providing a way for callers to override this if necessary.
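
The decoding strategy described above can be sketched roughly as follows.
This is only an illustration, not the patch itself: the helper names
resolve_ref_encoding and decode_ref_lines and the sample input are
hypothetical, and the fallback to the default encoding assumes that
sys.getfilesystemencoding() may return None on some Python 2 platforms.

```python
import sys


def resolve_ref_encoding(ref_encoding=None):
    """Pick the encoding used for ref names (hypothetical helper)."""
    if ref_encoding is not None:
        # The caller explicitly overrides the default.
        return ref_encoding
    # Assumption: fall back to the default encoding if the filesystem
    # encoding is unavailable (None).
    return sys.getfilesystemencoding() or sys.getdefaultencoding()


def decode_ref_lines(raw, ref_encoding=None):
    """Decode raw command output (bytes) and split it into text lines."""
    text = raw.decode(resolve_ref_encoding(ref_encoding))
    return text.strip().split('\n')


# Made-up sample input standing in for the output of "git for-each-ref".
raw = b"1111111 commit\trefs/heads/master\n2222222 commit\trefs/heads/topic\n"
lines = decode_ref_lines(raw, ref_encoding="utf-8")
```

Passing ref_encoding explicitly mirrors the override hook; leaving it as
None defers to whatever the running interpreter reports.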

[1] https://github.com/libgit2/pygit2/blob/e34911b63e5d2266f9f72a4e3f32e27b13190feb/src/pygit2/reference.c#L261
[2] https://github.com/libgit2/pygit2/blob/e34911b63e5d2266f9f72a4e3f32e27b13190feb/include/pygit2/utils.h#L55

Signed-off-by: John Keeping <john@keeping.me.uk>
---

I think this is in fact the best way to handle this, and I hope the
above description clarified why I don't think we want to treat refs as
byte strings in Python 3.

My only remaining question is whether it would be better to set the
error mode when decoding to "replace" instead of "strict" (the default).
"strict" will cause a UnicodeError if the string cannot be decoded
whereas "replace" will use U+FFFD (the replacement character). [3]

I think it's better to use "strict" and let the user know that
something has gone wrong rather than silently change the string, but I'd
welcome other opinions.
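
To make the trade-off concrete, here is a small sketch of the two error
modes.  The byte string is a made-up ref name containing a Latin-1 encoded
character that is not valid UTF-8; it is not taken from any real repository.

```python
# Hypothetical ref name: the trailing 0xE9 byte is Latin-1 for an accented
# "e", which is not a valid UTF-8 sequence on its own.
raw = b"refs/heads/caf\xe9"

# "strict" is the default error mode: decoding raises UnicodeDecodeError
# (a subclass of UnicodeError), so the caller learns something is wrong.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" never raises; the undecodable byte silently becomes U+FFFD.
replaced = raw.decode("utf-8", "replace")
```

With "replace" the resulting name no longer round-trips to the original
bytes, which is the silent change the paragraph above argues against.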

[3] http://docs.python.org/2/library/codecs.html#codec-base-classes

 git_remote_helpers/git/importer.py | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
index e28cc8f..5bc16a4 100644
--- a/git_remote_helpers/git/importer.py
+++ b/git_remote_helpers/git/importer.py
@@ -1,5 +1,6 @@
 import os
 import subprocess
+import sys
 
 from git_remote_helpers.util import check_call, check_output
 
@@ -10,17 +11,26 @@ class GitImporter(object):
     This importer simply delegates to git fast-import.
     """
 
-    def __init__(self, repo):
+    def __init__(self, repo, ref_encoding=None):
         """Creates a new importer for the specified repo.
+
+        If ref_encoding is specified then refs are decoded using that
+        encoding.  Otherwise the system filesystem encoding is used.
         """
 
         self.repo = repo
+        self.ref_encoding = ref_encoding
 
     def get_refs(self, gitdir):
         """Returns a dictionary with refs.
         """
         args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
-        lines = check_output(args).strip().split('\n')
+        encoding = self.ref_encoding
+        if encoding is None:
+            encoding = sys.getfilesystemencoding()
+            if encoding is None:
+                encoding = sys.getdefaultencoding()
+        lines = check_output(args).decode(encoding).strip().split('\n')
         refs = {}
         for line in lines:
             value, name = line.split(' ')
-- 
1.8.1
