git@vger.kernel.org mailing list mirror (one of many)
From: Andrew Oakley <andrew@adoakley.name>
To: Tzadik Vanderhoof <tzadik.vanderhoof@gmail.com>
Cc: Git List <git@vger.kernel.org>, Luke Diamand <luke@diamand.org>,
	Feiyang Xue <me@feiyangxue.com>
Subject: Re: [PATCH 2/2] git-p4: do not decode data from perforce by default
Date: Fri, 30 Apr 2021 09:53:42 +0100
Message-ID: <20210430095342.58134e4e@ado-tr>
In-Reply-To: <CAKu1iLXRrsB4mRsDfhBH5aahWzDjpfqLuWP9t47RMB=RdpL1iA@mail.gmail.com>

On Thu, 29 Apr 2021 03:00:06 -0700
Tzadik Vanderhoof <tzadik.vanderhoof@gmail.com> wrote:
> However, on Windows, UTF-8 strings passed to "p4 submit -d" are
> somehow converted to the default Windows code page by the time they
> are stored in the Perforce database, probably as part of the process
> of passing the command line arguments to the Windows p4 executable.
> However, the "code page" data is *not* converted to UTF-8 on the way
> back from p4 to git-p4.py.  The only way to get it into UTF-8 is to
> call string.decode().  As a result, this patch, which takes out the
> call to string.decode() will not work on Windows.

Thanks for that explanation, the reencoding of the data on Windows is
not something I was expecting.  Given the behaviour you've described, I
suspect that there might be two different problems that we are trying
to solve.

The perforce depot I'm working with has a mixture of encodings, and
commits are created from a variety of different environments. The
majority of commits are ASCII or UTF-8, but a small number are in
some other encoding.  Any attempt to reencode the data is likely
to make the problem worse in at least some cases.
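
As a small illustration of that risk, here is a sketch using a made-up
description string, with cp1252 assumed as the wrongly guessed encoding
(neither comes from a real depot).  It shows that re-encoding data that
is already UTF-8 corrupts it, while data that really is cp1252 cannot
be decoded as UTF-8 at all:

```python
# Hypothetical commit description that is already stored as UTF-8 bytes.
utf8_bytes = "café".encode("utf-8")        # b'caf\xc3\xa9'

# Wrongly treating it as cp1252 and re-encoding to UTF-8 changes the bytes:
# each byte of the two-byte sequence becomes its own character.
mangled = utf8_bytes.decode("cp1252").encode("utf-8")

# The same description stored by a cp1252 client is not valid UTF-8.
cp1252_bytes = "café".encode("cp1252")     # b'caf\xe9'
try:
    cp1252_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

With both kinds of data in one depot, no single assumed encoding is
right for every commit.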

I suspect that other perforce depots are used primarily from Windows
machines, and have data that is encoded in a mostly consistent way but
the encoding is not UTF-8.  Re-encoding the data for git makes sense in
that case.  Is this the kind of repository you have?

If there are these two different cases then we probably need to come up
with a patch that solves both issues.
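
One possible shape for handling both cases, purely as a sketch and not
the code in either patch: try UTF-8 first, then an optional
user-configured encoding, and otherwise leave the bytes alone.  The
names decode_p4_text and fallback_encoding are hypothetical and do not
exist in git-p4.

```python
def decode_p4_text(raw, fallback_encoding=None):
    """Return (text, encoding_used), or (raw, None) if left undecoded.

    Hypothetical helper: 'fallback_encoding' stands in for whatever
    setting a mostly-Windows depot might configure, e.g. "cp1252".
    """
    # Valid UTF-8 (which includes plain ASCII) is decoded as such.
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass

    # A depot with one consistent legacy encoding can opt in to it.
    if fallback_encoding is not None:
        try:
            return raw.decode(fallback_encoding), fallback_encoding
        except UnicodeDecodeError:
            pass

    # Mixed or unknown data: pass the original bytes through untouched.
    return raw, None
```

For example, decode_p4_text(b"caf\xe9") hands back the raw bytes
unchanged, while decode_p4_text(b"caf\xe9", "cp1252") decodes them to
text.  Note that bytes which happen to be valid UTF-8 are never sent to
the fallback, which may or may not be what a Windows-only depot wants.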

For my case, where we've got a repository containing all sorts of junk,
it sounds like it might be awkward to create a test case that works on
Windows.
