git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Tzadik Vanderhoof <tzadik.vanderhoof@gmail.com>
Cc: Eric Sunshine <sunshine@sunshineco.com>,
	Git List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Luke Diamand <luke@diamand.org>, Pete Wyckoff <pw@padd.com>
Subject: Re: [PATCH] add git-p4.fallbackEncoding config variable, to prevent git-p4 from crashing on non UTF-8 changeset descriptions
Date: Sat, 24 Apr 2021 10:14:47 +0200	[thread overview]
Message-ID: <20210424081447.uxabqbxc54k6yxrg@tb-raspi4> (raw)
In-Reply-To: <CAKu1iLXPi4zc-5-RtZo3UBwTQ1GqshXjLEZKT=WvtvB0aiuUJA@mail.gmail.com>

(Adding some of the p4 and Windows experts in cc)

On Fri, Apr 23, 2021 at 12:08:17PM -0700, Tzadik Vanderhoof wrote:
> To clarify....

Good. This is good information, and the important stuff could go
into the commit message. And because the commit as such should be
self-contained (as much as possible).
Giving an overview about the problem.

>
> The new config variable I am introducing addresses an issue that only
> occurs on Windows.  This is because the behavior of the "p4" command
> differs on Windows vs Linux around Unicode in changeset descriptions.

What does Windows mean in this context ?
Is p4 a "console application" ?
In this case it may be possible to use CHCP to change to a different code page ?

>
> I don't have the source code for "p4", but I'm guessing it's written
> in C, and that this difference in behavior is simply a result of the
> fact that there is no defined standard of how "char *argv[]" in "main"
> should deal with non-ASCII characters being passed in from the command
> line.
>
> As a result, "git p4 clone" on Linux is not affected by this "p4"
> behavior.

Is it ?
What happens if yoy have a p4 depot that was feed from Windows in CP-1252 and is now
accessed from a Linux  machine ?
Doesthe Linux box face the same problems ?

> Since my tests assume the Windows behavior, they fail when
> run on Linux.  For this reason, I added code to my tests to skip them
> on Linux.

That makes sense, but what is the "Windows behavior" more in detail ?
My understanding is that when you press e.g. the key 'Ä' on the keybaord,
it will give a different byte sequence once that 'Ä' is transferred
across the wire (to the p4 server).
This is dependent on what Linux calls a locale, and all major Linux installations
use UTF-8 these days by default.
But that was not always the case, since in old days they used ISO-8851-1 or something
else, usable for your contry/region.

So most Windows "console applications" are not run under UTF-8, but it
may be possible that "CHCP 65000" (or so) works - more testing needed.
>
> On a related note, I don't think there are any CI environments on
> github for git that are (a) on Windows, and (b) have Python and (c)
> have Perforce, so I don't think my tests are actually running on
> github CI.  I'm not sure how that can be addressed.

That are 3 different questions -
(a) Yes, git is compiled under Windows, both gcc and MSVC (correct me if that is wrong)
(b) Yes, we have python on the different CI. Github actions has python, I use it every day.
(c) There are tests run for p4, but it seems if they are only run under Linux.

It would be nice if your test can pass under Linux, why are they failing ?

If I dig here:
<https://github.com/git/git/runs/2420889332?check_suite_focus=true>

We can see that the t98 test are run, and are passing. Just to pick one:
[15:28:22] t9804-git-p4-label.sh .............................. ok     3969

Thanks for working on this.
It would be good to have a v5 version of the patch with some more informations,
like above, and may be :how is the p4 server configured ?
(Unicode or not ?)


  reply	other threads:[~2021-04-24  8:15 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-08 19:28 git-p4 crashes on non UTF-8 output from p4 Tzadik Vanderhoof
2021-04-09 15:38 ` Torsten Bögershausen
2021-04-11  7:16   ` Tzadik Vanderhoof
2021-04-11  9:37     ` Torsten Bögershausen
2021-04-11 20:21       ` Tzadik Vanderhoof
2021-04-12  4:06         ` Torsten Bögershausen
2021-04-21  8:46           ` [PATCH] add git-p4.fallbackEncoding config variable, to prevent git-p4 from crashing on non UTF-8 changeset descriptions Tzadik Vanderhoof
2021-04-21  8:55             ` Tzadik Vanderhoof
2021-04-22  5:05               ` [PATCH v3] add git-p4.fallbackEncoding config setting, " Tzadik Vanderhoof
2021-04-22 15:50                 ` Torsten Bögershausen
2021-04-22 16:17                   ` Eric Sunshine
2021-04-22 22:33                     ` Eric Sunshine
2021-04-23  6:36                       ` [PATCH] add git-p4.fallbackEncoding config variable, " Tzadik Vanderhoof
2021-04-23  6:44                         ` Tzadik Vanderhoof
2021-04-23 19:08                           ` Tzadik Vanderhoof
2021-04-24  8:14                             ` Torsten Bögershausen [this message]
2021-04-27  5:39                               ` [PATCH v5] " Tzadik Vanderhoof
2021-04-27  5:45                                 ` Tzadik Vanderhoof
2021-04-28  4:39                                 ` Junio C Hamano
2021-04-28 14:58                                   ` Torsten Bögershausen
2021-04-29  7:39                                     ` [PATCH v6] Add git-p4.fallbackEncoding Tzadik Vanderhoof
2021-04-29  8:36                                       ` Luke Diamand
2021-04-29 17:29                                         ` Tzadik Vanderhoof
     [not found]                                     ` <20210429074458.891-1-tzadik.vanderhoof@gmail.com>
     [not found]                                       ` <c4c48615-d1f4-fd37-0960-979535907f15@web.de>
2021-04-29 17:14                                         ` Tzadik Vanderhoof

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210424081447.uxabqbxc54k6yxrg@tb-raspi4 \
    --to=tboegi@web.de \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=luke@diamand.org \
    --cc=pw@padd.com \
    --cc=sunshine@sunshineco.com \
    --cc=tzadik.vanderhoof@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).