git@vger.kernel.org mailing list mirror (one of many)
* gettext, multiple Preferred languages, and English
@ 2019-04-21 11:08 Andrew Janke
  2019-04-21 12:59 ` Philip Oakley
  2019-04-22  0:35 ` Duy Nguyen
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Janke @ 2019-04-21 11:08 UTC
  To: git

Hi, Git folks,

This is a follow-up to https://marc.info/?l=git&m=154757938429747&w=2.

With the current git 2.21.0, some users, including myself, are still
having problems with git selecting the "wrong" language for localization.

This happens on macOS in the situation where:
* The user has multiple Preferred languages defined in Language & Region
system preferences
* English is set as the Primary language
* Another language, for which git has a .po translation file defined, is
set as another Preferred language, for example, Spanish
* Environment variable $LANG is unset
* git was built with gettext support enabled

In this situation, when git is run, it will use the translations from
the secondary Preferred language instead of displaying messages in
English, the Primary language.

I've seen this situation with other gettext-enabled applications before.
I believe what's happening is that when selecting the language to use,
gettext goes through the Preferred languages in order, looking for a .po
translation file for each. It does not find one for English, but it does
find one for Spanish, so it uses that, instead of falling back to the
non-translated message strings.
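
The lookup order described above can be sketched with Python's
standard-library gettext module. This only illustrates the hypothesis; it
is not GNU gettext's code. gettext.find() is Python's own implementation,
and the "git" domain name, the temporary directory layout, and the helper
first_catalog() are assumptions made for the example. Note that at run
time the lookup is for compiled .mo catalogs rather than .po sources.

```python
import gettext
import tempfile
from pathlib import Path


def first_catalog(localedir, preferred):
    """Return the language whose catalog is found first, or None if none is."""
    path = gettext.find("git", localedir=str(localedir), languages=preferred)
    return None if path is None else Path(path).parts[-3]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical install: only a Spanish catalog exists. An empty file is
    # enough here because gettext.find() only checks that the file exists.
    catalog = root / "es" / "LC_MESSAGES" / "git.mo"
    catalog.parent.mkdir(parents=True)
    catalog.touch()

    english_then_spanish = first_catalog(root, ["en", "es"])
    english_only = first_catalog(root, ["en"])

print(english_then_spanish)  # es
print(english_only)          # None
```

In this model, a result of None means no catalog is loaded, so the
untranslated (English) strings are shown.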

Some examples of this happening in the wild:
- https://stackoverflow.com/questions/55145901/force-git-to-use-the-default-system-language/55160216
- https://github.com/Homebrew/homebrew-core/issues/37331
- https://github.com/Homebrew/homebrew-core/issues/31980

I think an easy fix for this would be to add an "en.po" translation
file, so that when gettext does its translation selection, it finds that
file first whenever English is the Primary language (or a Preferred
language earlier in the order than the other languages) and uses it. This
.po file would be an "identity" translation in which every translated
string is the same as the original string. I don't think it would even
have to be actively maintained: for new message strings that aren't
included in the .po file, gettext would fall back to the non-translated
input strings, which are in English anyway, which is the desired behavior.
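
As a rough sketch of the proposal, the snippet below models catalog
selection before and after an English catalog is installed. It uses
Python's standard-library gettext.find(), not GNU gettext, so it shows the
idea rather than the real lookup; the "git" domain, the directory layout,
and both helper functions are assumptions made for illustration. It does
not show how GNU gettext treats messages missing from a stub catalog.

```python
import gettext
import tempfile
from pathlib import Path


def install_empty_catalog(root, lang):
    """Create an empty placeholder catalog file for lang under root."""
    path = root / lang / "LC_MESSAGES" / "git.mo"
    path.parent.mkdir(parents=True)
    path.touch()


def chosen_language(root, preferred):
    """Return the language of the first catalog found, or None."""
    path = gettext.find("git", localedir=str(root), languages=preferred)
    return None if path is None else Path(path).parts[-3]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    install_empty_catalog(root, "es")
    before = chosen_language(root, ["en", "es"])
    install_empty_catalog(root, "en")  # the proposed English catalog
    after = chosen_language(root, ["en", "es"])
    spanish_first = chosen_language(root, ["es", "en"])

print(before, after, spanish_first)  # es en es
```

Users who list Spanish ahead of English still get Spanish in this model,
so the extra catalog only changes the outcome when English comes first.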

This would be a convenience for git users, because it would "just work"
without any modifications to the configure/build process, or requiring
users to force a $LANG setting.

Would you consider adding this?

I've put together a patch that does this:
https://github.com/apjanke/git/tree/english-dummy-translation
https://github.com/apjanke/git/commit/7e6704167018e1d47399af04230521927991811b
Not attaching a patch because it's kind of a large file. I have tested
it locally and it fixes the language selection problem for me. I'm not
sure if the appropriate thing to do is make a PR for this to the
git-l10n/git-po GitHub repo or not.

Cheers,
Andrew Janke


* Re: gettext, multiple Preferred languages, and English
  2019-04-21 11:08 gettext, multiple Preferred languages, and English Andrew Janke
@ 2019-04-21 12:59 ` Philip Oakley
  2019-04-21 13:27   ` Andrew Janke
  2019-04-21 23:44   ` Junio C Hamano
  2019-04-22  0:35 ` Duy Nguyen
  1 sibling, 2 replies; 9+ messages in thread
From: Philip Oakley @ 2019-04-21 12:59 UTC
  To: Andrew Janke, git, Jiang Xin

Hi Andrew,

On 21/04/2019 12:08, Andrew Janke wrote:
https://public-inbox.org/git/d001a2b5-57c3-1eb3-70fd-679919bb2eb6@apjanke.net/
> I don't think it would even have
> to be actively maintained, because for new message strings that aren't
> included in the .po file, it would fall back to the non-translated input
> strings, which are in English anyway, which is the desired behavior.
Given the above comment, could the en.po file 
(https://github.com/apjanke/git/blob/english-dummy-translation/po/en.po) 
be some very very short version with only one 'translated' string? This 
may be a way-off comment, but if it could be such a simple maintenance 
free file then that sounds sensible.

Also adding in Jiang Xin <worldhello.net@gmail.com>, the l10n coordinator,
for extra comment.

Philip


* Re: gettext, multiple Preferred languages, and English
  2019-04-21 12:59 ` Philip Oakley
@ 2019-04-21 13:27   ` Andrew Janke
  2019-04-22  1:33     ` Andrew Janke
  2019-04-21 23:44   ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Janke @ 2019-04-21 13:27 UTC
  To: Philip Oakley, git, Jiang Xin


On 4/21/19 8:59 AM, Philip Oakley wrote:
> Hi Andrew,
> 
> On 21/04/2019 12:08, Andrew Janke wrote:
> https://public-inbox.org/git/d001a2b5-57c3-1eb3-70fd-679919bb2eb6@apjanke.net/
> 
>> I don't think it would even have
>> to be actively maintained, because for new message strings that aren't
>> included in the .po file, it would fall back to the non-translated input
>> strings, which are in English anyway, which is the desired behavior.
> Given the above comment, could the en.po file
> (https://github.com/apjanke/git/blob/english-dummy-translation/po/en.po)
> be some very very short version with only one 'translated' string?

Yes, I believe so. I only provided a full translation file because it
was trivial for me to create using the "msginit" instructions I found in
po/README. Since all the translations are just identity relationships, I
believe that is effectively the same as their not being there in the
first place.

I tested your approach locally, and it seems to work for me.

> This
> may be a way-off comment, but if it could be such a simple maintenance
> free file then that sounds sensible.
> 
> also adding in Jiang Xin <worldhello.net@gmail.com>, the coordinator for
> extra comment.
> 
> Philip


* Re: gettext, multiple Preferred languages, and English
  2019-04-21 12:59 ` Philip Oakley
  2019-04-21 13:27   ` Andrew Janke
@ 2019-04-21 23:44   ` Junio C Hamano
  1 sibling, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2019-04-21 23:44 UTC
  To: Philip Oakley; +Cc: Andrew Janke, git, Jiang Xin

Philip Oakley <philipoakley@talktalk.net> writes:

> On 21/04/2019 12:08, Andrew Janke wrote:
> https://public-inbox.org/git/d001a2b5-57c3-1eb3-70fd-679919bb2eb6@apjanke.net/
>> I don't think it would even have
>> to be actively maintained, because for new message strings that aren't
>> included in the .po file, it would fall back to the non-translated input
>> strings, which are in English anyway, which is the desired behavior.
> Given the above comment, could the en.po file
> (https://github.com/apjanke/git/blob/english-dummy-translation/po/en.po)
> be some very very short version with only one 'translated' string?

Or use LC_ALL=C and be done with it?
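
For scripts that call git, that suggestion might look like the sketch
below. The helpers c_locale_env() and run_git() are hypothetical names
introduced for the example; the only fact relied on is that LC_ALL takes
precedence over LANG and the other LC_* variables.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Return a copy of the environment with LC_ALL forced to the C locale."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    return env


def run_git(args):
    """Run git with untranslated messages; assumes git is on PATH."""
    return subprocess.run(
        ["git", *args],
        env=c_locale_env(),
        capture_output=True,
        text=True,
        check=False,
    )


# The override is added without disturbing the caller's other settings.
example = c_locale_env({"LANG": "es_ES.UTF-8"})
print(example["LC_ALL"], example["LANG"])  # C es_ES.UTF-8
```

A call such as run_git(["status"]) would then run with that environment.
The C locale also affects character handling, not only message language,
so this is a blunter tool than selecting English.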

> This may be a way-off comment, but if it could be such a simple
> maintenance free file then that sounds sensible.
>
> also adding in Jiang Xin <worldhello.net@gmail.com>, the coordinator
> for extra comment.
>
> Philip


* Re: gettext, multiple Preferred languages, and English
  2019-04-21 11:08 gettext, multiple Preferred languages, and English Andrew Janke
  2019-04-21 12:59 ` Philip Oakley
@ 2019-04-22  0:35 ` Duy Nguyen
  2019-04-22  0:57   ` Andrew Janke
  1 sibling, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2019-04-22  0:35 UTC
  To: Andrew Janke; +Cc: Git Mailing List

On Sun, Apr 21, 2019 at 6:40 PM Andrew Janke <floss@apjanke.net> wrote:
>
> Hi, Git folks,
>
> This is a follow-up to https://marc.info/?l=git&m=154757938429747&w=2.

This says the problem with "en" detection has been fixed. Would
upgrading gettext fix it?

You would need to upgrade something (git or gettext) and if it's
already fixed in gettext I don't see why we need a workaround in git.


-- 
Duy


* Re: gettext, multiple Preferred languages, and English
  2019-04-22  0:35 ` Duy Nguyen
@ 2019-04-22  0:57   ` Andrew Janke
  2019-04-22 17:47     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Janke @ 2019-04-22  0:57 UTC
  To: Duy Nguyen; +Cc: Git Mailing List



On 4/21/19 8:35 PM, Duy Nguyen wrote:
> On Sun, Apr 21, 2019 at 6:40 PM Andrew Janke <floss@apjanke.net> wrote:
>>
>> Hi, Git folks,
>>
>> This is a follow-up to https://marc.info/?l=git&m=154757938429747&w=2.
> 
> This says the problem with "en" detection has been fixed. Would
> upgrading gettext fix it?
> 
> You would need to upgrade something (git or gettext) and if it's
> already fixed in gettext I don't see why we need a workaround in git.

From reading the bug report, that does sound like it would fix it. But
from what I can see, that fix hasn't made it out into a released version
of gettext yet. I haven't downloaded the development gettext to confirm
the fix.

Looking at the gettext ftp site at https://ftp.gnu.org/pub/gnu/gettext/,
it looks like gettext does not make frequent releases, and the last
release was two and a half years ago. Who knows when the next release
will be. And then it'll take longer to trickle down into Linux
distributions and such.

From your release history at https://github.com/git/git/releases, it
seems like Git is a lot more active in making releases than gettext. So
including this fix in Git would get it into the hands of affected users
sooner. And it seems like a pretty low-risk change to me.

Then once the new gettext release is out, their fix is confirmed, and it
makes it out into common distros, the workaround could be removed from Git.

Cheers,
Andrew


* Re: gettext, multiple Preferred languages, and English
  2019-04-21 13:27   ` Andrew Janke
@ 2019-04-22  1:33     ` Andrew Janke
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Janke @ 2019-04-22  1:33 UTC
  To: Philip Oakley, git, Jiang Xin



On 4/21/19 9:27 AM, Andrew Janke wrote:
> 
> On 4/21/19 8:59 AM, Philip Oakley wrote:
>> Hi Andrew,
>>
>> On 21/04/2019 12:08, Andrew Janke wrote:
>> https://public-inbox.org/git/d001a2b5-57c3-1eb3-70fd-679919bb2eb6@apjanke.net/
>>
>>> I don't think it would even have
>>> to be actively maintained, because for new message strings that aren't
>>> included in the .po file, it would fall back to the non-translated input
>>> strings, which are in English anyway, which is the desired behavior.
>> Given the above comment, could the en.po file
>> (https://github.com/apjanke/git/blob/english-dummy-translation/po/en.po)
>> be some very very short version with only one 'translated' string?
> 
> Yes, I believe so. I only provided a full translation file because it
> was trivial for me to create using the "msginit" instructions I found in
> po/README. Since all the translations are just identity relationships, I
> believe that is effectively the same as their not being there in the
> first place.
> 
> I tested your approach locally, and it seems to work for me.

BTW, here's a branch with just the "stub" translation file:

https://github.com/apjanke/git/tree/english-dummy-translation-stub

Cheers,
Andrew


* Re: gettext, multiple Preferred languages, and English
  2019-04-22  0:57   ` Andrew Janke
@ 2019-04-22 17:47     ` Ævar Arnfjörð Bjarmason
  2019-04-23 11:45       ` Andrew Janke
  0 siblings, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-04-22 17:47 UTC
  To: Andrew Janke; +Cc: Duy Nguyen, Git Mailing List


On Mon, Apr 22 2019, Andrew Janke wrote:

> On 4/21/19 8:35 PM, Duy Nguyen wrote:
>> On Sun, Apr 21, 2019 at 6:40 PM Andrew Janke <floss@apjanke.net> wrote:
>>>
>>> Hi, Git folks,
>>>
>>> This is a follow-up to https://marc.info/?l=git&m=154757938429747&w=2.
>>
>> This says the problem with "en" detection has been fixed. Would
>> upgrading gettext fix it?
>>
>> You would need to upgrade something (git or gettext) and if it's
>> already fixed in gettext I don't see why we need a workaround in git.
>
> From reading the bug report, that does sound like it would fix it. But
> from what I can see, that fix hasn't made it out into a released version
> of gettext yet. I haven't downloaded the development gettext to confirm
> the fix.
>
> Looking at the gettext ftp site at https://ftp.gnu.org/pub/gnu/gettext/,
> it looks like gettext does not make frequent releases, and the last
> release was two and a half years ago. Who knows when the next release
> will be. And then it'll take longer to trickle down into Linux
> distributions and such.
>
> From your release history at https://github.com/git/git/releases, it
> seems like Git is a lot more active in making releases than gettext. So
> including this fix in Git would get it into the hands of affected users
> sooner. And it seems like a pretty low-risk change to me.
>
> Then once the new gettext release is out, their fix is confirmed, and it
> makes it out into common distros, the workaround could be removed from Git.

What does Linux distro release schedule have to do with this? Your
initial report and the linked-to bug on GNU savannah only talk about
this being an issue on OSX. Is there some more general issue I'm
missing?

People have reported issues with OSX's weird language selection in the
past. I think it makes sense to do whatever we need to hack around it as
long as it's some well-understood and OSX-only hack.

I'm paranoid that the suggestion of adding an en.po *in general* would
break stuff elsewhere. I'd be surprised if the project linked-to
upthread that used that hack is as widely ported as we are, and that
includes a lot of i18n implementations, not just GNU's.

Ultimately setlocale() is *supposed* to be a well-understood thing. You
set your preferred locale, programs have translations, the OS takes care
of it. I'm concerned that us trying to be specifically smart in git will
backfire (e.g. it's been suggested in the past to have core.language or
whatever..).

But it looks like we don't need to go there, this seems like a
workaround needed for some specific OSX version.

That can just live behind a flag and be detected in config.mak.uname,
no? And then we'd do whatever hack digs us out of that specific hole on
OSX, e.g. maybe generating an en.po *just* there, and just for that list
of known broken version(s) of OSX.


* Re: gettext, multiple Preferred languages, and English
  2019-04-22 17:47     ` Ævar Arnfjörð Bjarmason
@ 2019-04-23 11:45       ` Andrew Janke
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Janke @ 2019-04-23 11:45 UTC
  To: Ævar Arnfjörð Bjarmason; +Cc: Duy Nguyen, Git Mailing List



On 4/22/19 1:47 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Apr 22 2019, Andrew Janke wrote:
> 
>> On 4/21/19 8:35 PM, Duy Nguyen wrote:
>>> On Sun, Apr 21, 2019 at 6:40 PM Andrew Janke <floss@apjanke.net> wrote:
>>>>
>>>> Hi, Git folks,
>>>>
>>>> This is a follow-up to https://marc.info/?l=git&m=154757938429747&w=2.
>>>
>>> This says the problem with "en" detection has been fixed. Would
>>> upgrading gettext fix it?
>>>
>>> You would need to upgrade something (git or gettext) and if it's
>>> already fixed in gettext I don't see why we need a workaround in git.
>>
>> From reading the bug report, that does sound like it would fix it. But
>> from what I can see, that fix hasn't made it out into a released version
>> of gettext yet. I haven't downloaded the development gettext to confirm
>> the fix.
>>
>> Looking at the gettext ftp site at https://ftp.gnu.org/pub/gnu/gettext/,
>> it looks like gettext does not make frequent releases, and the last
>> release was two and a half years ago. Who knows when the next release
>> will be. And then it'll take longer to trickle down into Linux
>> distributions and such.
>>
>> From your release history at https://github.com/git/git/releases, it
>> seems like Git is a lot more active in making releases than gettext. So
>> including this fix in Git would get it into the hands of affected users
>> sooner. And it seems like a pretty low-risk change to me.
>>
>> Then once the new gettext release is out, their fix is confirmed, and it
>> makes it out into common distros, the workaround could be removed from Git.
> 
> What does Linux distro release schedule have to do with this? Your
> initial report and the linked-to bug on GNU savannah only talk about
> this being an issue on OSX. Is there some more general issue I'm
> missing?
> 
> People have reported issues with OSX's weird language selection in the
> past. I think it makes sense to do whatever we need to hack around it as
> long as it's some well-understood and OSX-only hack.
> 
> I'm paranoid that the suggestion of adding an en.po *in general* would
> break stuff elsewhere. I'd be surprised if the project linked-to
> upthread that used that hack is as widely ported as we are, and that
> includes a lot of i18n implementations, not just GNU's.
> 
> Ultimately setlocale() is *supposed* to be a well-understood thing. You
> set your preferred locale, programs have translations, the OS takes care
> of it. I'm concerned that us trying to be specifically smart in git will
> backfire (e.g. it's been suggested in the past to have core.language or
> whatever..).
> 
> But it looks like we don't need to go there, this seems like a
> workaround needed for some specific OSX version.
> 
> That can just live behind a flag and be detected in config.mak.uname,
> no? And then we'd do whatever hack digs us out of that specific hole on
> OSX, e.g. maybe generating an en.po *just* there, and just for that list
> of known broken version(s) of OSX.
> 

Good point. I had forgotten this was OS X-specific; Linux release
schedules are not relevant. (I don't know enough about language
selection on Linux to know if it would ever be relevant there.) There's
no more general issue you're missing; as far as I know this only happens
under OS X's multiple-Preferred-languages setup.

Yeah, adding the workaround only on OS X sounds like it would work, and
would be the more conservative thing to do. Generating a stub en.po just
for OS X, and maybe just for affected versions of OS X and gettext,
sounds sensible to me.

The bad behavior is due to gettext's interaction with OS X's language
selection, so whether it happens is probably going to depend on both the
version of OS X and the version of gettext you're building against. So
if you want to generate it selectively, I think you'll need to check the
version of gettext (in the expectation that a future version of gettext
will fix this), as well as maybe the version of OS X. This
multi-language-selection behavior seems to be by design in OS X; I
wouldn't expect it to change in the future. But it seems like some older
versions of OS X are not affected by it.

I can reproduce the bad-language-selection behavior on OS X 10.12,
10.13, and 10.14. I cannot reproduce it on OS X 10.11, which is the
oldest version of OS X I can get running these days. (I'm not sure if
that's due to different OS behavior, or if I'm just doing something
wrong on that box.) All my testing was done with git 2.21.0 and GNU
gettext 0.19.8.1.
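
A selective check along those lines might look like the sketch below. It
is only the decision logic, written in Python for readability; a real
version would live in the build system (config.mak.uname was suggested
upthread). The function names are hypothetical, the gettext release that
carries the fix is not known from this thread and so is a required
parameter, and treating every OS X release from 10.12 onward as affected
is an assumption based on the test results above.

```python
def parse_version(text):
    """Turn a dotted version such as "0.19.8.1" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def wants_en_stub(system, os_version, gettext_version, gettext_fixed_in):
    """Decide whether to generate a stub en.po for this build (sketch only)."""
    if system != "Darwin":  # the report only covers OS X
        return False
    if parse_version(gettext_version) >= parse_version(gettext_fixed_in):
        return False  # assumed: gettext at or past the fix needs no workaround
    # Reproduced on 10.12, 10.13 and 10.14 but not on 10.11; treating every
    # release from 10.12 onward as affected is an assumption.
    return parse_version(os_version)[:2] >= (10, 12)


PLACEHOLDER_FIXED_IN = "9.9.9"  # placeholder: the thread names no fixed release

print(wants_en_stub("Darwin", "10.14", "0.19.8.1", PLACEHOLDER_FIXED_IN))  # True
print(wants_en_stub("Darwin", "10.11", "0.19.8.1", PLACEHOLDER_FIXED_IN))  # False
print(wants_en_stub("Linux", "5.0", "0.19.8.1", PLACEHOLDER_FIXED_IN))     # False
```

In Python, platform.system() and platform.mac_ver() could supply the
first two arguments.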

Cheers,
Andrew
