bug-gnulib@gnu.org mirror (unofficial)
* version-etc.c: about "Do not include URLS in translatable strings"
@ 2019-05-26  8:05 Akim Demaille
  2019-05-26  8:19 ` Akim Demaille
  0 siblings, 1 reply; 10+ messages in thread
From: Akim Demaille @ 2019-05-26  8:05 UTC (permalink / raw)
  To: Gnulib bugs; +Cc: John Darrington

Hi,

Recently there was a change that pulls the URL out of the translatable strings.  Translators sometimes used that URL to point to the translated GPL.  For instance, the Coreutils in French:

$ gcp --version
cp (GNU coreutils) 8.31
Copyright © 2019 Free Software Foundation, Inc.
License GPLv3+ : GNU GPL version 3 ou ultérieure <https://www.gnu.org/licenses/gpl.fr.html>
Ceci est un logiciel libre. Vous êtes libre de le modifier et de le redistribuer.
Ce logiciel n'est accompagné d'ABSOLUMENT AUCUNE GARANTIE, dans les limites
permises par la loi.
Écrit par Torbjorn Granlund, David MacKenzie et Jim Meyering.

Cheers!


* Re: version-etc.c: about "Do not include URLS in translatable strings"
  2019-05-26  8:05 version-etc.c: about "Do not include URLS in translatable strings" Akim Demaille
@ 2019-05-26  8:19 ` Akim Demaille
  2019-05-27 22:07   ` localized URLs Bruno Haible
  0 siblings, 1 reply; 10+ messages in thread
From: Akim Demaille @ 2019-05-26  8:19 UTC (permalink / raw)
  To: Gnulib bugs; +Cc: John Darrington



> On 26 May 2019, at 10:05, Akim Demaille <akim@lrde.epita.fr> wrote:
> 
> Hi,
> 
> Recently there was a change that pulls the URL out of the translatable strings.  Translators sometimes used that URL to point to the translated GPL.  For instance, the Coreutils in French:
> 
> $ gcp --version
> cp (GNU coreutils) 8.31
> Copyright © 2019 Free Software Foundation, Inc.
> License GPLv3+ : GNU GPL version 3 ou ultérieure <https://www.gnu.org/licenses/gpl.fr.html>
> Ceci est un logiciel libre. Vous êtes libre de le modifier et de le redistribuer.
> Ce logiciel n'est accompagné d'ABSOLUMENT AUCUNE GARANTIE, dans les limites
> permises par la loi.
> Écrit par Torbjorn Granlund, David MacKenzie et Jim Meyering.

I'm currently moving Bison to version-etc, and I see another regression in the URLs, this time in the --help part about getting help:

Before (so translations from bison's fr.po):

Rapportez toutes anomalies à <bug-bison@gnu.org>.
page d'accueil de GNU Bison: <http://www.gnu.org/software/bison/>.
Aide générique sur l'utilisation des logiciels GNU: <http://www.gnu.org/help/gethelp.fr.html>.
Pour la documentation complète, exécutez: info bison.

After (translations from gnulib-po's fr.po, but the URL is not localizable):

Signalez toute anomalie à : bug-bison@gnu.org
page d'accueil de GNU Bison : <http://www.gnu.org/software/bison/>
Aide globale sur les logiciels GNU : <https://www.gnu.org/gethelp/>
Pour la documentation complète, exécutez: info bison.




* Re: localized URLs
  2019-05-26  8:19 ` Akim Demaille
@ 2019-05-27 22:07   ` Bruno Haible
  2019-05-28  6:44     ` Akim Demaille
  0 siblings, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2019-05-27 22:07 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Akim Demaille, John Darrington

Hi Akim,

> Before (so translations from bison's fr.po):
> 
> Aide générique sur l'utilisation des logiciels GNU: <http://www.gnu.org/help/gethelp.fr.html>.
> 
> After (translations from gnulib-po's fr.po, but the URL is not localizable):
> 
> Aide globale sur les logiciels GNU : <https://www.gnu.org/gethelp/>

Good point. Yes, you are right, gettext or gnulib should provide a way to deal
with localized URLs.

Like with localized strings in PO files, we are merely looking for a
string -> string replacement that is localizable.

However, it would be silly to put the burden of this localization on the
translators, when in fact it can be done (and updated) by the package
maintainer easily.

I'm thinking of a small program (shell script) that can be invoked as

  gen-localized-url IDENTIFIER DEFAULT-URL PATTERN

where PATTERN may contain format directives such as
  %ll-%cc
  %ll_%CC

This program will try the PATTERN for all possible languages and emit a
set of PO files of the form

  msgid "https://www.gnu.org/gethelp/"
  msgstr "http://www.gnu.org/help/gethelp.fr.html"

into a subdirectory. The package maintainer can then use dcgettext(),
and rerun the gen-localized-url command once a year, to capture newly
arrived translations.
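
A minimal sketch of what the lookup could look like on the package side, assuming the generated catalogs are installed under a separate text domain. The domain name "myprog-urls" and the helper localized_url are hypothetical, invented for illustration; only dcgettext() itself comes from gettext.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>

/* Hypothetical text domain that holds only URL -> URL mappings, as
   they might be emitted by the proposed gen-localized-url script.  */
#define URL_DOMAIN "myprog-urls"

/* Return the localized variant of DEFAULT_URL for the current
   LC_MESSAGES locale.  When no catalog of URL_DOMAIN provides a
   translation, dcgettext returns its msgid argument, so the caller
   gets DEFAULT_URL back unchanged.  */
static const char *
localized_url (const char *default_url)
{
  assert (default_url != NULL);
  return dcgettext (URL_DOMAIN, default_url, LC_MESSAGES);
}
```

The program would still need the usual setlocale (LC_ALL, "") and a bindtextdomain call for this extra domain at startup; after that, localized_url ("https://www.gnu.org/gethelp/") can be printed wherever the default URL is printed today.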

This consideration proves that translator involvement is not necessary.

Now, this way of doing things is heavy from the packaging point of view:
Most developers won't like the idea of having 2 x 10 additional .mo files
to be installed with their package. Especially since the concept of PO and
MO files was conceived for translators, and translators are not involved
here.

A possible optimization would be to let the

  gen-localized-url IDENTIFIER DEFAULT-URL PATTERN

command emit a .h file that looks like this:

===============================================================================
#include "localized-url.h"

/*
 * List of localizations of <https://www.gnu.org/software/gethelp.html>.
 * This file is in the public domain.
 */

static const char * const IDENTIFIER_translations[] =
  {
    "en",
    "cs",
    "de",
    "es",
    "fr",
    "ja",
    "ro",
    "ru",
    "sq",
    "zh_CN",
    NULL
  };

static localized_url_t const IDENTIFIER =
  {
    "https://www.gnu.org/software/gethelp.html",
    "https://www.gnu.org/software/gethelp.%ll-%cc.html",
    IDENTIFIER_translations
  };
===============================================================================

where localized-url.h essentially contains the type definition

typedef struct
  {
    const char * default_url;
    const char * pattern;
    const char * const * translations;
  }
localized_url_t;

This would be a much more compact way to represent the set of localizations
of a URL.

The problem that I see here is that in order to fetch the appropriate URL,
the runtime code must access
  gl_locale_name (LC_MESSAGES, "LC_MESSAGES")
or - using functions defined in dcigettext.c -
  get_category_value (LC_MESSAGES, "LC_MESSAGES").

gl_locale_name has the drawback that it requires a large bunch of code from
gnulib. (100 KB of source code, but only 1 KB of binary code on glibc systems.)

get_category_value has the drawback that it is not exported from glibc
and libintl.

So, this optimized representation of localized URLs requires some prior work
in gettext, if we don't want gl_locale_name.
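
To make the runtime side concrete, here is a rough sketch of how such a table could be consulted once a locale name is available. Everything beyond the localized_url_t layout is an assumption: the function names are hypothetical, and the meaning given to "%ll-%cc" (the table entry lowercased with '_' turned into '-', so "zh_CN" gives "zh-cn" and "fr" gives "fr") is a guess at the intended semantics, not something specified in this proposal.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct
  {
    const char * default_url;
    const char * pattern;
    const char * const * translations;
  }
localized_url_t;

/* Sample data, copied from the generated header shown above.  */
static const char * const gethelp_translations[] =
  { "en", "cs", "de", "es", "fr", "ja", "ro", "ru", "sq", "zh_CN", NULL };

static localized_url_t const gethelp =
  {
    "https://www.gnu.org/software/gethelp.html",
    "https://www.gnu.org/software/gethelp.%ll-%cc.html",
    gethelp_translations
  };

/* Return 1 if NAME occurs in the NULL-terminated array LIST.  */
static int
is_listed (const char * const *list, const char *name)
{
  for (; *list != NULL; list++)
    if (strcmp (*list, name) == 0)
      return 1;
  return 0;
}

/* Return the URL for LOCALE (e.g. "fr_FR.UTF-8"), written into BUF,
   or URL->default_url when no translation is listed or BUF is too
   small.  */
static const char *
localized_url_lookup (const localized_url_t *url, const char *locale,
                      char *buf, size_t bufsize)
{
  static const char directive[] = "%ll-%cc";
  char name[32];
  const char *at;
  size_t n;
  int len;

  assert (url != NULL && locale != NULL && buf != NULL);

  /* Drop the codeset and modifier: "fr_FR.UTF-8" -> "fr_FR".  */
  n = strcspn (locale, ".@");
  if (n == 0 || n >= sizeof name)
    return url->default_url;
  memcpy (name, locale, n);
  name[n] = '\0';

  /* Try "ll_CC" first, then fall back to plain "ll".  */
  if (!is_listed (url->translations, name))
    {
      name[strcspn (name, "_")] = '\0';
      if (!is_listed (url->translations, name))
        return url->default_url;
    }

  at = strstr (url->pattern, directive);
  if (at == NULL)
    return url->default_url;

  /* Assumed expansion rule: lowercase, with '_' replaced by '-'.  */
  for (char *p = name; *p != '\0'; p++)
    *p = (*p == '_') ? '-' : (char) tolower ((unsigned char) *p);

  len = snprintf (buf, bufsize, "%.*s%s%s",
                  (int) (at - url->pattern), url->pattern,
                  name, at + strlen (directive));
  if (len < 0 || (size_t) len >= bufsize)
    return url->default_url;
  return buf;
}
```

The LOCALE argument stands in for whatever gl_locale_name would return; a simpler source such as setlocale (LC_MESSAGES, NULL) would also fit the signature, but it does not take the LANGUAGE variable into account.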

Or maybe you have better ideas to solve this dilemma?

Bruno




* Re: localized URLs
  2019-05-27 22:07   ` localized URLs Bruno Haible
@ 2019-05-28  6:44     ` Akim Demaille
  2019-05-28  8:49       ` Bruno Haible
  2019-05-28  8:55       ` Bruno Haible
  0 siblings, 2 replies; 10+ messages in thread
From: Akim Demaille @ 2019-05-28  6:44 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib, John Darrington

Hi Bruno,

> On 28 May 2019, at 00:07, Bruno Haible <bruno@clisp.org> wrote:
> 
> Hi Akim,
> [...]
> Or maybe you have better ideas to solve this dilemma?

I would certainly not claim to have better ideas :)

You are looking for a generic way to deal with URLs that have alternatives for languages.  I was looking only at the problem at hand, that is to say gnulib's version-etc module.

I personally do not see much of a problem to have the translators specify the URLs themselves.  For instance in Bison we have:

Report translation bugs to <http://translationproject.org/team/>.

that becomes

po/ca.po-"Informa dels errors de traducció a <http://translationproject.org/team/ca."
po/ca.po-"html>.\n"

but also

po/da.po-msgstr "Rapporter oversættelsesfejl til <dansk@dansk-gruppen.dk>.\n"

or

po/fi.po-"Ilmoita käännösvirheistä osoitteeseen <translation-team-fi@lists.sourceforge."
po/fi.po-"net>.\n"

or

po/nl.po-msgstr "Meld fouten in de vertaling aan <vertaling@vrijschrift.org>.\n"

po/pl.po-"O błędach tłumaczenia poinformuj <translation-team-pl@lists.sourceforge."
po/pl.po-"net>.\n"

etc.


Of course, this very case is kind of special, agreed.  But still, I don't think errors in URL translations are more serious than plain errors in the translations (I fixed some of them in my own language that were even conveying incorrect semantics), or delicate pieces of text such as the GPL disclaimer excerpt.

For the general case, I suppose you can't even expect the pattern to be always the language code appearing somewhere.

Anyway, if we focus on version-etc, gnulib could easily provide the correspondence for the main URLs of the FSF, e.g., as a C function.  Alternatively it could even bake them into gnulib-po, which would also avoid having to deal with yet another po dir, but still benefit from the Gettext ecosystem.

On the other hand, it would be nice if the FSF supported something like '?locale=CODE' or '?lang=CODE', as some sites do, as it would relieve us from having to maintain some correspondence by hand.

Cheers!


* Re: localized URLs
  2019-05-28  6:44     ` Akim Demaille
@ 2019-05-28  8:49       ` Bruno Haible
  2019-05-28 16:37         ` Akim Demaille
  2019-05-28  8:55       ` Bruno Haible
  1 sibling, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2019-05-28  8:49 UTC (permalink / raw)
  To: Akim Demaille; +Cc: bug-gnulib, John Darrington

Hi Akim,

> On the other hand, it would be nice if the FSF supported something like
> '?locale=CODE' or '?lang=CODE', as some sites do, as it would relieve us
> from having to maintain some correspondence by hand.

The only benefit that this syntax would have is that the PATTERN could be
derived from the DEFAULT-URL.

It would still be required to look up which translations of the web page
exist and which don't. Why? Assume the Spanish translation exists and the
Portuguese translation doesn't exist, and the user has set LANGUAGE=pt:es.
Then it would be wrong to present DEFAULT-URL?lang=pt to the user, because
it would show a 404 page or fall back to English. The user wants to see
DEFAULT-URL?lang=es in this case, though.
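
The fallback described here can be sketched as a walk over the colon-separated priority list, stopping at the first language for which a translated page is known to exist. The function name first_available is hypothetical, and the list of available translations is assumed to come from somewhere else (for instance a generated table).

```c
#include <assert.h>
#include <string.h>

/* LANGUAGE is a colon-separated priority list such as "pt:es".
   AVAILABLE is a NULL-terminated array of language codes for which a
   translated page exists.  Return the first entry of LANGUAGE that
   is available, or NULL when none is (the caller then falls back to
   the default URL).  */
static const char *
first_available (const char *language, const char * const *available)
{
  assert (language != NULL && available != NULL);

  while (*language != '\0')
    {
      /* Length of the current element, up to the next ':' or the end.  */
      size_t len = strcspn (language, ":");

      for (const char * const *p = available; *p != NULL; p++)
        if (strlen (*p) == len && strncmp (*p, language, len) == 0)
          return *p;

      language += len;
      if (*language == ':')
        language++;
    }
  return NULL;
}
```

With Spanish available and Portuguese missing, first_available ("pt:es", ...) yields "es", which is the language whose URL should be shown.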

Bruno




* Re: localized URLs
  2019-05-28  6:44     ` Akim Demaille
  2019-05-28  8:49       ` Bruno Haible
@ 2019-05-28  8:55       ` Bruno Haible
  2019-05-28 16:30         ` Akim Demaille
  1 sibling, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2019-05-28  8:55 UTC (permalink / raw)
  To: Akim Demaille; +Cc: bug-gnulib, John Darrington

Akim Demaille wrote:
> I don't think errors in URL translations are more serious than plain errors
> in the translations (I fixed some of them in my own language that were even
> conveying incorrect semantics), or delicate pieces of text such as the GPL
> disclaimer excerpt.

That's not the point. The point is:

1) When translators are requested to translate a URL, they would manually
   do something that an automated script can do just as well. The linguistic
   expertise of the translators is not exploited.

2) It is perfectly normal for translators to not update a PO file in 3 years.
   But a translated URL can appear (or disappear); the translator would not
   be made aware of it.

In summary, localized URLs require another workflow than localized text.

Bruno




* Re: localized URLs
  2019-05-28  8:55       ` Bruno Haible
@ 2019-05-28 16:30         ` Akim Demaille
  0 siblings, 0 replies; 10+ messages in thread
From: Akim Demaille @ 2019-05-28 16:30 UTC (permalink / raw)
  To: Bruno Haible; +Cc: Gnulib bugs, John Darrington

Hi Bruno,

> On 28 May 2019, at 10:55, Bruno Haible <bruno@clisp.org> wrote:
> 
> Akim Demaille wrote:
>> I don't think errors in URL translations are more serious than plain errors
>> in the translations (I fixed some of them in my own language that were even
>> conveying incorrect semantics), or delicate pieces of text such as the GPL
>> disclaimer excerpt.
> 
> That's not the point. The point is:
> 
> 1) When translators are requested to translate a URL, they would manually
>   do something that an automated script can do just as well.

I'm not sure a script can cover all the cases, but I'm fine with this approach,
of course.

> The linguistic
>   expertise of the translators is not exploited.

If all the pages we want to map to strictly follow some unique pattern,
yep.

> 2) It is perfectly normal for translators to not update a PO file in 3 years.
>   But a translated URL can appear (or disappear); the translator would not
>   be made aware of it.
> 
> In summary, localized URLs require another workflow than localized text.

Yes, indeed.


* Re: localized URLs
  2019-05-28  8:49       ` Bruno Haible
@ 2019-05-28 16:37         ` Akim Demaille
  2019-05-28 17:59           ` Bruno Haible
  0 siblings, 1 reply; 10+ messages in thread
From: Akim Demaille @ 2019-05-28 16:37 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib, John Darrington



> On 28 May 2019, at 10:49, Bruno Haible <bruno@clisp.org> wrote:
> 
> Hi Akim,
> 
>> On the other hand, it would be nice if the FSF supported something like
>> '?locale=CODE' or '?lang=CODE', as some sites do, as it would relieve us
>> from having to maintain some correspondence by hand.
> 
> The only benefit that this syntax would have is that the PATTERN could be
> derived from the DEFAULT-URL.
> 
> It would still be required to look up which translations of the web page
> exist and which don't. Why? Assume the Spanish translation exists and the
> Portuguese translation doesn't exist, and the user has set LANGUAGE=pt:es.
> Then it would be wrong to present DEFAULT-URL?lang=pt to the user, because
> it would show a 404 page or fall back to English. The user wants to see
> DEFAULT-URL?lang=es in this case, though.

I was unaware there was a means to have cascading languages...

You have a point!  However, such support from the web site itself
is the surest way to be always up to date, even if you are on some old
distro with a piece of software that was released many years ago.  If
there's really demand, they could also support this "path" of languages.

You are addressing a wider problem, I was focusing only on version-etc.

Cheers!


* Re: localized URLs
  2019-05-28 16:37         ` Akim Demaille
@ 2019-05-28 17:59           ` Bruno Haible
  2019-05-28 18:41             ` Akim Demaille
  0 siblings, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2019-05-28 17:59 UTC (permalink / raw)
  To: Akim Demaille; +Cc: bug-gnulib, John Darrington

Akim Demaille wrote:
> However, such support from the web site itself
> is the surest way to be always up to date, even if you are on some old
> distro with a piece of software that was released many years ago.

Indeed.

> If there's really demand, they could also support this "path" of languages.

Support for a preference list of languages is part of the current web standards.
[1]

However, I'm not sure I prefer the web pages that do dynamic guesses based
on the user's designated languages, IP address, and tracked preferences.
URLs that display the same thing for user A as for user B also have advantages.

Bruno

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language




* Re: localized URLs
  2019-05-28 17:59           ` Bruno Haible
@ 2019-05-28 18:41             ` Akim Demaille
  0 siblings, 0 replies; 10+ messages in thread
From: Akim Demaille @ 2019-05-28 18:41 UTC (permalink / raw)
  To: Bruno Haible; +Cc: Gnulib bugs, John Darrington



> On 28 May 2019, at 19:59, Bruno Haible <bruno@clisp.org> wrote:
> 
> Support for a preference list of languages is part of the current web standards.
> [1]

Oh wow, weighted languages...  It feels like language theory :)

Thanks, I had no idea.

> However, I'm not sure I prefer the web pages that do dynamic guesses based
> on the user's designated languages, IP address, and tracked preferences.
> URLs that display the same thing for user A as for user B also have advantages.

I agree (and so does the Mozilla page; its own URL shows one way to be
explicit).  I was really referring to explicit params _in_ the URL.

