git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Jeff Epler <jepler@unpythonic.net>
Subject: Re: [PATCH/RFC v2 5/6] gettext: Add a Gettext interface for Perl
Date: Wed, 2 Jun 2010 13:47:50 +0200	[thread overview]
Message-ID: <201006021347.51960.jnareb@gmail.com> (raw)
In-Reply-To: <AANLkTilLWZA6OrXhF28qCao2svpZWZiEwb5nVhXUTB9i@mail.gmail.com>

On Tue, 1 June 2010, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Jun 1, 2010 at 17:00, Jakub Narebski <jnareb@gmail.com> wrote:
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>> Make Git's gettext messages available to Perl programs through
>>> Locale::Messages. [...]
>>
>> This is I think a very good idea [...] using the
>> Locale::Messages module from libintl-perl library.  The "On the state of
>> i18n in Perl" (http://rassie.org/archives/247) blog post from 26 April
>> 2009 provides nice counterpoint to Locale::Maketext::TPJ13 / The Perl
>> Journal #13 article[1] from 1999... especially because using gettext is
>> natural for translating git command output and GUI, because git uses it
>> already for Tcl/Tk (gitk and git-gui), and it is natural solution for
>> code in C, which slightly less than half of git code.
>>
>> Well, we could use Locale::Maketext::Gettext, but it is not in Perl
>> core either, and as http://rassie.org/archives/247 says its '*.po'
>> files are less natural.  The gettext documentation (gettext.info) also
>> recommends libintl-perl, or to be more exact Locale::TextDomain from
>> it.
>>
>> [1] http://search.cpan.org/perldoc?Locale::Maketext::TPJ13
>>     http://interglacial.com/~sburke/tpj/as_html/tpj13.html
> 
> Locale::Maketext is a less complete and non-standard alternative as
> the above blogs note. I think the main reason it's used in favor of
> Gettext in many Perl projects is that it can be compiled stand-alone
> from the CPAN, i.e. it doesn't depend on libintl.

That, and the fact that some Perl projects / modules on CPAN started
using Locale::Maketext before gettext acquired functionality for
handling plural forms like ngettext.  The mentioned article from
The Perl Journal #13 might have also mislead module authors...

On the other hand libintl-perl distribution contains pure-Perl gettext
emulation module: Locale::gettext_pp.

> 
> That's comfortable when CPAN is your main distribution channel, but
> Git's main distribution channel is via OS packages, which can
> trivially introduce a gettext dependency (for which they doubtless
> already have a package).

We need gettext for C, for shell, and for Tcl/Tk anyway.

BTW. some projects keep copy of libintl / gettext library in 'intl/'
subdirectory, to be used if it is not present in the target system.
Perl has 'inc/' and 'use inc::latest' (unfortunately in core only
since 5.011002, i.e. 5.12 in practice) to keep private copy of modules
to be used / installed if they are not present in system.

> 
> Locale::Maketext has a Gettext emulation layer, but using it would be
> a potential source of bugs (no emulation is perfect).

OTOH see Locale::gettext_pp from libintl-perl (the distribution
containing Locale::Messages and Locale::TextDomain modules).

> 
> That being said the way I wrote Git::Gexttext means any alterante
> implementation or emulation layer can be seemlessly added later on. We
> could use (or write) something for Perl, C or Shell that completely
> bypasses libintl for systems that don't have a gettext C library.

Right.

Besides I think that the base gettext (no plural forms etc.) is the
same (would produce the same *.po files) for Locale::Maketext::Gettext
and for Locale::Messages / Locale::TextDomain.

> 
>> The question is why not use Locale::TextDomain, the high-level Perl-y
>> framework, wrapper around Locale::Messages from the same libintl-perl
>> library?  The gettext documentation (in gettext.info, chapter "13.5.18
>> Perl") says:
>>
>>  Prerequisite
>>     `use POSIX;'
>>     `use Locale::TextDomain;' (included in the package libintl-perl
>>     which is available on the Comprehensive Perl Archive Network CPAN,
>>     http://www.cpan.org/).
>>
>>
>> This would change
>>
>>  -      print "Emails will be sent from: ", $sender, "\n";
>>  +      printf gettext("Emails will be sent from: %s\n"), $sender;
>>
>> to either
>>
>>  +      print __"Emails will be sent from: ", $sender, "\n";
>>
>> or
>>
>>  +      printf __("Emails will be sent from: %s\n"), $sender;
>>
>> or
>>
>>  +      print __x("Emails will be sent from: {sender}\n",
>>  +                sender => $sender);
> 
> I didn't use Locale::TextDomain because it exports a fatter interface
> by default, and I'm not sure at this point what subset of it it would
> make sense to support.
> 
> This is the default Locale::TextDomain interface:
> 
>    @EXPORT = qw (__ __x __n __nx __xn __p __px __np __npx $__ %__
>                  N__ N__n N__p N__np);
> 
> My Git::Gettext only exports a single gettext() function now. I think
> it's better at this point to have a really small interface and decide
> later if we'd like to expand it, preferably in a way that'll work
> consistently across C, Perl and Shell programs.

That is not, I think, the problem.  You don't need to re-export _all_
of Locale::TextDomain interface (or create and export stubs for all
above functions and variables).

> 
> We might want to improve plural support, add msgctxt etc. later. But
> for an initial implementation I'd rather have something simple &
> stupid.

That's understandable.

BTW. I wonder if git currently has anywhere mesage that depends on
plural form, or does it always use plural / multiform, like for
example '1 files changed' from "git diff --summary"...

> 
>>> Signed-off-by: ?var Arnfjör? Bjarmason <avarab@gmail.com>
>>> ---
>>>  Makefile            |    4 ++-
>>>  git-send-email.perl |    3 +-
>>>  perl/Git/Gettext.pm |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  perl/Makefile.PL    |    5 ++-
>>>  4 files changed, 92 insertions(+), 3 deletions(-)
>>>  create mode 100644 perl/Git/Gettext.pm
>>>
>>> diff --git a/Makefile b/Makefile
>>> index dce2faa..2101713 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -1884,7 +1884,9 @@ cscope:
>>>       $(FIND) . -name '*.[hcS]' -print | xargs cscope -b
>>>
>>>  pot:
>>> -     $(XGETTEXT) -k_ -o po/git.pot $(C_OBJ:o=c) $(SCRIPT_SH)
>>> +     $(XGETTEXT) --keyword=_ --output=po/git.pot --language=C $(C_OBJ:o=c)
>>> +     $(XGETTEXT) --join-existing --output=po/git.pot --language=Shell $(SCRIPT_SH)
>>
>> Shouldn't this line be in earlier patch, i.e. in "gettext: Add a
>> Gettext interface for shell scripts"?
> 
> It needed to be split up into seperate line in the Perl commit because ...
> 
>>> +     $(XGETTEXT) --join-existing --output=po/git.pot --language=Perl $(SCRIPT_PERL)
>>
>> From gettext documentation (in gettext.info, chapter "13.5.18 Perl"):
>>
>>  Extractor
>>     `xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2
>>     -kN__ -k'
>>
>> Is it equivalent to specifying 'xgettext --language=Perl'?
> 
> I'm doing --language=Perl because gettext doesn't recognize .perl as a
> valid Perl extension. It only knows about `pl', `PL', `pm', `cgi'.

All right.

> (As an aside, I haven't found why Git chose to use the quaint .perl
> extension).
> 
> But yeah, using --language=Perl is overshooting it with regards to
> keyword extraction. From "5.1 Invoking the `xgettext' Program":
> 
>      To disable the default keyword specifications, the option `-k' or
>      `--keyword' or `--keyword=', without a KEYWORDSPEC, can be used.
> 
> I'll check if I can supply --keyword to all the XGETTEXT invocations.
> 
>> Of course the above assumes that you are using Locale::TextDomain, or
>> at least use the same conventions.

In my opinion we should use the common convention for the programming
language, like described in chapter 15 "Other Programming Languages"
in gettext manual: 
  http://www.gnu.org/software/gettext/manual/gettext.html#Programming-Languages

 * C:      _("msg")
 * Perl:   __"msg"
 * shell:  $(gettext "abc")
 * Tcl/Tk: [_ "msg"]
 * Python: _('msg')

On the other hand the only currently traslated part of git, namely
gitk and git-gui, both of which are written in Tcl/Tk, do not use
recommended gettext shorthand, i.e. [_ "msg"], but the gettext
function directly, i.e. [mc "msg"] (in the imported form, and not
[::msgcat::mc "msg"]).  I guess this is caused by the fact that
shorthand is not that shorter here...

>>>  Getopt::Long::Configure qw/ pass_through /;
>>>
>>> @@ -674,7 +675,7 @@ if (!defined $sender) {
>>>       $sender = $repoauthor || $repocommitter || '';
>>>       $sender = ask("Who should the emails appear to be from? [$sender] ",
>>>                     default => $sender);
>>> -     print "Emails will be sent from: ", $sender, "\n";
>>> +     printf gettext("Emails will be sent from: %s\n"), $sender;
>>
>> As I wrote, this IMHO should be either
>>
>>  +     printf __("Emails will be sent from: %s\n"), $sender;
>>
>> or
>>
>>  +     print __x("Emails will be sent from: {sender}\n", sender => $sender);
>>
>> (note that parantheses differ in those two examples).
> 
> I'd prefer the former argument expansion form (i.e. %s, not
> {$sender}), just so translators don't have to deal with two different
> argument forms.

Well, in this case this does not matter (much), because we have only
single placeholder.

The Locale::TextDomain manpage says that the reason behind __x() is
twofold.  First, you would need context to know what placeholder means
in

  printf __"This is the %s %s.\n", $color, $thing;

as you would have only

  msgid "This is the %s %s.\n"

in the PO file.  Second, formatting with position (to change order of
parameters) is possible with 'printf' only for Perl 5.8.0 or never:

  printf __("This is the %1\$s %2\$s.\n"), $color, $thing;

So that's the reason for introducing

  print __x("This is the {color} {thing}.\n",
            color => $color, thing => $thing);

> As for __ or gettext I don't mind. I just picked the former on a whim
> to match the Shell version. Maybe I should just change C|Perl|Shell to
> use _?
> 
> And we can use _, __ is just used to disambiguate it from the _
> filehandle, and Perl's pretty good at disambiguating that already:
> 
>     $ perl -E 'sub _ ($) { scalar reverse $_[0] }; say _("moo"); stat
> "/etc/hosts"; say ((stat(_))[7])'
>     oom
>     168

Locale::TextDomain::FAQ says:

 Q. Why does Locale::TextDomain use a double underscore? I am used to
    a single underscore from C or other languages.

 A. Function names that consist of exactly one non-alphanumerical
    character make the function automatically global in Perl. Besides,
    in Perl 6 the concatenation operator will be the underscore
    instead of the dot.

    [jn: actually the string concatenation operator in Perl 6 is '~',
     according to http://perlgeek.de/en/article/5-to-6#post_11: 
     "Perl 5 to 6" Lesson 11 - Changes to Perl 5 Operators]


Although 'gettext-tools/examples/hello-perl/hello-1.pl.in' from
gettext sources
  http://git.savannah.gnu.org/cgit/gettext.git/tree/gettext-tools/examples/hello-perl/hello-1.pl.in
(hello-1 uses Locale::Messages, hello-2 uses Locale::TextDomain)
uses _("msg") shortcut:

  sub _ ($) { &gettext; }

Hmmm...

>>> diff --git a/perl/Git/Gettext.pm b/perl/Git/Gettext.pm
>>> new file mode 100644
>>> index 0000000..f434783
>>> --- /dev/null
>>> +++ b/perl/Git/Gettext.pm
>>> @@ -0,0 +1,83 @@
>>> +package Git::Gettext;
>>
>> Should this package be named Git::Gettext, or other name would be
>> better, perhaps Git::I18N (like e.g. Games::Risk have
>> Games::Risk::I18N), or Git::Locale, or even Git::Translator?
>>
>> Not very important.
> 
> I really have no opinion on that, but it does seem like a lot of Perl
> packages on CPAN use the ::I18N suffix.

Also using Git::I18N allows us to change the backend from gettext to
something other (like Locale::Maketext lexicon, or something yet
another).

>>> +use strict;
>>> +use warnings;
>>> +use Exporter;
>>> +use base 'Exporter';
>>
>> O.K.
>>
>> The alternative would be to use
>>
>>  +use Exporter qw(import);
> 
> Not if we want to maintain 5.6 support. The `use Exporter "import"'
> form was only introduced in 5.8. I use it everywhere where I don't
> have to care about 5.6 support (which is everywhere but Git).

Oh, all right.  Thanks for explanation.

>>> +our %EXPORT_TAGS;
>>> +@{ $EXPORT_TAGS{'all'} } = @EXPORT_OK;
>>
>> Why not simply
>>
>>  +our %EXPORT_TAGS = ('all' => [ @EXPORT_OK ]);
[...]
> 
> Just because it some old code of mine I had around that I copied
> from. I could rewrite it to a nicer form.
> 
> Actually I was considering changing it to just have @EXPORT and
> nothing else. You're going to use gettext if you use the module
> anyway, so we might as well just import them all everywhere we use
> Gettext in Perl.

Right, if we use Git::Gettext / Git::I18N it means that we want to do
message traslation, so we would need to import subroutines anyway.

>>> +
>>> +sub __bootstrap_locale_messages {
>>> +     our $TEXTDOMAIN = 'git';
>>> +
>>> +     # TODO: How do I make the sed replacements in the top level
>>> +     # Makefile reach me here?
>>> +     #our $TEXTDOMAINDIR = q|@@LOCALEDIR@@|;
[...]
>> Make invoked on perl/Makefile, when invoked from main Makefile by
>> '$(MAKE) -C perl' (via QUIET_SUBDIR0) passes 'localedir' to submake;
>> perl/Makefile should probably have something like
>>
>>  localedir ?= $(sharedir)/locale
>>
>> That is assuming that 'localedir' is added to list of exported
>> variables.
>>
>> But I am not sure how such substitution should be performed.
> 
> I'll try to get something like that to work. Actually the main problem
> seemed to be that it had a hybrid handwritten Makefile and one made by
> EU::MM.

The problem with EU::MM is that while ExtUtils::MakeMaker was first
released as a core module with Perl 5, git requires for generated
makefile (well, perl.mak included by perl/Makefile) to be compatibile
with DESTDIR mechanism.  This requires ExtUtils::MakeMaker version 6.11+,
which was released with perl 5.006002 (Perl v5.6.2).

Hmmm... perhaps we can bump requirement from Perl 5.6 to Perl 5.6.2
and get rid of hybrid handwritten Makefile?

Otherwise you would have to either duplicate EU::MM work in
handwritten Makefile, or somehow make perl/Makefile to do
transformation before perl/perl.mak generated by EU::MM starts
working...

>>> +
>>> +BEGIN
>>> +{
>>> +     local ($@, $!);
>>> +     eval { __bootstrap_locale_messages() };
>>> +     if ($@) {
>>> +             # Oh noes, no Locale::Messages here
>>> +             *gettext = sub ($) { $_[0] };
>>> +     }
>>> +}
>>
>> Does it need to be in BEGIN block?  Probably yes.
> 
> I'm pretty sure it needs to all be at BEGIN time. I recall that
> declaring prototypes at runtime was an error in some older versions,
> but it works now:
> 
>     $ perl -E '*meh = sub ($) { "foo" }; say prototype "meh"'
>     $
> 
> Anyway since it's code that's being use'd it's going to always be at
> BEGIN time anyway. I just wanted to be explicit.

All right, probably better to be safe.

>>> +=head1 EXPORTS
>>> +
>>> +Exports are done via L<Exporter>. Invididual functions can be
>>> +exporter, or all of them via the C<:all> export tag.
>>
>> Shouldn't this be described in less technical way in SYNOPSIS and
>> DESCRIPTION sections instead?
> 
> Yeah, maybe. But there's a convention for `=head1 EXPORTS' in perl
> core / CPAN. So that's what I added at a whim.

Hmmm... I don't recall seeing EXPORTS section often in manpages for
Perl modules, at least not for the ones I use...


BTW. I tried to find Perl module / Perl program that uses
Locale::TextDomain for l10n, but provides passthrough fallback if it
is not installed... but haven't find any.

-- 
Jakub Narebski
Poland

  reply	other threads:[~2010-06-02 12:13 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-29 22:45 [PATCH/RFC 0/5] Add internationalization support to Git Ævar Arnfjörð Bjarmason
2010-05-29 22:45 ` [PATCH 1/5] Add infrastructure for translating Git with gettext Ævar Arnfjörð Bjarmason
2010-05-29 22:45 ` [PATCH 2/5] gitignore: Ignore files generated by gettext Ævar Arnfjörð Bjarmason
2010-05-29 22:45 ` [PATCH 3/5] Makefile: Remove Gettext files on make clean Ævar Arnfjörð Bjarmason
2010-05-29 22:45 ` [PATCH 4/5] gettext: Add a skeleton po/is.po Ævar Arnfjörð Bjarmason
2010-05-29 22:45 ` [PATCH 5/5] Add infrastructure to make shellscripts translatable Ævar Arnfjörð Bjarmason
2010-05-30  1:46 ` [PATCH/RFC 0/5] Add internationalization support to Git Jonathan Nieder
2010-05-30 16:04   ` Ævar Arnfjörð Bjarmason
2010-05-30 22:23     ` Jonathan Nieder
2010-05-31 12:17       ` Ævar Arnfjörð Bjarmason
2010-05-30 20:54 ` [PATCH/RFC v2 0/6] " Ævar Arnfjörð Bjarmason
2010-05-30 20:54 ` [PATCH/RFC v2 1/6] Add infrastructure for translating Git with gettext Ævar Arnfjörð Bjarmason
2010-06-01 17:01   ` Jakub Narebski
2010-06-01 18:11     ` [PATCH/RCF] autoconf: Check if <libintl.h> exists and set NO_GETTEXT Ævar Arnfjörð Bjarmason
2010-06-01 21:22       ` Jakub Narebski
2010-05-30 20:54 ` [PATCH/RFC v2 2/6] gitignore: Ignore files generated by gettext Ævar Arnfjörð Bjarmason
2010-05-30 20:54 ` [PATCH/RFC v2 3/6] Makefile: Remove Gettext files on make clean Ævar Arnfjörð Bjarmason
2010-05-30 20:54 ` [PATCH/RFC v2 4/6] gettext: Add a Gettext interface for shell scripts Ævar Arnfjörð Bjarmason
2010-05-30 20:54 ` [PATCH/RFC v2 5/6] gettext: Add a Gettext interface for Perl Ævar Arnfjörð Bjarmason
2010-06-01 17:00   ` Jakub Narebski
2010-06-01 19:06     ` Ævar Arnfjörð Bjarmason
2010-06-02 11:47       ` Jakub Narebski [this message]
2010-05-30 20:54 ` [PATCH/RFC v2 6/6] gettext: Add a skeleton po/is.po Ævar Arnfjörð Bjarmason
2010-05-30 21:29   ` Jonathan Nieder
2010-05-30 21:39     ` Jonathan Nieder
2010-05-31 14:17     ` Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 0/7] Add internationalization support to Git Ævar Arnfjörð Bjarmason
2010-06-02  0:11   ` Ævar Arnfjörð Bjarmason
2010-06-02  1:05     ` [PATCH/RFC v4 " Ævar Arnfjörð Bjarmason
2010-06-02  1:05     ` [PATCH/RFC v4 1/7] Add infrastructure for translating Git with gettext Ævar Arnfjörð Bjarmason
2010-06-02  9:12       ` Peter Krefting
2010-06-02  9:29         ` Ævar Arnfjörð Bjarmason
2010-06-02 10:11           ` Peter Krefting
2010-06-02 10:56             ` Ævar Arnfjörð Bjarmason
2010-06-02 11:31               ` Peter Krefting
2010-06-02  1:05     ` [PATCH/RFC v4 2/7] gettext: Add a Gettext interface for shell scripts Ævar Arnfjörð Bjarmason
2010-06-02  1:06     ` [PATCH/RFC v4 3/7] gettext: Add a Gettext interface for Perl Ævar Arnfjörð Bjarmason
2010-06-02  1:06     ` [PATCH/RFC v4 4/7] Makefile: Don't install Gettext .mo files if NO_GETTEXT Ævar Arnfjörð Bjarmason
2010-06-02  1:06     ` [PATCH/RFC v4 5/7] Makefile: Override --keyword= for all languages Ævar Arnfjörð Bjarmason
2010-06-02  1:06     ` [PATCH/RFC v4 6/7] gettext: Sanity tests for Git's Gettext support Ævar Arnfjörð Bjarmason
2010-06-02  1:06     ` [PATCH/RFC v4 7/7] gettext: Add a skeleton po/is.po Ævar Arnfjörð Bjarmason
2010-06-02  6:32     ` [PATCH/RFC v3 0/7] Add internationalization support to Git Johannes Sixt
2010-06-02 22:33     ` [PATCH/RFC v5 0/2] Add infrastructure for translating Git with gettext Ævar Arnfjörð Bjarmason
2010-06-02 22:33     ` [PATCH/RFC v5 1/2] " Ævar Arnfjörð Bjarmason
2010-06-02 22:33     ` [PATCH/RFC v5 2/2] Add initial C, Shell and Perl gettext translations Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 1/7] Add infrastructure for translating Git with gettext Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 2/7] gettext: Add a Gettext interface for shell scripts Ævar Arnfjörð Bjarmason
2010-06-02  6:32   ` Johannes Sixt
2010-06-02  8:53     ` Ævar Arnfjörð Bjarmason
2010-06-02  9:38       ` Johannes Sixt
2010-06-01 23:39 ` [PATCH/RFC v3 3/7] gettext: Add a Gettext interface for Perl Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 4/7] Makefile: Don't install Gettext .mo files if NO_GETTEXT Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 5/7] Makefile: Override --keyword= for all languages Ævar Arnfjörð Bjarmason
2010-06-01 23:39 ` [PATCH/RFC v3 6/7] gettext: Basic sanity tests for Git's Gettext support Ævar Arnfjörð Bjarmason
2010-06-02  6:32   ` Johannes Sixt
2010-06-01 23:39 ` [PATCH/RFC v3 7/7] gettext: Add a skeleton po/is.po Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201006021347.51960.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jepler@unpythonic.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).