bug-gnulib@gnu.org mirror (unofficial)
* GNU gnulib: calling for beta-testers
@ 2024-04-21 10:52 Bruno Haible
  2024-04-21 11:52 ` Vivien Kraus
  2024-04-22  7:56 ` Paul Eggert
  0 siblings, 2 replies; 10+ messages in thread
From: Bruno Haible @ 2024-04-21 10:52 UTC (permalink / raw)
  To: bug-gnulib

If you are a developer on a package that uses GNU gnulib as part of its build
system:

gnulib-tool has been known for being slow for many years. We have listened to
your complaints. A rewrite of gnulib-tool in another programming language
(Python) is ready for beta-testing. It is between 8 times and 100 times faster
than the original gnulib-tool.

Both implementations should behave identically, that is, produce the same
generated files and the same output. You can help us ensure this, through the
following steps:

1. Make sure you have Python (version 3.7 or newer) installed on your
machine.

2. Update your gnulib checkout. (For some packages, it comes as a git
submodule named 'gnulib'.) Like this:

  $ git checkout master
  $ git pull

     Set the environment variable GNULIB_SRCDIR, pointing to this checkout.

     If the package is using a git submodule named 'gnulib', it is also
advisable to do

  $ git commit -m 'build: Update gnulib submodule to latest.' gnulib

     (as a preparation for step 5, because the --no-git option does not work
as expected in all variants of 'bootstrap').

3. Set an environment variable that enables checking that the two
implementations behave the same:

  $ export GNULIB_TOOL_IMPL=sh+py


4. Clean the built files of your package:

  $ make -k distclean


5. Regenerate the fetched and generated files of your package. Depending on
the package, this may be a command such as

  $ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR

     or

  $ export GNULIB_SRCDIR; ./autopull.sh; ./autogen.sh

     or, if no such script is available:

  $ $GNULIB_SRCDIR/gnulib-tool --update

     If there is a failure, due to differences between the 'sh' and 'py'
results, please report it to <bug-gnulib@gnu.org>.

6. If this invocation was successful, you can trust the rewritten gnulib-tool
and use it from now on, by setting the environment variable

  $ export GNULIB_TOOL_IMPL=py


7. Continue with

  $ ./configure
  $ make

     as usual.
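
A minimal sketch of how steps 1 and 3 could be combined in a wrapper script, assuming only the values of GNULIB_TOOL_IMPL that appear in this thread (sh, py, sh+py). The function name choose_gnulib_tool_impl is hypothetical and not part of gnulib; it merely checks for Python 3.7 or newer before asking for the comparison mode.

```shell
#!/bin/sh
# Hypothetical helper (not part of gnulib): pick a value for
# GNULIB_TOOL_IMPL depending on whether a suitable python3 is on PATH.
choose_gnulib_tool_impl () {
  if command -v python3 >/dev/null 2>&1 &&
     python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 7) else 1)' 2>/dev/null
  then
    # Python 3.7 or newer is present: run both implementations and compare.
    echo 'sh+py'
  else
    # No suitable Python: only the shell implementation can run.
    echo 'sh'
  fi
}

GNULIB_TOOL_IMPL=$(choose_gnulib_tool_impl)
export GNULIB_TOOL_IMPL
echo "GNULIB_TOOL_IMPL=$GNULIB_TOOL_IMPL"
```

Falling back to 'sh' keeps the bootstrap usable on machines without Python, at the cost of skipping the comparison there.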

And enjoy the speed! The rewritten gnulib-tool was implemented by Dmitry
Selyutin, Collin Funk, and me.







* Re: GNU gnulib: calling for beta-testers
  2024-04-21 10:52 Bruno Haible
@ 2024-04-21 11:52 ` Vivien Kraus
  2024-04-22  7:56 ` Paul Eggert
  1 sibling, 0 replies; 10+ messages in thread
From: Vivien Kraus @ 2024-04-21 11:52 UTC (permalink / raw)
  To: bug-gnulib

Dear Gnulib developers,

On Sunday, 21 April 2024 at 06:52 -0400, Bruno Haible wrote:
> If you are developer on a package that uses GNU gnulib as part of its
> build
> system:

I have a very simple personal project using gnulib.

> 1. Make sure you have Python (version 3.7 or newer) installed on your
> machine.
> 
> 2. Update your gnulib checkout. (For some packages, it comes as a git
> submodule named 'gnulib'.)
> 
> 3. Set an environment variable that enables checking that the two
> implementations behave the same:
> 
>   $ export GNULIB_TOOL_IMPL=sh+py
> 
> 
> 4. Clean the built files of your package
> 
> 5. Regenerate the fetched and generated files of your package.
> Depending on
> the package, this may be a command such as
> 
>   $ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR
> 
>      If there is a failure, due to differences between the 'sh' and
> 'py'
> results, please report it to <bug-gnulib@gnu.org>.

There are no failures.

> The rewritten gnulib-tool was implemented by Dmitry
> Selyutin, Collin Funk, and me.

You did good work, thank you.

Best regards,

Vivien



* Re: GNU gnulib: calling for beta-testers
       [not found] ` <7ef75a77-ec33-43e0-8e57-8960b09ccd5a@akhlaghi.org>
@ 2024-04-21 22:27   ` Bruno Haible
  2024-04-22  7:16     ` Paul Eggert
  0 siblings, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2024-04-21 22:27 UTC (permalink / raw)
  To: Mohammad Akhlaghi; +Cc: bug-gnulib

[CCing bug-gnulib]
Mohammad Akhlaghi wrote:
> Dear Bruno,
> 
> Thanks for sharing the good news about the speed improvement. Gnuastro 
> uses Gnulib and it has been very valuable :-).
> 
> I had two questions:
> 
> 1. Will the shell version of Gnulib-tool continue to be the main version 
> for Gnulib?
> 
> 2. Why did you choose to do this in a high-level and ever-changing 
> language like Python? If the shell version of gnulib-tool is deprecated, 
> this will add Python as a bootstrapping dependency of all the projects 
> that use Gnulib (which is not good, because Python packages usually have 
> MANY dependencies and can complicate the build environment by 
> conflicting in their versions with other packages, virtual-env or Conda 
> also add other complications).
> 
> Cheers,
> Mohammad

Re 1: This has been discussed just yesterday [1][2].

Re 2: I disagree with the negative connotation in the "ever-changing language"
qualification. When the rewrite was started in 2012, we were targeting
Python 3.0 to 3.2. Now we are targeting Python 3.7 to 3.12, and we did not
have to touch _a single line of code_ due to obsolete or incompatible
syntax or features.

ISO C, btw, is also "ever-changing", and in practice, it causes more problems
than that. (For example, when clang 17 dropped the support for C 89 syntax
in favour of C 23.)

Why did we choose Python? That was a longer thought process. First we had an
initial discussion [3][4][5][6], which convinced me that Python is the best
choice. Then in 2012 I did a final verification of trends ([7]) that showed
that Python's popularity was not likely to go away soon.

Regarding bootstrapping dependency: See [8].

Regarding Python packages: gnulib-tool does not use Python packages. It
works with a Python compiled from source from the original tarball, with
no add-on packages.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00331.html
[2] https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00334.html
[3] https://lists.gnu.org/archive/html/bug-gnulib/2009-01/msg00034.html
[4] https://lists.gnu.org/archive/html/bug-gnulib/2009-01/msg00036.html
[5] https://lists.gnu.org/archive/html/bug-gnulib/2009-01/msg00037.html
[6] https://lists.gnu.org/archive/html/bug-gnulib/2009-01/msg00059.html
[7] http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
[8] https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00351.html






* Re: GNU gnulib: calling for beta-testers
  2024-04-21 22:27   ` GNU gnulib: calling for beta-testers Bruno Haible
@ 2024-04-22  7:16     ` Paul Eggert
  2024-04-22  8:17       ` Mohammad Akhlaghi
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Eggert @ 2024-04-22  7:16 UTC (permalink / raw)
  To: Bruno Haible, Mohammad Akhlaghi; +Cc: bug-gnulib

On 2024-04-21 15:27, Bruno Haible wrote:
> ISO C, btw, is also "ever-changing"

Yes, plus if a language is not "ever-changing" it's dead.

The main thing to worry about is when old code stops working, not when 
new language features get added. Since Python 3 came out, Python has been 
reasonably good about dealing with deprecated features.



* Re: GNU gnulib: calling for beta-testers
  2024-04-21 10:52 Bruno Haible
  2024-04-21 11:52 ` Vivien Kraus
@ 2024-04-22  7:56 ` Paul Eggert
  2024-04-22  8:23   ` Collin Funk
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Eggert @ 2024-04-22  7:56 UTC (permalink / raw)
  To: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 958 bytes --]

On 2024-04-21 03:52, Bruno Haible wrote:

> 5. Regenerate the fetched and generated files of your package. Depending on
> the package, this may be a command such as
> 
>    $ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR

I had a failure with this step when using current GNU diffutils 
(3d1a56b906c31cc6e89f6a9c008ba54d734d4ec2, which has a gnulib submodule 
with Gnulib commit 99ce3a004a2974c71f510f5df5bc6be7e2811d30) with 
current Gnulib (5b6e410e04b48c0fd62e954fafa220ef301d2c70) and building 
on Ubuntu 23.10 x86-64. Build log attached. To reproduce, clone 
diffutils and then:

   export GNULIB_TOOL_IMPL=sh+py
   ./bootstrap
   ./configure
   make -k distclean
   git submodule foreach git pull origin master
   git commit -m 'build: update gnulib submodule to latest' gnulib
   ./bootstrap --no-git --gnulib-srcdir=gnulib

The problem is that the Python-based build leaves behind a __pycache__ 
directory, which causes the comparison to fail.

[-- Attachment #2: diffutils-log.txt.gz --]
[-- Type: application/gzip, Size: 50952 bytes --]


* Re: GNU gnulib: calling for beta-testers
  2024-04-22  7:16     ` Paul Eggert
@ 2024-04-22  8:17       ` Mohammad Akhlaghi
  0 siblings, 0 replies; 10+ messages in thread
From: Mohammad Akhlaghi @ 2024-04-22  8:17 UTC (permalink / raw)
  To: Paul Eggert, Bruno Haible; +Cc: bug-gnulib

Thank you very much Bruno and Paul,

I will look into the links.

It is great that gnulib-tool does not use Python packages, but only the 
core of Python from its own tarball :-).

Cheers,
Mohammad



* Re: GNU gnulib: calling for beta-testers
  2024-04-22  7:56 ` Paul Eggert
@ 2024-04-22  8:23   ` Collin Funk
  2024-04-22 11:22     ` Bruno Haible
  0 siblings, 1 reply; 10+ messages in thread
From: Collin Funk @ 2024-04-22  8:23 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

Hi Paul,

On 4/22/24 12:56 AM, Paul Eggert wrote:
>   export GNULIB_TOOL_IMPL=sh+py
>   ./bootstrap
>   ./configure
>   make -k distclean
>   git submodule foreach git pull origin master
>   git commit -m 'build: update gnulib submodule to latest' gnulib
>   ./bootstrap --no-git --gnulib-srcdir=gnulib
> 
> The problem is that the Python-based build leaves behind a __pycache__ directory, which causes the comparison to fail.

I always noticed that directory in gnulib/pygnulib. I assumed
it was my LSP or something causing it...

Now looking into this, I think Python creates it upon executing a
script and/or doing 'import module-name'.

It looks like it can be turned off with 'python3 -B' or setting the
PYTHONDONTWRITEBYTECODE environment variable to a non-empty string [1]
[2].
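
The hypothesis above can be checked with a small experiment. This is only a sketch: demo_mod.py and the try_import function are made-up names, and the script assumes a python3 on PATH. It imports a throwaway module three ways and reports whether a __pycache__ directory appeared next to it.

```shell
#!/bin/sh
# Sketch: check whether importing a module leaves __pycache__ behind.
# demo_mod.py and try_import are made-up names for this demonstration.

# $1 = label; remaining arguments = the command that performs the import.
# Sets the global variable 'result' to yes or no.
try_import () {
  label=$1; shift
  dir=$(mktemp -d) || return 1
  echo 'VALUE = 1' > "$dir/demo_mod.py"
  # Drop inherited cache settings so that they do not influence the outcome.
  ( cd "$dir" && unset PYTHONDONTWRITEBYTECODE PYTHONPYCACHEPREFIX && "$@" )
  if test -d "$dir/__pycache__"; then result=yes; else result=no; fi
  echo "$label: __pycache__ created: $result"
  rm -rf "$dir"
}

if command -v python3 >/dev/null 2>&1; then
  try_import 'default' python3 -c 'import demo_mod'
  try_import 'python3 -B' python3 -B -c 'import demo_mod'
  try_import 'env var' env PYTHONDONTWRITEBYTECODE=1 python3 -c 'import demo_mod'
else
  echo 'python3 not found; nothing to demonstrate'
fi
```

The last two runs should report that no __pycache__ directory was created, since both the -B option and the environment variable tell Python not to write bytecode files on import.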

Since I always used a separate gnulib clone that wasn't in a
subdirectory (data caps unfortunately), I never ran into this issue.

Time for me to test my hypothesis and hope I didn't speak too soon. :)

[1] https://docs.python.org/3/using/cmdline.html#cmdoption-B
[2] https://docs.python.org/3/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE

Collin



* Re: GNU gnulib: calling for beta-testers
  2024-04-22  8:23   ` Collin Funk
@ 2024-04-22 11:22     ` Bruno Haible
  2024-04-22 20:00       ` Collin Funk
  0 siblings, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2024-04-22 11:22 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib; +Cc: Collin Funk

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Thanks for the report, Paul.
Thanks for the preliminary investigation, Collin.

> >   ./bootstrap
> >   ./configure
> >   make -k distclean
> >   git submodule foreach git pull origin master
> >   git commit -m 'build: update gnulib submodule to latest' gnulib
> >   ./bootstrap --no-git --gnulib-srcdir=gnulib
> > 
> > The problem is that the Python-based build leaves behind a __pycache__ directory, which causes the comparison to fail.

I can reproduce the issue. It's because executing gnulib-tool.py creates
gnulib/pygnulib/__pycache__, while gnulib-tool.sh does not do so.

Two workarounds are possible. I'm committing both, since the first
workaround works only with Python ≥ 3.8.
  * Let Python create its cache not in gnulib/pygnulib/__pycache__,
    but instead in
    /tmp/gnulib-python-cache-$USER/<absolute_file_name>/gnulib/pygnulib/ .
  * Ignore the __pycache__ directory during the comparison.

The first workaround should fix trouble similar to what we regularly
see with 'autom4te.cache': Unnecessary difference while comparing source
trees, unnecessary "git status" noise. Clutter.
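
To illustrate the effect of the first workaround in isolation, here is a small sketch. The temporary directories and demo_mod.py are made up for the demonstration, and a python3 on PATH is assumed. With Python 3.8 or newer, the import should leave the source directory untouched and place the bytecode under the prefix instead; older versions ignore the variable.

```shell
#!/bin/sh
# Sketch: redirect Python's bytecode cache away from the source tree.
# The directories and demo_mod.py are made up for this demonstration.
in_tree=unknown
if command -v python3 >/dev/null 2>&1; then
  src=$(mktemp -d) || exit 1
  cache=$(mktemp -d) || exit 1
  echo 'VALUE = 1' > "$src/demo_mod.py"
  # Import the module with the cache prefix pointing outside "$src".
  ( cd "$src" && unset PYTHONDONTWRITEBYTECODE &&
    PYTHONPYCACHEPREFIX=$cache python3 -c 'import demo_mod' )
  if test -d "$src/__pycache__"; then in_tree=yes; else in_tree=no; fi
  echo "__pycache__ inside the source tree: $in_tree"
  echo "bytecode files under the prefix:"
  find "$cache" -name '*.pyc'
  rm -rf "$src" "$cache"
else
  echo 'python3 not found; nothing to demonstrate'
fi
```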


2024-04-22  Bruno Haible  <bruno@clisp.org>

	gnulib-tool: Fix trouble caused by Python's bytecode cache.
	Reported by Paul Eggert in
	<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.
	* gnulib-tool: In sh+py mode, ignore the __pycache__ directory during
	comparison.

2024-04-22  Bruno Haible  <bruno@clisp.org>

	gnulib-tool.py: Fix trouble caused by Python's bytecode cache.
	Reported by Paul Eggert in
	<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.
	* gnulib-tool.py: Set PYTHONPYCACHEPREFIX, so as to avoid creating a
	__pycache__ directory in the developer's gnulib checkout (only effective
	with Python ≥ 3.8).


[-- Attachment #2: 0001-gnulib-tool.py-Fix-trouble-caused-by-Python-s-byteco.patch --]
[-- Type: text/x-patch, Size: 1951 bytes --]

From eda62139d838f53e4953db26019e5a4b8b805847 Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Mon, 22 Apr 2024 13:11:05 +0200
Subject: [PATCH 1/2] gnulib-tool.py: Fix trouble caused by Python's bytecode
 cache.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reported by Paul Eggert in
<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.

* gnulib-tool.py: Set PYTHONPYCACHEPREFIX, so as to avoid creating a
__pycache__ directory in the developer's gnulib checkout (only effective
with Python ≥ 3.8).
---
 ChangeLog      | 9 +++++++++
 gnulib-tool.py | 6 ++++++
 2 files changed, 15 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index b3cef64936..4a272d326e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2024-04-22  Bruno Haible  <bruno@clisp.org>
+
+	gnulib-tool.py: Fix trouble caused by Python's bytecode cache.
+	Reported by Paul Eggert in
+	<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.
+	* gnulib-tool.py: Set PYTHONPYCACHEPREFIX, so as to avoid creating a
+	__pycache__ directory in the developer's gnulib checkout (only effective
+	with Python ≥ 3.8).
+
 2024-04-21  Collin Funk  <collin.funk1@gmail.com>
 
 	gnulib-tool.py: Make temporary directories recognizable.
diff --git a/gnulib-tool.py b/gnulib-tool.py
index cdcd316909..81537c272c 100755
--- a/gnulib-tool.py
+++ b/gnulib-tool.py
@@ -144,6 +144,12 @@
   func_fatal_error "python3 not found; try setting GNULIB_TOOL_IMPL=sh"
 fi
 
+# Tell Python to store the compiled bytecode outside the gnulib directory.
+if test -z "$PYTHONPYCACHEPREFIX"; then
+  PYTHONPYCACHEPREFIX="${TMPDIR-/tmp}/gnulib-python-cache-${USER-$LOGNAME}"
+  export PYTHONPYCACHEPREFIX
+fi
+
 profiler_args=
 # For profiling, cf. <https://docs.python.org/3/library/profile.html>.
 #profiler_args="-m cProfile -s tottime"
-- 
2.34.1


[-- Attachment #3: 0002-gnulib-tool-Fix-trouble-caused-by-Python-s-bytecode-.patch --]
[-- Type: text/x-patch, Size: 1609 bytes --]

From ab5390ae6d8db323420874d1c1334feb77af9cb1 Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Mon, 22 Apr 2024 13:12:35 +0200
Subject: [PATCH 2/2] gnulib-tool: Fix trouble caused by Python's bytecode
 cache.

Reported by Paul Eggert in
<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.

* gnulib-tool: In sh+py mode, ignore the __pycache__ directory during
comparison.
---
 ChangeLog   | 8 ++++++++
 gnulib-tool | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 4a272d326e..462823888d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-04-22  Bruno Haible  <bruno@clisp.org>
+
+	gnulib-tool: Fix trouble caused by Python's bytecode cache.
+	Reported by Paul Eggert in
+	<https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00367.html>.
+	* gnulib-tool: In sh+py mode, ignore the __pycache__ directory during
+	comparison.
+
 2024-04-22  Bruno Haible  <bruno@clisp.org>
 
 	gnulib-tool.py: Fix trouble caused by Python's bytecode cache.
diff --git a/gnulib-tool b/gnulib-tool
index 6d430e56e6..85b62883c6 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -199,7 +199,7 @@ case "$GNULIB_TOOL_IMPL" in
         else
           diff_options=
         fi
-        diff -r $diff_options -q . "$tmp" >/dev/null ||
+        diff -r $diff_options --exclude=__pycache__ -q . "$tmp" >/dev/null ||
           func_fatal_error "gnulib-tool.py produced different files than gnulib-tool.sh! Compare `pwd` and $tmp."
         # Compare the two outputs.
         diff -q "$tmp-sh-out" "$tmp-py-out" >/dev/null ||
-- 
2.34.1



* Re: GNU gnulib: calling for beta-testers
  2024-04-22 11:22     ` Bruno Haible
@ 2024-04-22 20:00       ` Collin Funk
  2024-04-22 20:56         ` Bruno Haible
  0 siblings, 1 reply; 10+ messages in thread
From: Collin Funk @ 2024-04-22 20:00 UTC (permalink / raw)
  To: Bruno Haible, Paul Eggert, bug-gnulib

On 4/22/24 4:22 AM, Bruno Haible wrote:
> The first workaround should fix trouble similar to what we regularly
> see with 'autom4te.cache': Unnecessary difference while comparing source
> trees, unnecessary "git status" noise. Clutter.

I don't think the Python stuff should clutter 'git status' at least.

$ cat pygnulib/.gitignore 
*.pyc

Unless Python creates other files in there.
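
One way to check this without touching a real checkout is a throwaway repository. The sketch below assumes git is installed; the bytecode file name is made up. It asks 'git check-ignore' whether a file inside __pycache__ is covered by the '*.pyc' pattern.

```shell
#!/bin/sh
# Sketch: does a '*.pyc' pattern in pygnulib/.gitignore cover files
# inside pygnulib/__pycache__?  The .pyc file name is made up.
ignored=unknown
if command -v git >/dev/null 2>&1; then
  repo=$(mktemp -d) || exit 1
  git init -q "$repo" 2>/dev/null
  mkdir -p "$repo/pygnulib/__pycache__"
  echo '*.pyc' > "$repo/pygnulib/.gitignore"
  : > "$repo/pygnulib/__pycache__/demo.cpython-312.pyc"
  # 'git check-ignore -q' exits with status 0 if the path is ignored.
  if ( cd "$repo" && git check-ignore -q pygnulib/__pycache__/demo.cpython-312.pyc )
  then ignored=yes; else ignored=no; fi
  echo "bytecode file ignored by git: $ignored"
  rm -rf "$repo"
else
  echo 'git not found; nothing to demonstrate'
fi
```

A pattern without a slash matches at any depth below the directory containing the .gitignore file, which is why the file inside __pycache__ is covered.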

Collin



* Re: GNU gnulib: calling for beta-testers
  2024-04-22 20:00       ` Collin Funk
@ 2024-04-22 20:56         ` Bruno Haible
  0 siblings, 0 replies; 10+ messages in thread
From: Bruno Haible @ 2024-04-22 20:56 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib, Collin Funk

Collin Funk wrote:
> > The first workaround should fix trouble similar to what we regularly
> > see with 'autom4te.cache': Unnecessary difference while comparing source
> > trees, unnecessary "git status" noise. Clutter.
> 
> I don't think the Python stuff should clutter 'git status' at least.
> 
> $ cat pygnulib/.gitignore 
> *.pyc

OK, good. So, it would not have produced unnecessary "git status" noise.
Still, it showed up during recursive diff. My first workaround fixes that.
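
The difference between the two tools can be seen in a small sketch: a recursive diff does not consult .gitignore, so a stray __pycache__ directory makes two otherwise identical trees compare as different unless it is excluded. The trees and file names below are made up, and the --exclude option assumes GNU diff or a compatible implementation.

```shell
#!/bin/sh
# Sketch: two throwaway trees that differ only by a __pycache__ directory.
a=$(mktemp -d) || exit 1
b=$(mktemp -d) || exit 1
echo 'same' > "$a/file.txt"
echo 'same' > "$b/file.txt"
mkdir "$a/__pycache__"
echo 'bytecode' > "$a/__pycache__/demo.pyc"

# Plain recursive comparison: notices the extra directory.
if diff -r -q "$a" "$b" >/dev/null; then plain=same; else plain=differ; fi
# Comparison that skips __pycache__, as in the second workaround.
if diff -r --exclude=__pycache__ -q "$a" "$b" >/dev/null
then filtered=same; else filtered=differ; fi

echo "plain recursive diff:       $plain"
echo "with --exclude=__pycache__: $filtered"
rm -rf "$a" "$b"
```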

Bruno




