bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
From: "Dmitry V. Levin" <ldv@altlinux.org>
To: Bruno Haible <bruno@clisp.org>
Cc: bug-gnulib@gnu.org
Subject: Re: [PATCH] tests: fix test-execute with GNU make jobserver
Date: Thu, 8 Apr 2021 01:24:40 +0300	[thread overview]
Message-ID: <20210407222440.GA1491@altlinux.org> (raw)
In-Reply-To: <2966714.KBtFPX2tBE@omega>

Hi Bruno,

On Wed, Apr 07, 2021 at 11:30:47PM +0200, Bruno Haible wrote:
> Hi Dmitry,
> 
> Thanks for the proposed patch.
> 
> > On POSIX systems the GNU make jobserver is implemented as a pipe,
> > and these two unexpected descriptors make test-execute-child fail.
> 
> This problem description is a bit technical. It took me a bit of work
> to translate your description into a "how to reproduce". Namely,
> 
>   $ make -j 8 check

Just "make -j2 check", in fact, any parallelism would do, but it's not a
race, it's not due to any parallelized execution, it's a side effect of
the GNU make jobserver being enabled.

> produces this output:
> 
>   Test case 14: 0 1 2 3 4 10 
>   test-execute-main.c:295: assertion 'ret == 0' failed
>   Aborted (core dumped)
>   test-execute.sh: test case 14 failed
>   Test case 15: 0 1 2 3 4 10 
>   test-execute-main.c:305: assertion 'ret == 0' failed
>   Aborted (core dumped)
>   test-execute.sh: test case 15 failed
>   Test case 16: 0 1 2 3 4 
>   test-execute-main.c:315: assertion 'ret == 0' failed
>   Aborted (core dumped)
>   test-execute.sh: test case 16 failed
> 
> Now, there are several questions:
> 
> 1) Why is GNU make passing these descriptors 3 and 4 to the child
> process? The Makefile rule looks like this:
> 
> test-execute.sh.log: test-execute.sh
>         @p='test-execute.sh'; \
>         b='test-execute.sh'; \
>         $(am__check_pre) $(LOG_DRIVER) --test-name "$$f" \
>         --log-file $$b.log --trs-file $$b.trs \
>         $(am__common_driver_flags) $(AM_LOG_DRIVER_FLAGS) $(LOG_DRIVER_FLAGS) -- $(LOG_COMPILE) \
>         "$$tst" $(AM_TESTS_FD_REDIRECT)
> 
> and it does not start with '+'. (Cf. [1].)
> 
> Oh, I see: $(am__check_pre) expands to something that contains
> $(TESTS_ENVIRONMENT), and $(TESTS_ENVIRONMENT) contains
>   MAKE='$(MAKE)'
> and that occurrence of $(MAKE) makes 'make' think that the rule may
> execute a sub-make.
> 
> 2) You suggest to deal with the file descriptors 3..19. Is this reliable?

Yes, it is reliable as it mirrors
	for (fd = 0; fd < 20; fd++)
piece of code in tests/test-execute-child.c file.

> The MAKEFLAGS variable in the environment is defined as
>   MAKEFLAGS="w -j8 --jobserver-auth=3,4 -- ..."
> So we could parse this string and look for the --jobserver-auth
> option.
> 
> Oh, wait: this option was named differently in older GNU make versions.[2]

Since tests/test-execute-child.c explicitly checks for any unexpected
descriptors, I think the most appropriate thing to do is to make sure
that no unexpected descriptors are inherited by the child.

The jobserver was the thing that triggered the test error, but there could
be other external sources of inherited descriptors, so the test should
rather be robust and make sure these external sources do not affect the
test.

> 3) You suggest to set the cloexec bit on these file descriptors, so that
> test-execute-child will have them closed. However, it is simpler to just
> close() them in test-execute-parent already, since test-execute-parent
> does not need these file descriptors either.

Yes, this is another option, in fact, it's more reliable on GNU/Linux:
The test currently relies on the fact that descriptor #10 is not
inherited, so if this is not the case, the test will fail.
If, however, all descriptors in the range are closed, the test succeeds.

I'd choose a simple close() rather than a bit more complex cloexec if
tests/test-execute-child.c did not contain this uncommented hunk:
        for (fd = 0; fd < 20; fd++)
          #ifdef __NetBSD__
          if (fd != 3)
          #endif

If for some reason __NetBSD__ is allowed to have descriptor #3 inherited,
then it shouldn't be closed in tests/test-execute-main.c either.

Bruno, while we are here, could you shed some light on this __NetBSD__
exception, ideally as a comment in tests/test-execute-child.c, please?

Maybe the reason is not strong enough to avoid closing this descriptor
at all costs.

> 4) How about other environments? I saw 'test-execute.sh' also fail in
> some recent Linux distro with KDE desktop. Apparently 'plasma' is passing
> one or two open file descriptors to all children (that is, to all processes
> running in this desktop environment). Do you think this is worth reporting?

If it isn't documented, then I'd say it's a bug, please report it.

> 5) Will glibc at some point export the close_range() function [3]?

Yes, glibc will provide close_range and closefrom functions sooner or later,
the patches are being reviewed:
https://sourceware.org/pipermail/libc-alpha/2021-March/123532.html
https://sourceware.org/pipermail/libc-alpha/2021-March/123535.html
If the review goes smoothly, they are likely to be available in glibc 2.34.

> If so, Gnulib should probably provide a substitute close_range(), and
> test-execute-main.c should better call this function, rather than rolling
> its own loop.

Yes, I think it makes sense to provide this API in gnulib, although it
won't be able to address all issues the syscall is intended to address.

I don't think, however, that the test-execute fix should wait for this
to happen.


-- 
ldv


  reply	other threads:[~2021-04-07 22:24 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-07  2:15 [PATCH] tests: fix test-execute with GNU make jobserver Dmitry V. Levin
2021-04-07 21:30 ` Bruno Haible
2021-04-07 22:24   ` Dmitry V. Levin [this message]
2021-04-07 23:44     ` Bruno Haible
2021-04-08  0:00       ` Dmitry V. Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.gnu.org/mailman/listinfo/bug-gnulib

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210407222440.GA1491@altlinux.org \
    --to=ldv@altlinux.org \
    --cc=bruno@clisp.org \
    --cc=bug-gnulib@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).