From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] RFC: Add posix_spawn_file_actions_closefrom
Date: Mon, 27 May 2019 18:02:01 -0300
Message-ID: <54a51a36-6922-b367-fbc4-037796c2d3c2@linaro.org>
In-Reply-To: <11bd519f-4306-bca8-4c31-45b8c207b9e5@linaro.org>



On 24/05/2019 11:55, Adhemerval Zanella wrote:
> 
> 
> On 24/05/2019 11:37, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>>> The test doesn't exercise the gaps case.
>>>
>>> Do you mean gaps in file descriptor initial set before posix_spawn?
>>
>> Yes, where the directory descriptor is in the middle of the closefrom
>> range.
> 
> Ok, I would add a test to check for it.
> 
>>
>>>>> +/* Close all file descriptor up to FROM by interacting /proc/self/fd.
>>>>> +   Any failure should */
>>>>> +static bool
>>>>> +spawn_closefrom (int from)
>>>>> +{
>>>>> +  /* Increasing the buffer size incurs in less getdents syscalls from
>>>>> +     readdir, however it would require more stack size to be allocated
>>>>> +     on __spawnix.  */
>>>>> +  char buffer[sizeof (struct __dirstream) + 2 * sizeof (struct dirent)];
>>>>
>>>> We could allocate this on the heap, in the parent.  Maybe we could
>>>> opendir in the parent, and play with the underlying descriptor in the
>>>> child?  Then you wouldn't need to add __opendir_inplace at all.  Given
>>>> that we know what our implementation looks like, this should be fairly
>>>> safe.
>>>
>>> I don't have a strong opinion here; it would add some complexity to the
>>> parent helper, which would need to traverse all file actions, call opendir,
>>> and deallocate after the helper process returns.  My idea is to keep the
>>> required logic more in place, so it's more obvious where things are
>>> initiated.
>>
>> You could turn one of the padding elements in posix_spawn_file_actions_t
>> into a flag and have posix_spawn_file_actions_addclosefrom_np set the
>> flag.  Then the second iteration isn't necessary.
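
A rough sketch of that idea, following the shape of the existing
posix_spawn_file_actions_addclose and assuming the spawn_do_closefrom tag
and closefrom_action member from the patch (repurposing __pad[0] of
posix_spawn_file_actions_t as the flag is only illustrative):

  int
  posix_spawn_file_actions_addclosefrom_np (posix_spawn_file_actions_t
                                              *file_actions, int from)
  {
    struct __spawn_action *rec;

    if (from < 0)
      return EBADF;

    /* Allocate more memory if needed.  */
    if (file_actions->__used == file_actions->__allocated
        && __posix_spawn_file_actions_realloc (file_actions) != 0)
      return ENOMEM;

    /* Add the new action.  */
    rec = &file_actions->__actions[file_actions->__used];
    rec->tag = spawn_do_closefrom;
    rec->action.closefrom_action.from = from;
    ++file_actions->__used;

    /* Record that a closefrom action is present so __spawni can prepare
       for it without a second pass over the action list.  */
    file_actions->__pad[0] = 1;

    return 0;
  }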

Another possible advantage of using an in-place buffer for opendir is that
we can tune memory usage based on the expected filename entries.  For
/proc/self/fd the names are expected to be at most sizeof (int) * 3 + 1
bytes long, so with:

  enum {
    dirent_base_size  = offsetof (struct dirent, d_name),
    d_name_max_length = sizeof (int) * 3 + 1,
    dirent_max_size   = ALIGN_UP (dirent_base_size + d_name_max_length,
                                  sizeof (long))
  };
  char buffer[sizeof (struct __dirstream) + 10 * dirent_max_size];

We can obtain up to 10 entries per getdents call at a cost of only about
432 bytes of stack for spawn_closefrom, instead of the 4*BUFSIZ or BUFSIZ
allocation opendir does by default (32768 bytes in most cases).
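
A rough sketch of how spawn_closefrom would put this together (assuming the
proposed __opendir_inplace helper, glibc's internal ALIGN_UP macro, and the
dirent_max_size bound from above; internal includes omitted):

  static bool
  spawn_closefrom (int from)
  {
    /* Room for the DIR state plus about 10 entries per getdents call,
       keeping the stack usage of __spawnix small.  */
    char buffer[sizeof (struct __dirstream) + 10 * dirent_max_size];

    DIR *dp = __opendir_inplace ("/proc/self/fd", buffer, sizeof buffer);
    if (dp == NULL)
      return false;

    bool ret = true;
    struct dirent *dirp;
    while ((dirp = __readdir (dp)) != NULL)
      {
        if (dirp->d_name[0] == '.')
          continue;

        char *endptr;
        long int fd = strtol (dirp->d_name, &endptr, 10);
        if (*endptr != '\0' || fd < 0 || fd > INT_MAX)
          {
            ret = false;
            break;
          }

        /* Skip the descriptor backing the directory stream itself and
           anything below the requested lower bound.  */
        if (fd == dirfd (dp) || fd < from)
          continue;

        __close (fd);
      }
    __closedir (dp);

    return ret;
  }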

>>
>>>>> +  DIR *dp;
>>>>> +  if ((dp = __opendir_inplace ("/proc/self/fd", buffer, sizeof buffer))
>>>>> +      == NULL)
>>>>> +    return false;
>>>>
>>>> This could check for ENFILE/EMFILE/ENOMEM and try closing descriptors
>>>> directly in case of that error, to make room for the new descriptor.
>>>> But perhaps that's not worth the complexity.
>>>
>>> Hum, this could be an enhancement indeed.  However, the main issue is
>>> how to find the lowest open file descriptor greater than FROM without
>>> polling /proc/self/fd or calling close on arbitrary file descriptors.
>>
>> You can do close (from), close (from + 1), etc., up to a certain limit,
>> and retry if one of the close calls doesn't return EBADF.  The magic
>> limit is needed in case the closefrom does not overlap with any file
>> descriptors.
> 
> Yeah, these are exactly the random close calls I would like to avoid.  But
> I also don't see a better option.
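
For reference, a rough sketch of that fallback probing (the close_probe
name and its limit are only illustrative, not part of the patch):

  /* Fallback when __opendir_inplace fails with ENFILE, EMFILE, or ENOMEM:
     close descriptors starting at FROM up to an arbitrary limit to free a
     slot, and report whether anything was actually closed so the caller
     can retry opendir once.  */
  static bool
  close_probe (int from)
  {
    enum { close_probe_limit = 32 };

    bool closed_any = false;
    for (int fd = from; fd < from + close_probe_limit; fd++)
      if (__close (fd) == 0)
        closed_any = true;

    return closed_any;
  }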
> 
>>
>>>>> +    {
>>>>> +      if (dirp->d_name[0] == '.')
>>>>> +        continue;
>>>>> +
>>>>> +      char *endptr;
>>>>> +      long int fd = strtol (dirp->d_name, &endptr, 10);
>>>>> +      if (*endptr != '\0' || fd < 0 || fd > INT_MAX)
>>>>> +	{
>>>>> +	  ret = false;
>>>>> +	  break;
>>>>> +	}
>>>>> +
>>>>> +      if (fd == dirfd (dp) || fd < from)
>>>>> +        continue;
>>>>> +
>>>>> +      __close (fd);
>>>>> +    }
>>>>> +  __closedir (dp);
>>>>> +
>>>>> +  return ret;
>>>>> +}
>>>>
>>>> I'm not sure if this is entirely correct.  If we close some descriptors,
>>>> and then readdir calls getdents64, what will the kernel return?  Will
>>>> there be a gap in the descriptor list?  (Curiously, it's the same issue
>>>> we have with the fork handler list. 8-)
>>>
>>> It does not seem to be the case in my experiments.  I hacked opendir to
>>> allocate the minimum workable buffer (__dirstream plus a
>>> struct dirent, about 40 bytes on x86_64) to force each readdir to
>>> call getdents.  A simple testcase shows:
>>
>> It still looks very implementation-defined to me.  proc_readfd_common
>> does this:
>>
>> 	for (fd = ctx->pos - 2;
>> 	     fd < files_fdtable(files)->max_fds;
>> 	     fd++, ctx->pos++) {
>>
>> And I think ctx->pos somehow corresponds to d_off.  But I don't see a
>> 1:1 correspondence between descriptors and offsets.  I wonder whether
>> the single-entry case is indeed the worst-possible test case for this.
>>
> 
> I will check different permutations by changing the opendir buffer size
> and using file descriptor sets with different gaps.

I checked permutations of 3 tests:

  - Issue 10 open calls and call closefrom with the minimum file descriptor
    value;
  - Issue 10 open calls, close the first one, and call closefrom;
  - Issue 10 open calls, close the first and last ones, and call closefrom.

The getdents buffer ranges from just sizeof (struct __dirstream) plus one
ALIGN_UP (offsetof (struct dirent, d_name) + sizeof (int) * 3 + 1,
sizeof (long)) entry up to sizeof (struct __dirstream) plus 10 times that
value.  The idea is to vary the maximum number of entries getdents can
return per call from 1 to 10 (the number of files in the test).

I saw no issues; the opendir iteration closed all the files without any
leakage (I can send you the testcase so you can check whether I got
something wrong).
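
A condensed sketch of that testcase (spawn_closefrom stands for the proposed
helper, exercised directly for brevity; the variation of the getdents buffer
size is omitted and the descriptor bookkeeping is only illustrative):

  #include <fcntl.h>
  #include <stdbool.h>
  #include <stdlib.h>
  #include <unistd.h>

  enum { nfds = 10 };

  static void
  do_test_closefrom (bool close_first, bool close_last)
  {
    int fds[nfds];
    for (int i = 0; i < nfds; i++)
      fds[i] = open ("/dev/null", O_RDONLY);

    /* Create the gaps described above, marking the slots we closed.  */
    if (close_first)
      {
        close (fds[0]);
        fds[0] = -1;
      }
    if (close_last)
      {
        close (fds[nfds - 1]);
        fds[nfds - 1] = -1;
      }

    /* Close everything from the lowest test descriptor still open.  */
    int lowfd = fds[0] != -1 ? fds[0] : fds[1];
    spawn_closefrom (lowfd);

    /* Every descriptor at or above LOWFD must now be closed, and nothing
       below it may have been touched.  */
    for (int i = 0; i < nfds; i++)
      {
        if (fds[i] == -1)
          continue;
        bool is_open = fcntl (fds[i], F_GETFD) != -1;
        if (is_open != (fds[i] < lowfd))
          abort ();
      }
  }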

And I think this is the expected behaviour: if we get all the file
descriptor information with a single getdents call, holes do not really
matter.  If holes do exist there are two scenarios: a hole within the
already obtained buffer or a hole affecting a subsequent call.  The former
is handled just like the single-call case, while for the latter the kernel
removes the closed entry from the next getdents call (as I am seeing in my
experiments).
