From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] RFC: Add posix_spawn_file_actions_closefrom
Date: Mon, 27 May 2019 18:02:01 -0300 [thread overview]
Message-ID: <54a51a36-6922-b367-fbc4-037796c2d3c2@linaro.org> (raw)
In-Reply-To: <11bd519f-4306-bca8-4c31-45b8c207b9e5@linaro.org>
On 24/05/2019 11:55, Adhemerval Zanella wrote:
>
>
> On 24/05/2019 11:37, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>>> The test doesn't exercise the gaps case.
>>>
>>> Do you mean gaps in file descriptor initial set before posix_spawn?
>>
>> Yes, where the directory descriptor is in the middle of the closefrom
>> range.
>
> Ok, I will add a test to check for it.
>
>>
>>>>> +/* Close all file descriptors at or above FROM by iterating /proc/self/fd.
>>>>> + Any failure should */
>>>>> +static bool
>>>>> +spawn_closefrom (int from)
>>>>> +{
>>>>> + /* Increasing the buffer size results in fewer getdents syscalls from
>>>>> + readdir; however, it requires more stack to be allocated
>>>>> + in __spawnix. */
>>>>> + char buffer[sizeof (struct __dirstream) + 2 * sizeof (struct dirent)];
>>>>
>>>> We could allocate this on the heap, in the parent. Maybe we could
>>>> opendir in the parent, and play with the underlying descriptor in the
>>>> child? Then you wouldn't need to add __opendir_inplace at all. Given
>>>> that we know what our implementation looks like, this should be fairly
>>>> safe.
>>>
>>> I don't have a strong opinion here; it would add some complexity to the
>>> parent helper, which would need to traverse all file actions, call opendir,
>>> and deallocate after the helper process returns. My idea is to keep the
>>> required logic in one place, so it's more obvious where things are initiated.
>>
>> You could turn one of the padding elements in posix_spawn_file_actions_t
>> into a flag and have posix_spawn_file_actions_addclosefrom_np set the
>> flag. Then the second iteration isn't necessary.
Another possible advantage of using an in-place buffer for opendir is
that we can tune memory usage based on the expected filename entries.
For /proc/self/fd the names are expected to be at most
sizeof (int) * 3 + 1 bytes, so with:

  enum
  {
    dirent_base_size = offsetof (struct dirent, d_name),
    d_name_max_length = sizeof (int) * 3 + 1,
    dirent_max_size = ALIGN_UP (dirent_base_size + d_name_max_length,
                                sizeof (long))
  };

  char buffer[sizeof (struct __dirstream) + 10 * dirent_max_size];

we can obtain 10 entries for each getdents call at a cost of only about
432 bytes of stack for spawn_closefrom, instead of allocating 4 * BUFSIZ
or BUFSIZ (32768 bytes in most cases).
>>
>>>>> + DIR *dp;
>>>>> + if ((dp = __opendir_inplace ("/proc/self/fd", buffer, sizeof buffer))
>>>>> + == NULL)
>>>>> + return false;
>>>>
>>>> This could check for ENFILE/EMFILE/ENOMEM and try closing descriptors
>>>> directly in case of that error, to make room for the new descriptor.
>>>> But perhaps that's not worth the complexity.
>>>
>>> Hum, this could be an enhancement indeed. However the main issue is
>>> to find which is the lower opened file descriptor greater than FROM
>>> without polling /proc/self/fd or by using close with random file
>>> descriptors.
>>
>> You can do close (from), close (from + 1), etc., up to a certain limit,
>> and retry if one of the close calls doesn't return EBADF. The magic
>> limit is needed in case the closefrom does not overlap with any file
>> descriptors.
>
> Yeah, those are exactly the random close calls I would like to avoid. But
> I also don't see a better option.
>
>>
>>>>> + {
>>>>> + if (dirp->d_name[0] == '.')
>>>>> + continue;
>>>>> +
>>>>> + char *endptr;
>>>>> + long int fd = strtol (dirp->d_name, &endptr, 10);
>>>>> + if (*endptr != '\0' || fd < 0 || fd > INT_MAX)
>>>>> + {
>>>>> + ret = false;
>>>>> + break;
>>>>> + }
>>>>> +
>>>>> + if (fd == dirfd (dp) || fd < from)
>>>>> + continue;
>>>>> +
>>>>> + __close (fd);
>>>>> + }
>>>>> + __closedir (dp);
>>>>> +
>>>>> + return ret;
>>>>> +}
>>>>
>>>> I'm not sure if this is entirely correct. If we close some descriptors,
>>>> and then readdir calls getdents64, what will the kernel return? Will
>>>> there be a gap in the descriptor list? (Curiously, it's the same issue
>>>> we have with the fork handler list. 8-)
>>>
>>> It does not seem to be the case in my experiments. I hacked opendir to
>>> allocate the minimum workable buffer (a __dirstream plus a
>>> struct dirent, about 40 bytes on x86_64) to force each readdir to
>>> call getdents. A simple testcase shows:
>>
>> It still looks very implementation-defined to me. proc_readfd_common
>> does this:
>>
>> for (fd = ctx->pos - 2;
>>      fd < files_fdtable(files)->max_fds;
>>      fd++, ctx->pos++) {
>>
>> And I think ctx->pos somehow corresponds to d_off. But I don't see a
>> 1:1 correspondence between descriptors and offsets. I wonder whether
>> the single-entry case is indeed the worst-possible test case for this.
>>
>
> I will check with different permutations by changing the opendir buffer
> and a file descriptor set with different set of gaps.
I checked permutations of 3 tests:

- Issue 10 open calls and call closefrom with the minimum file descriptor
  value;
- Issue 10 open calls, close the first one, and call closefrom;
- Issue 10 open calls, close the first and last one, and call closefrom.

Each permutation ran with a getdents buffer ranging from
sizeof (struct __dirstream) plus
ALIGN_UP (offsetof (struct dirent, d_name) + sizeof (int) * 3 + 1,
sizeof (long)) up to sizeof (struct __dirstream) plus 10 times that
ALIGN_UP value. The idea is to vary the maximum number of entries
getdents will return per call from 1 to 10 (the number of files under
test).

I saw no issue: the opendir iteration closed all the files without any
leakage (I can send you the testcase so you can check whether I got
something wrong). And I think this is the expected behaviour: if we get
all the file descriptor information in a single getdents call, holes do
not really matter. If holes do exist, we have two scenarios: a hole
within the obtained buffer, or a hole in a subsequent call. The former
is handled as in the single-call case, while for the latter the kernel
removes the closed entries from the next getdents call (as I am seeing
in my experiments).