From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS17314 8.43.84.0/22 X-Spam-Status: No, score=-5.1 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI,NICE_REPLY_A, RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 2A79A1F8C6 for ; Wed, 15 Sep 2021 16:30:01 +0000 (UTC) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BBA893858039 for ; Wed, 15 Sep 2021 16:29:59 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BBA893858039 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1631723399; bh=6x4EEaNr0dK65DI/ZV7TvJlvSqgaEcKT4V/BeGJeTpY=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=xSs0+XPTnxM8iHCrIkC3oQ8Oik1DK9QpzemMxD0WOB5zQk9QPwvRRAOkBF5a5OPP7 qifoing9WGXvYVTiFUvUXFdExzFIKYs74dRzSxfGApOS25W/4PREcCVIYHqM0oSEve zNgPaT0YRIgvAysSIvEjlNunSEwsRUucqCaGU89Q= Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by sourceware.org (Postfix) with ESMTPS id F392C3858403 for ; Wed, 15 Sep 2021 16:29:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F392C3858403 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: tonyk) with ESMTPSA id A42F91F43794 Message-ID: Date: Wed, 15 Sep 2021 13:29:30 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.0 Subject: Re: [PATCH 16/20] futex: Implement sys_futex_waitv() Content-Language: en-US To: Peter Zijlstra References: <20210915140710.596174479@infradead.org> <20210915141525.621568509@infradead.org> In-Reply-To: <20210915141525.621568509@infradead.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: =?utf-8?q?Andr=C3=A9_Almeida_via_Libc-alpha?= Reply-To: =?UTF-8?Q?Andr=c3=a9_Almeida?= Cc: dave@stgolabs.net, libc-alpha@sourceware.org, linux-api@vger.kernel.org, bigeasy@linutronix.de, linux-kernel@vger.kernel.org, rostedt@goodmis.org, mingo@redhat.com, mtk.manpages@gmail.com, dvhart@infradead.org, tglx@linutronix.de, kernel@collabora.com, krisman@collabora.com Errors-To: libc-alpha-bounces+e=80x24.org@sourceware.org Sender: "Libc-alpha" Às 11:07 de 15/09/21, Peter Zijlstra escreveu: > From: André Almeida > > Add support to wait on multiple futexes. This is the interface > implemented by this syscall: > > futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes, > unsigned int flags, struct timespec *timo) > > +/** > + * futex_wait_multiple_setup - Prepare to wait and enqueue multiple futexes > + * @vs: The futex list to wait on > + * @count: The size of the list > + * @awaken: Index of the last awoken futex, if any. Used to notify the > + * caller that it can return this index to userspace (return parameter) > + * > + * Prepare multiple futexes in a single step and enqueue them. This may fail if > + * the futex list is invalid or if any futex was already awoken. On success the > + * task is ready to interruptible sleep. > + * > + * Return: > + * - 1 - One of the futexes was awaken by another thread > + * - 0 - Success > + * - <0 - -EFAULT, -EWOULDBLOCK or -EINVAL > + */ > +static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *awaken) > +{ > + struct futex_hash_bucket *hb; > + bool retry = false; > + int ret, i; > + u32 uval; > + > + /* > + * Enqueuing multiple futexes is tricky, because we need to enqueue > + * each futex in the list before dealing with the next one to avoid > + * deadlocking on the hash bucket. But, before enqueuing, we need to > + * make sure that current->state is TASK_INTERRUPTIBLE, so we don't > + * absorb any awake events, which cannot be done before the > + * get_futex_key of the next key, because it calls get_user_pages, > + * which can sleep. Thus, we fetch the list of futexes keys in two > + * steps, by first pinning all the memory keys in the futex key, and > + * only then we read each key and queue the corresponding futex. > + * > + * Private futexes doesn't need to recalculate hash in retry, so skip > + * get_futex_key() when retrying. > + */ > +retry: > + for (i = 0; i < count; i++) { > + if ((vs[i].w.flags & FUTEX_PRIVATE_FLAG) && retry) > + continue; > + > + ret = get_futex_key(u64_to_user_ptr(vs[i].w.uaddr), > + !(vs[i].w.flags & FUTEX_PRIVATE_FLAG), > + &vs[i].q.key, FUTEX_READ); > + > + if (unlikely(ret)) > + return ret; > + } > + > + set_current_state(TASK_INTERRUPTIBLE); > + > + for (i = 0; i < count; i++) { > + u32 __user *uaddr = (u32 __user *)(unsigned long)vs[i].w.uaddr; > + struct futex_q *q = &vs[i].q; > + u32 val = (u32)vs[i].w.val; > + > + hb = futex_q_lock(q); > + ret = futex_get_value_locked(&uval, uaddr); > + > + if (!ret && uval == val) { > + /* > + * The bucket lock can't be held while dealing with the > + * next futex. Queue each futex at this moment so hb can > + * be unlocked. > + */ > + futex_queue(q, hb); > + continue; > + } > + > + futex_q_unlock(hb); > + __set_current_state(TASK_RUNNING); > + > + /* > + * Even if something went wrong, if we find out that a futex > + * was awaken, we don't return error and return this index to > + * userspace > + */ > + *awaken = unqueue_multiple(vs, i); > + if (*awaken >= 0) > + return 1; > + > + if (uval != val) > + return -EWOULDBLOCK; > + > + if (ret) { > + /* > + * If we need to handle a page fault, we need to do so > + * without any lock and any enqueued futex (otherwise > + * we could lose some wakeup). So we do it here, after > + * undoing all the work done so far. In success, we > + * retry all the work. > + */ > + if (get_user(uval, uaddr)) > + return -EFAULT; > + > + retry = true; > + goto retry; > + } My bad again: the last two if's should be in the reserve order. If ret != 0, the user copy didn't succeed and the value wasn't copied to uval, thus the comparison (uval != val) should happen only if ret == 0. > + } > + > + return 0; > +}