From: Florian Weimer via Libc-alpha
Reply-To: Florian Weimer
To: Aleksa Sarai
Cc: Daniel P. Berrangé, Christian Brauner, Florian Weimer via Libc-alpha
Subject: Re: RFC: Disable clone3 for glibc 2.34
Date: Wed, 28 Jul 2021 19:44:03 +0200
Message-ID: <871r7i8hb0.fsf@oldenburg.str.redhat.com>
In-Reply-To: <20210727102222.r2hys526mfkpt4xo@senku> (Aleksa Sarai's message of "Tue, 27 Jul 2021 20:22:22 +1000")
References: <87eebkf8ph.fsf@oldenburg.str.redhat.com> <87y29sdsui.fsf@oldenburg.str.redhat.com> <20210727092416.layfgqi6auudbpgc@wittgenstein> <20210727094117.jid7shl7futsciih@wittgenstein> <20210727102222.r2hys526mfkpt4xo@senku>
List-Id: Libc-alpha mailing list

* Aleksa Sarai:

> Yes, runc has had the -ENOSYS fallback behaviour for a few releases now.
>
> The way it works is that any syscall which has a larger syscall number
> than any syscall specified in the filter will get -ENOSYS (this works
> even if libseccomp is outdated).
> The only way you could get the -EPERM
> behaviour with modern runc is if you write a seccomp profile that had
> rules for newer syscalls (openat2 for instance) but not clone3 -- but
> Docker doesn't do that. (The reason for this slightly convoluted
> behaviour was to make sure that intentional omissions actually give you
> -EPERM.)
>
> However this requires the container host to have an updated version of
> runc which is up to GitHub. (Though we fixed a security issue in runc
> recently, so I would expect that they've updated their versions of runc
> by now.)

Indeed I wasn't able to reproduce this locally. Ubuntu's docker.io
package behaves as expected, even for “docker build” as far as I can
see.

So far, the reported breakage has been focused on GitHub Actions and
Azure DevOps. They use a custom Docker-Moby build, and I don't know
what's in it. The net effect is that clone3 does not work in containers
by default. “docker build” still does not allow “--security-opt
seccomp=unconfined” for unknown reasons, but that workaround still
applies to “docker create”.

Daniel P. Berrangé reported that Moby mentions a system call in its
policy whose number is larger than clone3, effectively turning ENOSYS
into EPERM for clone3. Looking at the recent change, it could be the
addition of close_range and epoll_pwait2 in this commit:

commit 54eff4354b17a9c460b851300f28aed1408a8615
Author: Aleksa Sarai
Date:   Sun Jan 17 23:39:31 2021 +1100

    profiles: seccomp: update to Linux 5.11 syscall list

    These syscalls (some of which have been in Linux for a while but were
    missing from the profile) fall into a few buckets:

     * close_range(2), epoll_pwait2(2) are just extensions of existing
       "safe for everyone" syscalls.

     * The mountv2 API syscalls (fs*(2), move_mount(2), open_tree(2)) are
       all equivalent to aspects of mount(2) and thus go into the
       CAP_SYS_ADMIN category.

     * process_madvise(2) is similar to the other process_*(2) syscalls and
       thus goes in the CAP_SYS_PTRACE category.

    Signed-off-by: Aleksa Sarai

Maybe we don't see this everywhere because these higher system call
numbers become available only if the system libseccomp version is recent
enough to know about them. Once that is the case, the ENOSYS/EPERM line
shifts and clone3 is on the wrong side of it.

If that's indeed the explanation, then maybe we can simply fix Moby and
ask Microsoft to respin their images?

Thanks,
Florian
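
To make the ENOSYS/EPERM line discussed above concrete, here is a rough
Python sketch. It is only a simplified model of the rule as Aleksa
describes it (an unlisted system call numbered above everything in the
filter gets ENOSYS, any other unlisted one gets EPERM), not runc's or
libseccomp's actual code. The function name and both profiles are
hypothetical and chosen for illustration; the system call numbers are
the x86-64 ones.

```python
import errno

# x86-64 system call numbers for the calls relevant to this thread.
SYSCALL_NR = {
    "pidfd_open": 434,
    "clone3": 435,
    "close_range": 436,
    "openat2": 437,
    "epoll_pwait2": 441,
}


def filter_result(name, allowed):
    """Return 0 if `name` is allowed, else the errno this model predicts.

    Simplified model (an assumption, not runc's real implementation):
    an unlisted system call whose number is above every listed one is
    treated as unknown to the profile and gets ENOSYS; any other
    unlisted call counts as an intentional omission and gets EPERM.
    """
    if name in allowed:
        return 0
    highest_listed = max(SYSCALL_NR[s] for s in allowed)
    if SYSCALL_NR[name] > highest_listed:
        return errno.ENOSYS
    return errno.EPERM


# Hypothetical profiles: the second one adds two newer system calls
# but still does not mention clone3.
profile_before = {"pidfd_open"}
profile_after = {"pidfd_open", "close_range", "epoll_pwait2"}

for label, profile in (("before", profile_before), ("after", profile_after)):
    code = filter_result("clone3", profile)
    print(label, errno.errorcode[code])
```

In this model, listing close_range (436) and epoll_pwait2 (441) without
listing clone3 (435) is enough to move clone3 from the ENOSYS side to
the EPERM side, which matches the behaviour reported in this thread.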