From: Bruno Haible
To: bug-gnulib@gnu.org
Subject: getcwd: Speed up on Linux. Add support for Android.
Date: Wed, 18 Jan 2023 13:36:16 +0100
Message-ID: <15107551.Y0gOKygRFC@nimes>

On Android, in the Termux app, I see a test failure:

FAIL: test-getcwd.sh
====================

FAIL test-getcwd.sh (exit status: 5)

What happens, in the test_long_name() function of this test:
  - The directory in which the test is run is
      /data/data/com.termux/files/home/testdir1/build/gltests
    The peculiar circumstance is that the ancestor directories
    /data/data and /data are not readable (they produce an error EACCES).
  - The test creates a hierarchy by calling chdir("confdir3") 449 times,
    then calls rpl_getcwd.
  - rpl_getcwd first calls getcwd_system, which fails with error
    ENAMETOOLONG.
  - Then rpl_getcwd goes to the parent directory 455 times. 454 times
    this succeeds (up to /data/data/com.termux); then it fails (since
    /data/data is not readable).
  - At this point rpl_getcwd gives up and fails with errno EACCES.

But we can do better: Android uses the Linux kernel. Therefore it has
getcwd available as a system call, and this system call does not care
about unreadable ancestor directories.
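For illustration, here is a minimal sketch of invoking that system call
without going through libc. It assumes Linux; raw_getcwd is a hypothetical
name used only for this example and is not part of gnulib. Note that the
system call reports only the current directory of the calling process.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper, not part of gnulib: invoke the Linux getcwd
   system call directly.  Assumes Linux.
   On success, returns the number of bytes stored in BUF, including the
   terminating NUL.  On failure, returns -1 with errno set (for example
   ERANGE if SIZE is too small).  */
long
raw_getcwd (char *buf, size_t size)
{
  return syscall (SYS_getcwd, buf, size);
}
```

The kernel answers from its own record of the current directory, so no
ancestor directory needs to be opened or read.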
More precisely, we cannot use the getcwd system call directly, because it
would require chdir("..") calls and thus make our rpl_getcwd function not
multi-thread safe. But the /proc file system supports a way to translate an
fd to a file name, via readlink [1]. That's what we need here.

So, the fix is to use this /proc file system trick repeatedly; it works as
soon as the directory name is at most 4095 bytes long.

This code makes the ".."-climbing loop a bit slower: after reading all
directory entries it now checks whether the readlink() call works. But the
advantage is that it terminates this loop much earlier than before and thus
saves dozens or hundreds of loop rounds. And in particular if some of the
ancestors are not readable, this won't make the loop fail.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2022-12/msg00053.html


2023-01-18  Bruno Haible

	getcwd: Speed up on Linux. Add support for Android.
	* lib/getcwd.c (__getcwd_generic): On Linux, use a specific readlink
	call to speed up the operation.

diff --git a/lib/getcwd.c b/lib/getcwd.c
index a4f5c5eae3..5201cd06b5 100644
--- a/lib/getcwd.c
+++ b/lib/getcwd.c
@@ -172,6 +172,9 @@ __getcwd_generic (char *buf, size_t size)
 #if HAVE_OPENAT_SUPPORT
   int fd = AT_FDCWD;
   bool fd_needs_closing = false;
+# if defined __linux__
+  bool proc_fs_not_mounted = false;
+# endif
 #else
   char dots[DEEP_NESTING * sizeof ".." + BIG_FILE_NAME_COMPONENT_LENGTH + 1];
   char *dotlist = dots;
@@ -437,6 +440,67 @@ __getcwd_generic (char *buf, size_t size)
 
       thisdev = dotdev;
       thisino = dotino;
+
+#if HAVE_OPENAT_SUPPORT
+      /* On some platforms, a system call returns the directory that FD points
+         to.  This is useful if some of the ancestor directories of the
+         directory are unreadable, because in this situation the loop that
+         climbs up the ancestor hierarchy runs into an EACCES error.
+         For example, in some Android app, /data/data/com.termux is readable,
+         but /data/data and /data are not.  */
+# if defined __linux__
+      /* On Linux, in particular, if /proc is mounted,
+           readlink ("/proc/self/fd/<fd>")
+         returns the directory, if its length is < 4096.  (If the length is
+         >= 4096, it fails with error ENAMETOOLONG, even if the buffer that we
+         pass to the readlink function would be large enough.)  */
+      if (!proc_fs_not_mounted)
+        {
+          char namebuf[14 + 10 + 1];
+          sprintf (namebuf, "/proc/self/fd/%u", (unsigned int) fd);
+          char linkbuf[4096];
+          ssize_t linklen = readlink (namebuf, linkbuf, sizeof linkbuf);
+          if (linklen < 0)
+            {
+              if (errno != ENAMETOOLONG)
+                /* If this call was not successful, the next one will likely be
+                   not successful either.  */
+                proc_fs_not_mounted = true;
+            }
+          else
+            {
+              dirroom = dirp - dir;
+              if (dirroom < linklen)
+                {
+                  if (size != 0)
+                    {
+                      __set_errno (ERANGE);
+                      goto lose;
+                    }
+                  else
+                    {
+                      char *tmp;
+                      size_t oldsize = allocated;
+
+                      allocated += linklen - dirroom;
+                      if (allocated < oldsize
+                          || ! (tmp = realloc (dir, allocated)))
+                        goto memory_exhausted;
+
+                      /* Move current contents up to the end of the buffer.  */
+                      dirp = memmove (tmp + dirroom + (allocated - oldsize),
+                                      tmp + dirroom,
+                                      oldsize - dirroom);
+                      dir = tmp;
+                    }
+                }
+              dirp -= linklen;
+              memcpy (dirp, linkbuf, linklen);
+              break;
+            }
+        }
+# endif
+#endif
     }
 
   if (dirstream && __closedir (dirstream) != 0)