From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] stat: don't explicitly request file size for filenames
To: Andreas Dilger, "coreutils@gnu.org"
References: <7AC7AA2A-EDEA-4A80-9A5A-02FAA2D823EF@whamcloud.com>
From: Pádraig Brady
Message-ID: <5a1b87fe-e265-6b0f-66eb-20ad1ba9c6e5@draigBrady.com>
Date: Thu, 4 Jul 2019 11:44:48 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <7AC7AA2A-EDEA-4A80-9A5A-02FAA2D823EF@whamcloud.com>
Content-Type: multipart/mixed; boundary="------------079E0002121F958B4D9D9A35"
List-Id: Gnulib discussion list
Cc: bug-gnulib, Jeff Layton

This is a multi-part message in MIME format.
--------------079E0002121F958B4D9D9A35
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 03/07/19 21:24, Andreas Dilger wrote:
> When calling 'stat -c %N' to print the filename, don't explicitly
> request the size of the file via statx(), as it may add overhead on
> some filesystems. The size is only needed to optimize an allocation
> for the relatively rare case of reading a symlink name, and the worst
> effect is a somewhat-too-large temporary buffer may be allocated for
> areadlink_with_size(), or internal retries if buffer is too small.
>
> The file size will be returned by statx() on most filesystems, even
> if not requested, unless the filesystem considers this to be too
> expensive for that file, in which case the tradeoff is worthwhile.
>
> * src/stat.c: Don't explicitly request STATX_SIZE for filenames.
> Start with a 1KB buffer for areadlink_with_size() if st_size unset.
> ---
>  src/stat.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/stat.c b/src/stat.c
> index ec0bb7d..c887013 100644
> --- a/src/stat.c
> +++ b/src/stat.c
> @@ -1282,7 +1282,7 @@ fmt_to_mask (char fmt)
>    switch (fmt)
>      {
>      case 'N':
> -      return STATX_MODE|STATX_SIZE;
> +      return STATX_MODE;
>      case 'd':
>      case 'D':
>        return STATX_MODE;
> @@ -1491,7 +1491,9 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m,
>        out_string (pformat, prefix_len, quoteN (filename));
>        if (S_ISLNK (statbuf->st_mode))
>          {
> -          char *linkname = areadlink_with_size (filename, statbuf->st_size);
> +          /* if statx() didn't set size, most symlinks are under 1KB */
> +          char *linkname = areadlink_with_size (filename, statbuf->st_size ?:
> +                                                1023);

It would be nice to have areadlink_with_size treat a size of 0 as a
request to auto-select some lower bound.
There is already logic there, and it would be generally helpful,
as st_size can often be 0, as shown with:

  $ strace -e readlink stat -c %N /proc/$$/cwd
  readlink("/proc/9036/cwd", "/", 1) = 1
  readlink("/proc/9036/cwd", "/h", 2) = 2
  readlink("/proc/9036/cwd", "/hom", 4) = 4
  readlink("/proc/9036/cwd", "/home/pa", 8) = 8
  readlink("/proc/9036/cwd", "/home/padraig", 16) = 13

With the attached gnulib diff, we get:

  $ strace -e readlink git/coreutils/src/stat -c %N /proc/$$/cwd
  readlink("/proc/12512/cwd", "/home/padraig", 1024) = 13

I'll push both later.

thanks!
Pádraig

--------------079E0002121F958B4D9D9A35
Content-Type: text/x-patch;
 name="areadlink-zero-size.diff"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename="areadlink-zero-size.diff"

diff --git a/lib/areadlink-with-size.c b/lib/areadlink-with-size.c
index eacad3f..2fbe51c 100644
--- a/lib/areadlink-with-size.c
+++ b/lib/areadlink-with-size.c
@@ -36,14 +36,15 @@
    check, so it's OK to guess too small on hosts where there is no
    arbitrary limit to symbolic link length.  */
 #ifndef SYMLINK_MAX
-# define SYMLINK_MAX 1024
+# define SYMLINK_MAX 1023
 #endif
 
 #define MAXSIZE (SIZE_MAX < SSIZE_MAX ? SIZE_MAX : SSIZE_MAX)
 
 /* Call readlink to get the symbolic link value of FILE.
    SIZE is a hint as to how long the link is expected to be;
-   typically it is taken from st_size.  It need not be correct.
+   typically it is taken from st_size.  It need not be correct,
+   and a value of 0 (or more than 8Ki) will select an appropriate lower bound.
    Return a pointer to that NUL-terminated string in malloc'd storage.
    If readlink fails, malloc fails, or if the link value is longer
    than SSIZE_MAX, return NULL (caller may use errno to diagnose).  */
@@ -61,7 +62,7 @@ areadlink_with_size (char const *file, size_t size)
                           : INITIAL_LIMIT_BOUND);
 
   /* The initial buffer size for the link value.  */
-  size_t buf_size = size < initial_limit ? size + 1 : initial_limit;
+  size_t buf_size = size && size < initial_limit ? size + 1 : initial_limit;
 
   while (1)
     {
--------------079E0002121F958B4D9D9A35--
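
The effect of the gnulib change on the first readlink call can be sketched in isolation. This is a minimal illustration, not the gnulib implementation: the names sketch_initial_buf_size, sketch_areadlink and SKETCH_INITIAL_LIMIT are hypothetical, and the fixed 1024-byte limit assumes the system does not define SYMLINK_MAX, so that the patched default of 1023 applies. The retry loop is a simplified stand-in for the real function and only shows the doubling visible in the first strace output.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical constant for this sketch: the initial allocation used
   when the size hint is unusable (assumes SYMLINK_MAX + 1 == 1024).  */
enum { SKETCH_INITIAL_LIMIT = 1024 };

/* Mirrors the patched expression: a hint of 0 (st_size unset), or one
   at or above the limit, selects the limit; otherwise use hint + 1 so
   the link value plus its terminating NUL fits.  */
size_t
sketch_initial_buf_size (size_t size_hint)
{
  return (size_hint && size_hint < SKETCH_INITIAL_LIMIT
          ? size_hint + 1 : SKETCH_INITIAL_LIMIT);
}

/* Simplified stand-in for areadlink_with_size: retry with a doubled
   buffer until readlink returns fewer bytes than the buffer holds.
   Returns malloc'd storage, or NULL with errno set on failure.  */
char *
sketch_areadlink (char const *file, size_t size_hint)
{
  assert (file != NULL);
  size_t buf_size = sketch_initial_buf_size (size_hint);

  for (;;)
    {
      char *buf = malloc (buf_size);
      if (!buf)
        return NULL;

      ssize_t len = readlink (file, buf, buf_size);
      if (len < 0)
        {
          int saved_errno = errno;
          free (buf);
          errno = saved_errno;
          return NULL;
        }
      if ((size_t) len < buf_size)
        {
          buf[len] = '\0';      /* readlink does not NUL-terminate */
          return buf;
        }

      /* The value may have been truncated: grow and try again.  */
      free (buf);
      if (buf_size > SIZE_MAX / 2)
        {
          errno = ENAMETOOLONG;
          return NULL;
        }
      buf_size *= 2;
    }
}
```

With a hint of 0, as happens for /proc/$$/cwd, the sketch issues its first readlink with a 1024-byte buffer, which corresponds to the single call in the second strace output rather than the 1, 2, 4, 8, 16 progression in the first.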