Subject: [PATCH] areadlink-with-size: guess a buffer size with 0 size
To: Paul Eggert
References: <7AC7AA2A-EDEA-4A80-9A5A-02FAA2D823EF@whamcloud.com>
 <5a1b87fe-e265-6b0f-66eb-20ad1ba9c6e5@draigBrady.com>
 <5f23c469-89b6-1280-a8a4-a9b0653f186a@cs.ucla.edu>
From:
 Pádraig Brady
Message-ID: <360612c8-356b-f139-507f-2282c68fae3f@draigBrady.com>
Date: Sat, 6 Jul 2019 20:10:59 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <5f23c469-89b6-1280-a8a4-a9b0653f186a@cs.ucla.edu>
Content-Type: multipart/mixed; boundary="------------5C5866DE809F07C336986744"
List-Id: Gnulib discussion list
Cc: bug-gnulib, Andreas Dilger

This is a multi-part message in MIME format.
--------------5C5866DE809F07C336986744
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

On 06/07/19 00:53, Paul Eggert wrote:
> Pádraig Brady wrote:
>> It would be nice to have areadlink_with_size treat 0 as auto select some lower bound.
>
> Yes, that sounds good. However, I didn't see why that would entail changing
> SYMLINK_MAX from 1024 to 1023, or why the patch would affect the documented API.
>
> How about the attached patch instead? When the guessed size is zero it typically
> avoids a realloc by using a small stack buffer.

Your patch has the advantage of allocating the exact right sized buffer
in the usual case, but the disadvantage of CPU overhead in string length determination,
and some extra code complexity in the separate small buffer handling.
Given Bruno's interim patch of shrinking to the exact sized buffer,
I've pushed the attached simpler patch that uses a starting buffer of size 128
(suggested by Andreas), when SIZE==0 is specified.

cheers,
Pádraig

--------------5C5866DE809F07C336986744
Content-Type: text/x-patch;
 name="areadlink-zero-size.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename="areadlink-zero-size.patch"

From 0ccc444f3d2dc3ad1b4d682f7d8403633942ed39 Mon Sep 17 00:00:00 2001
From: Pádraig Brady
Date: Sat, 6 Jul 2019 19:43:11 +0100
Subject: [PATCH] areadlink-with-size: guess a buffer size with 0 size

The size is usually taken from st_size, which can be zero,
resulting in inefficient operation as seen with:

  $ strace -e readlink stat -c %N /proc/$$/cwd
  readlink("/proc/9036/cwd", "/", 1) = 1
  readlink("/proc/9036/cwd", "/h", 2) = 2
  readlink("/proc/9036/cwd", "/hom", 4) = 4
  readlink("/proc/9036/cwd", "/home/pa", 8) = 8
  readlink("/proc/9036/cwd", "/home/padraig", 16) = 13

Instead let zero select an initial memory allocation
of 128 bytes, which most symlinks fit within.

* lib/areadlink-with-size.c (areadlink_with_size):
Start with a 128 byte buffer, for SIZE == 0.
* lib/areadlinkat-with-size.c (areadlinkat_with_size): Likewise.
---
 ChangeLog                   | 11 +++++++++++
 lib/areadlink-with-size.c   |  3 ++-
 lib/areadlinkat-with-size.c |  3 ++-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 885907d..0b06131 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2019-07-06  Pádraig Brady
+
+	areadlink-with-size: guess a buffer size with 0 size
+	The size is usually taken from st_size, which can be zero,
+	resulting in inefficient operation.
+	Instead let zero select an initial memory allocation
+	of 128 bytes, which most symlinks fit within.
+	* lib/areadlink-with-size.c (areadlink_with_size):
+	Start with a 128 byte buffer, for SIZE == 0.
+	* lib/areadlinkat-with-size.c (areadlinkat_with_size): Likewise.
+
 2019-07-06  Konstantin Kharlamov
 
 	Replace manually crafted hex regexes with [:xdigit:]
diff --git a/lib/areadlink-with-size.c b/lib/areadlink-with-size.c
index 364cc08..b9cd05c 100644
--- a/lib/areadlink-with-size.c
+++ b/lib/areadlink-with-size.c
@@ -61,7 +61,8 @@ areadlink_with_size (char const *file, size_t size)
                          : INITIAL_LIMIT_BOUND);
 
   /* The initial buffer size for the link value.  */
-  size_t buf_size = size < initial_limit ? size + 1 : initial_limit;
+  size_t buf_size = (size == 0 ? 128
+                     : size < initial_limit ? size + 1 : initial_limit);
 
   while (1)
     {
diff --git a/lib/areadlinkat-with-size.c b/lib/areadlinkat-with-size.c
index 5b2bccc..d39096f 100644
--- a/lib/areadlinkat-with-size.c
+++ b/lib/areadlinkat-with-size.c
@@ -66,7 +66,8 @@ areadlinkat_with_size (int fd, char const *file, size_t size)
                          : INITIAL_LIMIT_BOUND);
 
   /* The initial buffer size for the link value.  */
-  size_t buf_size = size < initial_limit ? size + 1 : initial_limit;
+  size_t buf_size = (size == 0 ? 128
+                     : size < initial_limit ? size + 1 : initial_limit);
 
   while (1)
     {
-- 
2.9.3

--------------5C5866DE809F07C336986744--
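
For illustration only, the effect of the patch on the /proc/$$/cwd case can be modelled with a few lines of C. This is a simplified sketch, not the gnulib code: the helper names initial_size_old, initial_size_new and readlink_calls are hypothetical, the limit parameter stands in for initial_limit, and the model assumes the buffer doubles after every readlink result that fills it, as the strace output in the commit message suggests. It ignores the upper bound on buffer growth and any overflow handling.

```c
#include <assert.h>
#include <stddef.h>

/* Initial buffer size before the patch: SIZE + 1, capped at LIMIT.
   LIMIT is a stand-in for initial_limit (assumption).  */
static size_t
initial_size_old (size_t size, size_t limit)
{
  return size < limit ? size + 1 : limit;
}

/* Initial buffer size after the patch: SIZE == 0 selects 128 bytes,
   otherwise the choice is unchanged.  */
static size_t
initial_size_new (size_t size, size_t limit)
{
  return (size == 0 ? 128
          : size < limit ? size + 1 : limit);
}

/* Count the readlink calls made for a link of LINK_LEN bytes when the
   buffer starts at START bytes and doubles after each call whose result
   fills the buffer.  A call is taken to succeed once the result is
   strictly shorter than the buffer.  Simplified model: no upper bound
   on growth and no overflow checks.  */
static unsigned int
readlink_calls (size_t link_len, size_t start)
{
  assert (start > 0);   /* a zero start would never grow by doubling */
  unsigned int calls = 1;
  for (size_t buf = start; buf <= link_len; buf *= 2)
    calls++;
  return calls;
}
```

With an illustrative limit of 1025 (an assumed value), readlink_calls (13, initial_size_old (0, 1025)) gives 5, matching the five calls in the strace output for the 13-byte "/home/padraig", while readlink_calls (13, initial_size_new (0, 1025)) gives 1.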