From: Paul Eggert
Organization: UCLA Computer Science Department
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org, bug-gnulib@gnu.org
Subject: Re: [PATCH v3 4/6] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26341] [BZ #24970]
Date: Fri, 1 Jan 2021 16:04:02 -0800
Message-ID: <275283e0-70ee-5ea4-e63d-d0f1d1393667@cs.ucla.edu>
In-Reply-To: <502b6d2d-1139-ca9d-14cf-00082adc915e@linaro.org>
References: <20201229193454.34558-1-adhemerval.zanella@linaro.org> <20201229193454.34558-5-adhemerval.zanella@linaro.org> <502b6d2d-1139-ca9d-14cf-00082adc915e@linaro.org>

On 12/30/20 5:10 AM, Adhemerval Zanella wrote:

>> it is just really
>> a small optimization that adds code complexity on a somewhat convoluted
>> code.

The code is indeed simpler without the NARROW_ADDRESSES optimization, so
I removed that optimization by installing the attached patch into Gnulib.

>> For ENAMETOOLONG, I think this is the right error code: it enforces
>> that we do not support internal objects longer that PTRDIFF_MAX.

This sounds backwards, as the code returns ENOMEM every other place it
tries to create an internal object longer than PTRDIFF_MAX - these
ENOMEM checks are in the malloc calls invoked by scratch_buffer_grow and
scratch_buffer_grow_preserve. It would be odd for canonicalize_file_name
to return ENAMETOOLONG for this one particular way of creating a
too-large object, while at the same time it returns ENOMEM for all the
other ways.

Besides, ENAMETOOLONG is the POSIX error code for exceeding NAME_MAX or
PATH_MAX, which is not what is happening here.

In Gnulib and other GNU apps we've long used the tradition that ENOMEM
means you've run out of memory, regardless of whether it's because your
heap or your address space is too small. This is a good tradition and
it'd be good to use it here too.

>> I think it should be a fair assumption to make it on internal code, such
>> as realpath

Yes, staying less than PTRDIFF_MAX is a vital assumption on internal
objects. I'd go even further and say it's important for user-supplied
objects, too, as so much code relies on pointer subtraction and we can't
realistically prohibit that within glibc.

> (this is another reason why I think NARROW_ADDRESSES is not
> necessary).

Unfortunately, if we merely assume every object has at most PTRDIFF_MAX
bytes, we still must check for overflow when adding the sizes of two
objects. The NARROW_ADDRESSES optimization would have let us avoid that
unnecessary check on 64-bit machines.

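A minimal sketch of the kind of check the paragraph above describes, assuming both operands are already known to be at most PTRDIFF_MAX. The helper name add_object_sizes is hypothetical and is not part of Gnulib or glibc. The patch itself uses Gnulib's INT_ADD_OVERFLOW macro rather than this hand-written comparison, and it leaves the PTRDIFF_MAX limit to malloc. The sketch only illustrates why a sum of two valid sizes still needs its own test, and why ENOMEM is the error that fits the convention described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an illustration, not Gnulib or glibc code):
   add two object sizes, each assumed to be at most PTRDIFF_MAX.
   On success store the sum in *TOTAL and return 0.  If the sum would
   exceed PTRDIFF_MAX, set errno to ENOMEM and return -1.  */
static int
add_object_sizes (size_t len, ptrdiff_t n, ptrdiff_t *total)
{
  assert (0 <= n);
  /* Compare LEN against the remaining headroom, so that the test
     itself cannot overflow.  */
  if (len > (size_t) (PTRDIFF_MAX - n))
    {
      errno = ENOMEM;
      return -1;
    }
  *total = (ptrdiff_t) len + n;
  return 0;
}
```

On a 64-bit host the failing branch is unreachable in practice for real objects, which is the case the removed NARROW_ADDRESSES test was meant to skip.
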
> And your fix (from 93e0186d4) does not really solve the issue, since
> now that len is a size_t the overflow check won't catch the potentially
> allocation larger than PTRDIFF_MAX (the realpath will still fail with
> ENOMEM though).

Sure, which means the code is doing the right thing: it's failing with
ENOMEM because it ran out of memory. There is no need for an extra
PTRDIFF_MAX check in canonicalize.c if malloc (via scratch_buffer_grow)
already does the check.

> Wouldn't the below be simpler?
>
>   size_t len = strlen (end);
>   if (len > IDX_MAX || INT_ADD_OVERFLOW ((idx_t) len, n))
>     {
>       __set_errno (ENAMETOOLONG);
>       goto error_nomem;
>     }

It's not simpler than the attached Gnulib patch, because it contains an
unnecessary comparison to IDX_MAX and an unnecessary cast to idx_t.

[Attachment: 0001-canonicalize-remove-NARROW_ADDRESSES-optimization.patch]

From 8f6b9b66be6672bed1045c27e606dd9fcedcf022 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Fri, 1 Jan 2021 15:54:43 -0800
Subject: [PATCH] canonicalize: remove NARROW_ADDRESSES optimization

* lib/canonicalize-lgpl.c, lib/canonicalize.c (NARROW_ADDRESSES):
Remove, and remove all uses, as the optimization is arguably not
worth the extra complexity.
Suggested by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2020-December/121203.html
---
 ChangeLog               | 8 ++++++++
 lib/canonicalize-lgpl.c | 6 +-----
 lib/canonicalize.c      | 6 +-----
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 2d498a5e9..fc45e1176 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2021-01-01  Paul Eggert
+
+	canonicalize: remove NARROW_ADDRESSES optimization
+	* lib/canonicalize-lgpl.c, lib/canonicalize.c (NARROW_ADDRESSES):
+	Remove, and remove all uses, as the optimization is arguably not
+	worth the extra complexity.  Suggested by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2020-December/121203.html
+
 2021-01-01  Bruno Haible
 
 	stddef: Try harder to get max_align_t defined on OpenBSD.
diff --git a/lib/canonicalize-lgpl.c b/lib/canonicalize-lgpl.c
index 560e24288..698f9ede2 100644
--- a/lib/canonicalize-lgpl.c
+++ b/lib/canonicalize-lgpl.c
@@ -85,10 +85,6 @@
 # define IF_LINT(Code) /* empty */
 #endif
 
-/* True if adding two valid object sizes might overflow idx_t.
-   As a practical matter, this cannot happen on 64-bit machines.  */
-enum { NARROW_ADDRESSES = IDX_MAX >> 31 >> 31 == 0 };
-
 #ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
 # define DOUBLE_SLASH_IS_DISTINCT_ROOT false
 #endif
@@ -343,7 +339,7 @@ realpath_stk (const char *name, char *resolved,
               if (end_in_extra_buffer)
                 end_idx = end - extra_buf;
               size_t len = strlen (end);
-              if (NARROW_ADDRESSES && INT_ADD_OVERFLOW (len, n))
+              if (INT_ADD_OVERFLOW (len, n))
                 {
                   __set_errno (ENOMEM);
                   goto error_nomem;
diff --git a/lib/canonicalize.c b/lib/canonicalize.c
index cc32260a8..3a1c8098b 100644
--- a/lib/canonicalize.c
+++ b/lib/canonicalize.c
@@ -42,10 +42,6 @@
 # define IF_LINT(Code) /* empty */
 #endif
 
-/* True if adding two valid object sizes might overflow idx_t.
-   As a practical matter, this cannot happen on 64-bit machines.  */
-enum { NARROW_ADDRESSES = IDX_MAX >> 31 >> 31 == 0 };
-
 #ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
 # define DOUBLE_SLASH_IS_DISTINCT_ROOT false
 #endif
@@ -393,7 +389,7 @@ canonicalize_filename_mode_stk (const char *name, canonicalize_mode_t can_mode,
               if (end_in_extra_buffer)
                 end_idx = end - extra_buf;
               size_t len = strlen (end);
-              if (NARROW_ADDRESSES && INT_ADD_OVERFLOW (len, n))
+              if (INT_ADD_OVERFLOW (len, n))
                 xalloc_die ();
               while (extra_buffer.length <= len + n)
                 {
-- 
2.27.0
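
For readers unfamiliar with the idiom in the removed NARROW_ADDRESSES definition, here is a small sketch of how that constant expression behaves. It assumes PTRDIFF_MAX as a stand-in for Gnulib's IDX_MAX, and the names NARROW_ADDRESSES_SKETCH and sizes_may_overflow are hypothetical. Shifting right by 31 twice, rather than by 62 once, keeps each shift count below the operand width even where ptrdiff_t is only 32 bits wide, so the expression is well defined on both kinds of host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: PTRDIFF_MAX stands in for Gnulib's IDX_MAX.
   The expression parses as (PTRDIFF_MAX >> 31 >> 31) == 0, which is
   1 when the maximum is below 2**62 (e.g. a 32-bit ptrdiff_t) and
   0 when ptrdiff_t is 64 bits wide.  */
enum { NARROW_ADDRESSES_SKETCH = PTRDIFF_MAX >> 31 >> 31 == 0 };

/* Hypothetical wrapper: report whether the sum of two valid object
   sizes could overflow the index type on this host.  */
static bool
sizes_may_overflow (void)
{
  return NARROW_ADDRESSES_SKETCH;
}
```

Because the value is a compile-time constant, a condition such as `NARROW_ADDRESSES && INT_ADD_OVERFLOW (len, n)` let the compiler drop the overflow test entirely on 64-bit hosts. The patch above gives that up in exchange for simpler code.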