bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
* [PATCH 0/8] Remove alloca usage from glob
@ 2021-01-05 18:58 Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 1/8] malloc: Add specialized dynarray for C strings Adhemerval Zanella
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This is a respin on an old patchset [1] which is no longer on patchwork.
The main change is just a rebase against master since glibc version is
now on sync with gnulib.

I am sending it also to gnulib to get feedback from the project,
specifically for the part not specific for the system support by glibc
and extra requirements.

The idea of removing the alloca allows a slight better code generation,
simplifies the boilerplate code to avoid the unbounded alloca usage,
and it plays better with security compiler mitigation tools (such as the
ones for stack clash).

On x86_64 with gcc 9.2.1 build glob for shared object, the total stack
usage went from:

glob.c:261:1:next_brace_sub     8       static
glob.c:1205:1:prefix_array      96      static
glob.c:1185:1:collated_compare  8       static
glob.c:250:1:is_dir.isra.0      160     static
glob.c:1259:1:glob_in_dir       864     dynamic
glob.c:297:1:__glob     1360    dynamic

to

glob.c:231:1:next_brace_sub     8       static
dynarray-skeleton.c:203:1:globnames_array_free  32      static
glob.c:1131:1:prefix_array      96      static
glob.c:1111:1:collated_compare  8       static
dynarray-skeleton.c:275:1:globnames_array_add__ 48      static
dynarray-skeleton.c:373:1:char_array_resize.part.0      48      static
char_array-skeleton.c:274:1:char_array_replace_str_pos  48      static
char_array-skeleton.c:204:1:char_array_crop     32      static
char_array-skeleton.c:240:1:char_array_append_str       48      static
char_array-skeleton.c:118:1:char_array_set_str_size     32      static
char_array-skeleton.c:161:1:char_array_init_str 32      static
glob.c:220:1:is_dir.isra.0      160     static
glob.c:1200:1:glob_in_dir       1088    static
glob.c:267:1:__glob     1824    static

(it was obtained with -fstack-usage gcc option).

It is a slight increase and due GNU extension where glob might be called
recursivelly internally the total stack usage might be still slight
larger than current implementation.  But I still think it is a worth
tradeoff.

[1] https://sourceware.org/pipermail/libc-alpha/2018-February/091326.html

Adhemerval Zanella (8):
  malloc: Add specialized dynarray for C strings
  posix: Use char_array for internal glob dirname
  posix: Remove alloca usage for GLOB_BRACE on glob
  posix: Remove alloca usage on glob dirname
  posix: Use dynarray for globname in glob
  posix: Remove alloca usage on glob user_name
  posix: Use char_array for home_dir in glob
  posix: Remove all alloca usage in glob

 malloc/Makefile                    |   4 +-
 malloc/Versions                    |   7 +
 malloc/char_array-impl.c           |  57 +++
 malloc/char_array-skeleton.c       | 288 +++++++++++++
 malloc/char_array.h                |  53 +++
 malloc/dynarray.h                  |   9 +
 malloc/dynarray_overflow_failure.c |  31 ++
 malloc/malloc-internal.h           |  14 +
 malloc/tst-char_array.c            | 112 ++++++
 posix/glob.c                       | 622 +++++++++++------------------
 10 files changed, 817 insertions(+), 380 deletions(-)
 create mode 100644 malloc/char_array-impl.c
 create mode 100644 malloc/char_array-skeleton.c
 create mode 100644 malloc/char_array.h
 create mode 100644 malloc/dynarray_overflow_failure.c
 create mode 100644 malloc/tst-char_array.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/8] malloc: Add specialized dynarray for C strings
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 2/8] posix: Use char_array for internal glob dirname Adhemerval Zanella
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This patch adds an specialized dynarray to manage C strings using the
dynarray internal implementation.  It uses some private fields from
dynarray and thus it provided specific files to access and manage
the internal string buffer.

   For instance:

   struct char_array str;

   // str == "testing"
   char_array_init_str (&str, "testing");

   // c == 's'
   char c = char_array_pos (&str, 2);

   // str = "testing2"
   char_array_set_str (&str, "testing2");

   // str = "testig2"
   char_array_erase (&str, 5);

   // str = "123testig2";
   char_array_prepend_str (&str, "123");

   // len = 10;
   size_t len = char_array_length (&str);

   // str = "123testig2456";
   char_array_append_str (&str, "456");

   // str = "123test789";
   char_array_replace_str_pos (&str, 7, "789", 3);

The provided function are not extensive and meant mainly to be use in
subsequent glob implementation cleanup.  For internal object consistency
only the function provided by char_array.c should be used, including
internal object manipulation.

To check for possible overflows in internal size manipulation a new
function, check_add_wrapv_size_t, is added on malloc-internal.  It basically
return whether the addition of two size_t overflows.

Checked on x86_64-linux-gnu.

	* malloc/Makefile (test-internal): Add tst-char_array.
	(routines): Add dynarray_overflow_failure and char_array-impl.
	* malloc/Versions [GLIBC_PRIVATE] (libc): Add
	__libc_dynarray_overflow_failure, __char_array_set_str_size,
	__char_array_erase, __char_array_prepend_str_size, and
	__char_array_replace_str_pos.
	* malloc/char_array-impl.c: New file.
	* malloc/char_array-skeleton.c: Likewise.
	* malloc/char_array.h: Likewise.
	* malloc/tst-char-array.c: Likewise.
	* malloc/dynarray_overflow_failure.c: Likewise.
	* malloc/malloc-internal.h (check_add_overflow_size_t): New function.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
---
 malloc/Makefile                    |   4 +-
 malloc/Versions                    |   7 +
 malloc/char_array-impl.c           |  57 ++++++
 malloc/char_array-skeleton.c       | 288 +++++++++++++++++++++++++++++
 malloc/char_array.h                |  53 ++++++
 malloc/dynarray.h                  |   9 +
 malloc/dynarray_overflow_failure.c |  31 ++++
 malloc/malloc-internal.h           |  14 ++
 malloc/tst-char_array.c            | 112 +++++++++++
 9 files changed, 574 insertions(+), 1 deletion(-)
 create mode 100644 malloc/char_array-impl.c
 create mode 100644 malloc/char_array-skeleton.c
 create mode 100644 malloc/char_array.h
 create mode 100644 malloc/dynarray_overflow_failure.c
 create mode 100644 malloc/tst-char_array.c

diff --git a/malloc/Makefile b/malloc/Makefile
index 583bbefb0d..850104a763 100644
--- a/malloc/Makefile
+++ b/malloc/Makefile
@@ -54,6 +54,7 @@ tests-internal += \
 	 tst-dynarray \
 	 tst-dynarray-fail \
 	 tst-dynarray-at-fail \
+	 tst-char_array
 
 ifneq (no,$(have-tunables))
 tests += tst-malloc-usable-tunables tst-mxfast
@@ -77,7 +78,7 @@ routines = malloc morecore mcheck mtrace obstack reallocarray \
   scratch_buffer_dupfree \
   scratch_buffer_grow scratch_buffer_grow_preserve \
   scratch_buffer_set_array_size \
-  dynarray_at_failure \
+  dynarray_at_failure dynarray_overflow_failure \
   dynarray_emplace_enlarge \
   dynarray_finalize \
   dynarray_resize \
@@ -87,6 +88,7 @@ routines = malloc morecore mcheck mtrace obstack reallocarray \
   alloc_buffer_copy_bytes  \
   alloc_buffer_copy_string \
   alloc_buffer_create_failure \
+  char_array-impl
 
 install-lib := libmcheck.a
 non-lib.a := libmcheck.a
diff --git a/malloc/Versions b/malloc/Versions
index 6693c46ee2..52d7a97388 100644
--- a/malloc/Versions
+++ b/malloc/Versions
@@ -85,6 +85,7 @@ libc {
 
     # dynarray support
     __libc_dynarray_at_failure;
+    __libc_dynarray_overflow_failure;
     __libc_dynarray_emplace_enlarge;
     __libc_dynarray_finalize;
     __libc_dynarray_resize;
@@ -96,5 +97,11 @@ libc {
     __libc_alloc_buffer_copy_bytes;
     __libc_alloc_buffer_copy_string;
     __libc_alloc_buffer_create_failure;
+
+    # char_array support
+    __char_array_set_str_size;
+    __char_array_erase;
+    __char_array_prepend_str_size;
+    __char_array_replace_str_pos;
   }
 }
diff --git a/malloc/char_array-impl.c b/malloc/char_array-impl.c
new file mode 100644
index 0000000000..3aef1f756d
--- /dev/null
+++ b/malloc/char_array-impl.c
@@ -0,0 +1,57 @@
+/* Specialized dynarray for C strings.  Implementation file.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <malloc/char_array.h>
+
+void
+__char_array_set_str_size (struct dynarray_header *header, const char *str,
+			   size_t size)
+{
+  *((char *) mempcpy (header->array, str, size)) = '\0';
+  header->used = size + 1;
+}
+libc_hidden_def (__char_array_set_str_size)
+
+void
+__char_array_erase (struct dynarray_header *header, size_t pos)
+{
+  char *ppos = header->array + pos;
+  char *lpos = header->array + header->used;
+  ptrdiff_t size = lpos - ppos;
+  memmove (ppos, ppos + 1, size);
+  header->used--;
+}
+libc_hidden_def (__char_array_erase)
+
+void
+__char_array_prepend_str_size (struct dynarray_header *header,
+			       const char *str, size_t size, size_t used)
+{
+  memmove (header->array + size, header->array, used);
+  memcpy (header->array, str, size);
+}
+libc_hidden_def (__char_array_prepend_str_size)
+
+void
+__char_array_replace_str_pos (struct dynarray_header *header, size_t pos,
+			      const char *str, size_t len)
+{
+  char *start = header->array + pos;
+  *(char *) mempcpy (start, str, len) = '\0';
+}
+libc_hidden_def (__char_array_replace_str_pos)
diff --git a/malloc/char_array-skeleton.c b/malloc/char_array-skeleton.c
new file mode 100644
index 0000000000..4d9e4c0d4e
--- /dev/null
+++ b/malloc/char_array-skeleton.c
@@ -0,0 +1,288 @@
+/* Specialized dynarray for C strings.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file provides a dynamic C string with an initial stack allocated
+   buffer.  Since it is based on dynarray, it provides dynamic size
+   expansion and heap usage for large strings.
+
+   The following parameters are optional:
+
+   CHAR_ARRAY_INITIAL_SIZE
+      The size of the statically allocated array (default is 256).  It will
+      be used to define DYNARRAY_INITIAL_SIZE.
+
+   The following functions are provided:
+
+   bool char_array_init_empty (struct char_array *);
+   bool char_array_init_str (struct char_array *, const char *);
+   bool char_array_init_str_size (struct char_array *, const char *, size_t);
+   bool char_array_is_empty (struct char_array *);
+   const char *char_array_str (struct char_array *);
+   char char_array_pos (struct char_array *, size_t);
+   size_t char_array_length (struct char_array *);
+   bool char_array_set_str (struct char_array *, const char *);
+   bool char_array_set_str_size (struct char_array *, const char *, size_t);
+   void char_array_erase (struct char_array *, size_t);
+   bool char_array_crop (struct char_array *, size_t);
+   bool char_array_prepend_str (struct char_array *, const char *);
+   bool char_array_append_str (struct char_array *, const char *);
+   bool char_array_append_char (struct char_array *array, char c);
+   bool char_array_replace_str_pos (struct char_array *, size_t, const char *,
+				    size_t);
+
+   For instance:
+
+   struct char_array str;
+
+   // str == "testing"
+   char_array_init_str (&str, "testing");
+
+   // c == 's'
+   char c = char_array_pos (&str, 2);
+
+   // str = "testing2"
+   char_array_set_str (&str, "testing2");
+
+   // str = "testig2"
+   char_array_erase (&str, 5);
+
+   // str = "123testig2";
+   char_array_prepend_str (&str, "123");
+
+   // len = 10;
+   size_t len = char_array_length (&str);
+
+   // str = "123testig2456";
+   char_array_append_str (&str, "456");
+
+   // str = "123test789";
+   char_array_replace_str_pos (&str, 7, "789", 3);
+ */
+
+#define DYNARRAY_STRUCT            char_array
+#define DYNARRAY_ELEMENT           char
+#define DYNARRAY_PREFIX            char_array_
+#ifndef CHAR_ARRAY_INITIAL_SIZE
+# define CHAR_ARRAY_INITIAL_SIZE 256
+#endif
+#define DYNARRAY_INITIAL_SIZE  CHAR_ARRAY_INITIAL_SIZE
+#include <malloc/dynarray-skeleton.c>
+
+#include <malloc/malloc-internal.h>
+#include <malloc/char_array.h>
+
+/* Return a const char for the internal C string handled by 'array'.  */
+__attribute__ ((unused, nonnull (1)))
+static const char *
+char_array_str (const struct char_array *array)
+{
+  return array->dynarray_header.array;
+}
+
+/* Return the character at position 'pos' from the char_array 'array'.  */
+__attribute__ ((unused, nonnull (1)))
+static char
+char_array_pos (struct char_array *array, size_t pos)
+{
+  return *char_array_at (array, pos);
+}
+
+/* Calculate the length of the string, excluding the terminating null.  */
+__attribute__ ((unused, nonnull (1)))
+static size_t
+char_array_length (const struct char_array *array)
+{
+  /* Exclude the final '\0'.  */
+  return array->dynarray_header.used - 1;
+}
+
+/* Copy up 'size' bytes from string 'str' to char_array 'array'.  A final
+   '\0' is appended in the char_array.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_set_str_size (struct char_array *array, const char *str,
+			 size_t size)
+{
+  size_t newsize;
+  if (check_add_overflow_size_t (size, 1, &newsize))
+    __libc_dynarray_overflow_failure (size, 1);
+
+  if (!char_array_resize (array, newsize))
+    return false;
+
+  __char_array_set_str_size (&array->dynarray_abstract, str, size);
+  return true;
+}
+
+/* Copy the contents of string 'str' to char_array 'array', including the
+   final '\0'.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_set_str (struct char_array *array, const char *str)
+{
+  return char_array_set_str_size (array, str, strlen (str));
+}
+
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_set_array (struct char_array *array, const struct char_array *orig)
+{
+  return char_array_set_str_size (array, char_array_str (orig),
+				  char_array_length (orig));
+}
+
+/* Initialize the char_array 'array' and sets it to an empty string ("").  */
+__attribute__ ((unused, nonnull (1)))
+static bool
+char_array_init_empty (struct char_array *array)
+{
+  char_array_init (array);
+  return char_array_set_str (array, "");
+}
+
+/* Initialize the char_array 'array' and copy the content of string 'str'.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_init_str (struct char_array *array, const char *str)
+{
+  char_array_init (array);
+  return char_array_set_str (array, str);
+}
+
+/* Initialize the char_array 'array' and copy the content of string 'str'
+   up to 'size' characteres.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_init_str_size (struct char_array *array, const char *str,
+			  size_t size)
+{
+  char_array_init (array);
+  return char_array_set_str_size (array, str, size);
+}
+
+/* Return if the char_array contain any characteres.  */
+__attribute__ ((unused, nonnull (1)))
+static bool
+char_array_is_empty (struct char_array *array)
+{
+  return *char_array_begin (array) == '\0';
+}
+
+/* Remove the byte at position 'pos' from char_array 'array'.  The contents
+   are moved internally if the position is not at the end of the internal
+   buffer.  */
+__attribute__ ((unused, nonnull (1)))
+static bool
+char_array_erase (struct char_array *array, size_t pos)
+{
+  if (pos >= array->dynarray_header.used - 1)
+    return false;
+
+  __char_array_erase (&array->dynarray_abstract, pos);
+  return true;
+}
+
+/* Resize the char_array 'array' to size 'count' maintaining the ending
+   '\0' byte.  */
+__attribute__ ((unused, nonnull (1)))
+static bool
+char_array_crop (struct char_array *array, size_t size)
+{
+  if (size >= (array->dynarray_header.used - 1)
+      || !char_array_resize (array, size + 1))
+    return false;
+
+  array->dynarray_header.array[size] = '\0';
+  return true;
+}
+
+/* Prepend the contents of string 'str' to char_array 'array', including the
+   final '\0' byte.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_prepend_str (struct char_array *array, const char *str)
+{
+  size_t size = strlen (str);
+  /* Resizing the array might change its used elements and we need below
+     to correct copy the elements.  */
+  size_t used = array->dynarray_header.used;
+
+  size_t newsize;
+  if (check_add_overflow_size_t (used, size, &newsize))
+    __libc_dynarray_overflow_failure (used, size);
+
+  /* Make room for the string and copy it.  */
+  if (!char_array_resize (array, newsize))
+    return false;
+  __char_array_prepend_str_size (&array->dynarray_abstract, str, size, used);
+  return true;
+}
+
+/* Append the contents of string 'str' to char_array 'array, including the
+   final '\0' byte.  */
+__attribute__ ((unused, nonnull (1, 2)))
+static bool
+char_array_append_str (struct char_array *array, const char *str)
+{
+  size_t size = strlen (str);
+  /* Resizing the array might change its used elements and it used it below
+     to correct copy the elements.  */
+  size_t used = array->dynarray_header.used - 1;
+
+  /* 'used' does account for final '\0', so there is no need to add
+     an extra element to calculate the final required size.  */
+  size_t newsize;
+  if (check_add_overflow_size_t (used + 1, size, &newsize))
+    __libc_dynarray_overflow_failure (used + 1, size);
+
+  if (!char_array_resize (array, newsize))
+    return false;
+
+  /* Start to append at '\0' up to string length and add a final '\0'.  */
+  *(char*) mempcpy (array->dynarray_header.array + used, str, size) = '\0';
+  return true;
+}
+
+/* Append the character 'c' on char_array 'array'.  */
+__attribute__ ((unused, nonnull (1)))
+static bool
+char_array_append_char (struct char_array *array, char c)
+{
+  return char_array_append_str (array, (const char[]) { c, '\0' });
+}
+
+/* Replace the contents starting of position 'pos' of char_array 'array'
+   with the contents of string 'str' up to 'len' bytes.  A final '\0'
+   is appended in the string.  */
+__attribute__ ((unused, nonnull (1, 3)))
+static bool
+char_array_replace_str_pos (struct char_array *array, size_t pos,
+                            const char *str, size_t len)
+{
+  if (pos > array->dynarray_header.used)
+    __libc_dynarray_at_failure (array->dynarray_header.used, pos);
+
+  size_t newsize;
+  if (check_add_overflow_size_t (pos, len, &newsize)
+      || check_add_overflow_size_t (newsize, 1, &newsize)
+      || !char_array_resize (array, newsize))
+    return false;
+
+  __char_array_replace_str_pos (&array->dynarray_abstract, pos, str, len);
+  return true;
+}
diff --git a/malloc/char_array.h b/malloc/char_array.h
new file mode 100644
index 0000000000..93df0d9fbd
--- /dev/null
+++ b/malloc/char_array.h
@@ -0,0 +1,53 @@
+/* Specialized dynarray for C strings.  Shared definitions.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _CHAR_ARRAY_H
+#define _CHAR_ARRAY_H
+
+#include <malloc/dynarray.h>
+
+/* Internal funciton.  Set the dynarray to the content of the string STR up
+   to SIZE bytes.  The dynarray must be resized previously.  */
+void __char_array_set_str_size (struct dynarray_header *, const char *str,
+				size_t size);
+
+/* Internal function.  Remove the character at position POS from dynarray.
+   The position must be a valid one.  */
+void __char_array_erase (struct dynarray_header *, size_t pos);
+
+/* Internal function.  Prepend the content of string STR up to SIZE bytes to
+   dynarray by moving USED bytes forward.  The dynarray must be resized
+   previously.  */
+void __char_array_prepend_str_size (struct dynarray_header *,
+				    const char *str, size_t size,
+				    size_t used);
+
+/* Internal function.  Replace the content of dynarray starting at position
+   POS with the content of string STR up to LEN bytes.  The dynarray must
+   be resize previously and STR must contain at least LEN bytes.  */
+void __char_array_replace_str_pos (struct dynarray_header *, size_t pos,
+				   const char *str, size_t len);
+
+#ifndef _ISOMAC
+libc_hidden_proto (__char_array_set_str_size)
+libc_hidden_proto (__char_array_erase)
+libc_hidden_proto (__char_array_prepend_str_size)
+libc_hidden_proto (__char_array_replace_str_pos)
+#endif
+
+#endif
diff --git a/malloc/dynarray.h b/malloc/dynarray.h
index 24ca1620d2..ddff3e7855 100644
--- a/malloc/dynarray.h
+++ b/malloc/dynarray.h
@@ -168,12 +168,21 @@ bool __libc_dynarray_finalize (struct dynarray_header *list, void *scratch,
 void __libc_dynarray_at_failure (size_t size, size_t index)
   __attribute__ ((noreturn));
 
+/* Internal function.  Terminate the process after an overflow in
+   new size allocation.  SIZE is the current number of elements in
+   dynamic array and INCR is the new elements to add on current
+   size.  */
+void __libc_dynarray_overflow_failure (size_t size, size_t incr)
+  __attribute__ ((noreturn));
+
 #ifndef _ISOMAC
 libc_hidden_proto (__libc_dynarray_emplace_enlarge)
 libc_hidden_proto (__libc_dynarray_resize)
 libc_hidden_proto (__libc_dynarray_resize_clear)
 libc_hidden_proto (__libc_dynarray_finalize)
 libc_hidden_proto (__libc_dynarray_at_failure)
+libc_hidden_proto (__libc_dynarray_overflow_failure)
+
 #endif
 
 #endif /* _DYNARRAY_H */
diff --git a/malloc/dynarray_overflow_failure.c b/malloc/dynarray_overflow_failure.c
new file mode 100644
index 0000000000..2bcef25efd
--- /dev/null
+++ b/malloc/dynarray_overflow_failure.c
@@ -0,0 +1,31 @@
+/* Report an dynamic array size overflow condition.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <dynarray.h>
+#include <stdio.h>
+
+void
+__libc_dynarray_overflow_failure (size_t size, size_t incr)
+{
+  char buf[200];
+  __snprintf (buf, sizeof (buf), "Fatal glibc error: "
+              "new size overflows (old %zu and increment %zu)\n",
+              size, incr);
+ __libc_fatal (buf);
+}
+libc_hidden_def (__libc_dynarray_overflow_failure)
diff --git a/malloc/malloc-internal.h b/malloc/malloc-internal.h
index 258f29584e..2d13868ed9 100644
--- a/malloc/malloc-internal.h
+++ b/malloc/malloc-internal.h
@@ -77,4 +77,18 @@ void __malloc_arena_thread_freeres (void) attribute_hidden;
 /* Activate a standard set of debugging hooks. */
 void __malloc_check_init (void) attribute_hidden;
 
+/* Set *R = A + B.  Return true if the answer is mathematically incorrect due
+   to overflow; in this case, *R is the low order bits of the correct
+   answer.  */
+static inline bool
+check_add_overflow_size_t (size_t a, size_t b, size_t *r)
+{
+#if __GNUC__ >= 5
+  return __builtin_add_overflow (a, b, r);
+#else
+  *r = a + b;
+  return *r < a;
+#endif
+}
+
 #endif /* _MALLOC_INTERNAL_H */
diff --git a/malloc/tst-char_array.c b/malloc/tst-char_array.c
new file mode 100644
index 0000000000..745ecf4c98
--- /dev/null
+++ b/malloc/tst-char_array.c
@@ -0,0 +1,112 @@
+/* Test for char_array.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#include <malloc/char_array-skeleton.c>
+
+#include <malloc.h>
+#include <mcheck.h>
+#include <stdint.h>
+#include <support/check.h>
+#include <support/support.h>
+
+static int
+do_test (void)
+{
+  mtrace ();
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_empty (&str) == true);
+    TEST_VERIFY_EXIT (char_array_length (&str) == 0);
+    TEST_VERIFY_EXIT (char_array_is_empty (&str) == true);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "") == 0);
+    char_array_free (&str);
+  }
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_str (&str, "testing"));
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing"));
+    TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's');
+    TEST_VERIFY_EXIT (char_array_is_empty (&str) == false);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testing") == 0);
+    char_array_free (&str);
+  }
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_str_size (&str, "testing", 4));
+    TEST_VERIFY_EXIT (char_array_length (&str) == 4);
+    TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's');
+    TEST_VERIFY_EXIT (char_array_is_empty (&str) == false);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "test") == 0);
+    char_array_free (&str);
+  }
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_str (&str, "testing"));
+    TEST_VERIFY_EXIT (char_array_set_str (&str, "abcdef"));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcdef") == 0);
+    TEST_VERIFY_EXIT (char_array_set_str_size (&str, "abcdef", 4));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcd") == 0);
+    char_array_free (&str);
+  }
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_str (&str, "testing"));
+    TEST_VERIFY_EXIT (char_array_erase (&str, 4) == true);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testng") == 0);
+    TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str))
+		      == false);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1);
+    TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str) - 1)
+		      == true);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 2);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testn") == 0);
+    char_array_free (&str);
+  }
+
+  {
+    struct char_array str;
+    TEST_VERIFY_EXIT (char_array_init_str (&str, "test"));
+    TEST_VERIFY_EXIT (char_array_prepend_str (&str, "123"));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test"));
+    TEST_VERIFY_EXIT (char_array_append_str (&str, "456"));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test456") == 0);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test456"));
+    TEST_VERIFY_EXIT (char_array_replace_str_pos (&str, 7, "789", 3));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test789") == 0);
+    TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test789"));
+    TEST_VERIFY_EXIT (char_array_crop (&str, 7));
+    TEST_VERIFY_EXIT (char_array_length (&str) == 7);
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0);
+    TEST_VERIFY_EXIT (char_array_append_char (&str, '4'));
+    TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test4") == 0);
+    char_array_free (&str);
+  }
+
+  return 0;
+}
+
+#include <support/test-driver.c>
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/8] posix: Use char_array for internal glob dirname
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 1/8] malloc: Add specialized dynarray for C strings Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-03-23 16:08   ` Arjun Shankar
  2021-01-05 18:58 ` [PATCH 3/8] posix: Remove alloca usage for GLOB_BRACE on glob Adhemerval Zanella
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This is the first patch of the set to remove alloca usage on glob
implementation.  Internal path to search for file might expand to a
non static directory derived from pattern for some difference cases
(GLOB_NOESCAPE, GNU GLOB_TILDE) and to allow a non-static dirname
path glob uses a lot of boilerplate code to manage the buffer (which
is either allocated using alloca or malloc depending both to size
requested and the total alloca_used).

The patch changes to use the char_array struct with the default size
(256 bytes).  It simplifies all the allocation code by using char_array
one and every internal buffer access is done using char_array provided
functions.  No functional changes are expected.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 275 +++++++++++++++++++++------------------------------
 1 file changed, 111 insertions(+), 164 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index 32c88e5d15..8c6e1914c5 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -85,6 +85,7 @@
 #include <flexmember.h>
 #include <glob_internal.h>
 #include <scratch_buffer.h>
+#include <malloc/char_array-skeleton.c>
 \f
 static const char *next_brace_sub (const char *begin, int flags) __THROWNL;
 
@@ -298,16 +299,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         glob_t *pglob)
 {
   const char *filename;
-  char *dirname = NULL;
   size_t dirlen;
   int status;
   size_t oldcount;
   int meta;
-  int dirname_modified;
-  int malloc_dirname = 0;
   glob_t dirs;
   int retval = 0;
   size_t alloca_used = 0;
+  struct char_array dirname;
+  bool dirname_modified;
 
   if (pattern == NULL || pglob == NULL || (flags & ~__GLOB_FLAGS) != 0)
     {
@@ -315,6 +315,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       return -1;
     }
 
+  /* Default char array is stack allocated, so there is no need to check
+     if setting the initial '\0' succeeds.  */
+  char_array_init_empty (&dirname);
+
   /* POSIX requires all slashes to be matched.  This means that with
      a trailing slash we must match only directories.  */
   if (pattern[0] && pattern[strlen (pattern) - 1] == '/')
@@ -335,12 +339,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           size_t i;
 
           if (pglob->gl_offs >= ~((size_t) 0) / sizeof (char *))
-            return GLOB_NOSPACE;
+            goto err_nospace;
 
           pglob->gl_pathv = (char **) malloc ((pglob->gl_offs + 1)
                                               * sizeof (char *));
           if (pglob->gl_pathv == NULL)
-            return GLOB_NOSPACE;
+            goto err_nospace;
 
           for (i = 0; i <= pglob->gl_offs; ++i)
             pglob->gl_pathv[i] = NULL;
@@ -392,7 +396,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               onealt = malloc (pattern_len);
               if (onealt == NULL)
-                return GLOB_NOSPACE;
+                goto err_nospace;
             }
 
           /* We know the prefix for all sub-patterns.  */
@@ -454,7 +458,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                       globfree (pglob);
                       pglob->gl_pathc = 0;
                     }
-                  return result;
+                  retval = result;
+                  goto out;
                 }
 
               if (*next == '}')
@@ -471,9 +476,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
           if (pglob->gl_pathc != firstc)
             /* We found some entries.  */
-            return 0;
+            retval = 0;
           else if (!(flags & (GLOB_NOCHECK|GLOB_NOMAGIC)))
-            return GLOB_NOMATCH;
+            retval = GLOB_NOMATCH;
+          goto out;
         }
     }
 
@@ -492,14 +498,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
     filename = strchr (pattern, ':');
 #endif /* __MSDOS__ || WINDOWS32 */
 
-  dirname_modified = 0;
+  dirname_modified = false;
   if (filename == NULL)
     {
       /* This can mean two things: a simple name or "~name".  The latter
          case is nothing but a notation for a directory.  */
       if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
         {
-          dirname = (char *) pattern;
+         if (!char_array_set_str (&dirname, pattern))
+           goto err_nospace;
           dirlen = strlen (pattern);
 
           /* Set FILENAME to NULL as a special flag.  This is ugly but
@@ -516,7 +523,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             }
 
           filename = pattern;
-          dirname = (char *) ".";
+         if (!char_array_set_str (&dirname, "."))
+           goto err_nospace;
           dirlen = 0;
         }
     }
@@ -525,13 +533,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                && (flags & GLOB_NOESCAPE) == 0))
     {
       /* "/pattern" or "\\/pattern".  */
-      dirname = (char *) "/";
+      if (!char_array_set_str (&dirname, "/"))
+       goto err_nospace;
       dirlen = 1;
       ++filename;
     }
   else
     {
-      char *newp;
       dirlen = filename - pattern;
 #if defined __MSDOS__ || defined WINDOWS32
       if (*filename == ':'
@@ -545,31 +553,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           /* For now, disallow wildcards in the drive spec, to
              prevent infinite recursion in glob.  */
           if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
-            return GLOB_NOMATCH;
+           {
+             retval = GLOB_NOMATCH;
+             goto out;
+           }
           /* If this is "d:pattern", we need to copy ':' to DIRNAME
              as well.  If it's "d:/pattern", don't remove the slash
              from "d:/", since "d:" and "d:/" are not the same.*/
         }
 #endif
 
-      if (glob_use_alloca (alloca_used, dirlen + 1))
-        newp = alloca_account (dirlen + 1, alloca_used);
-      else
-        {
-          newp = malloc (dirlen + 1);
-          if (newp == NULL)
-            return GLOB_NOSPACE;
-          malloc_dirname = 1;
-        }
-      *((char *) mempcpy (newp, pattern, dirlen)) = '\0';
-      dirname = newp;
+      if (!char_array_set_str_size (&dirname, pattern, dirlen))
+       goto err_nospace;
       ++filename;
 
 #if defined __MSDOS__ || defined WINDOWS32
       bool drive_root = (dirlen > 1
-                         && (dirname[dirlen - 1] == ':'
-                             || (dirlen > 2 && dirname[dirlen - 2] == ':'
-                                 && dirname[dirlen - 1] == '/')));
+                         && (char_array_pos (&dirname, dirlen - 1) != ':'
+                             || (dirlen > 2
+                             && char_array_pos (&dirname, dirlen - 2) != ':'
+                                 && char_array_pos (&dirname, dirlen - 1) != '/')));
 #else
       bool drive_root = false;
 #endif
@@ -578,20 +581,24 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         /* "pattern/".  Expand "pattern", appending slashes.  */
         {
           int orig_flags = flags;
-          if (!(flags & GLOB_NOESCAPE) && dirname[dirlen - 1] == '\\')
+          if (!(flags & GLOB_NOESCAPE)
+              && char_array_pos (&dirname, dirlen - 1) == '\\')
             {
               /* "pattern\\/".  Remove the final backslash if it hasn't
                  been quoted.  */
-              char *p = (char *) &dirname[dirlen - 1];
-
-              while (p > dirname && p[-1] == '\\') --p;
-              if ((&dirname[dirlen] - p) & 1)
+              size_t p = dirlen - 1;
+              while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
+              if ((dirlen - p) & 1)
                 {
-                  *(char *) &dirname[--dirlen] = '\0';
+                  /* Since we are shrinking the array, there is no need to
+                     check the function return.  */
+                  dirlen -= 1;
+                  char_array_crop (&dirname, dirlen);
                   flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
                 }
             }
-          int val = __glob (dirname, flags | GLOB_MARK, errfunc, pglob);
+          int val = __glob (char_array_str (&dirname), flags | GLOB_MARK,
+                            errfunc, pglob);
           if (val == 0)
             pglob->gl_flags = ((pglob->gl_flags & ~GLOB_MARK)
                                | (flags & GLOB_MARK));
@@ -608,11 +615,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         }
     }
 
-  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && dirname[0] == '~')
+  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK))
+      && char_array_pos (&dirname, 0) == '~')
     {
-      if (dirname[1] == '\0' || dirname[1] == '/'
-          || (!(flags & GLOB_NOESCAPE) && dirname[1] == '\\'
-              && (dirname[2] == '\0' || dirname[2] == '/')))
+      if (char_array_pos (&dirname, 1) == '\0'
+         || char_array_pos (&dirname, 1) == '/'
+         || (!(flags & GLOB_NOESCAPE) && char_array_pos (&dirname, 1) == '\\'
+             && (char_array_pos (&dirname, 2) == '\0'
+                || char_array_pos (&dirname, 2) == '/')))
         {
           /* Look up home directory.  */
           char *home_dir = getenv ("HOME");
@@ -663,10 +673,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   if (err != ERANGE)
                     break;
                   if (!scratch_buffer_grow (&s))
-                    {
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
+                  goto err_nospace;
                 }
               if (err == 0)
                 {
@@ -675,10 +682,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 }
               scratch_buffer_free (&s);
               if (err == 0 && home_dir == NULL)
-                {
-                  retval = GLOB_NOSPACE;
-                  goto out;
-                }
+              goto err_nospace;
 #endif /* WINDOWS32 */
             }
           if (home_dir == NULL || home_dir[0] == '\0')
@@ -697,53 +701,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 }
             }
           /* Now construct the full directory.  */
-          if (dirname[1] == '\0')
+          if (char_array_pos (&dirname, 1) == '\0')
             {
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
-
-              dirname = home_dir;
-              dirlen = strlen (dirname);
-              malloc_dirname = malloc_home_dir;
+              if (!char_array_set_str (&dirname, home_dir))
+                goto err_nospace;
+              dirlen = char_array_size (&dirname) - 1;
             }
           else
             {
-              char *newp;
-              size_t home_len = strlen (home_dir);
-              int use_alloca = glob_use_alloca (alloca_used, home_len + dirlen);
-              if (use_alloca)
-                newp = alloca_account (home_len + dirlen, alloca_used);
-              else
-                {
-                  newp = malloc (home_len + dirlen);
-                  if (newp == NULL)
-                    {
-                      if (__glibc_unlikely (malloc_home_dir))
-                        free (home_dir);
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                }
-
-              mempcpy (mempcpy (newp, home_dir, home_len),
-                       &dirname[1], dirlen);
-
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
-
-              dirname = newp;
-              dirlen += home_len - 1;
-              malloc_dirname = !use_alloca;
-
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
+              /* Replaces '~' by the obtained HOME dir.  */
+              char_array_erase (&dirname, 0);
+              if (!char_array_prepend_str (&dirname, home_dir))
+               goto err_nospace;
             }
-          dirname_modified = 1;
+          dirname_modified = true;
         }
       else
         {
 #ifndef WINDOWS32
-          char *end_name = strchr (dirname, '/');
+          char *dirnamestr = char_array_at (&dirname, 0);
+          char *end_name = strchr (dirnamestr, '/');
           char *user_name;
           int malloc_user_name = 0;
           char *unescape = NULL;
@@ -752,23 +729,23 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               if (end_name == NULL)
                 {
-                  unescape = strchr (dirname, '\\');
+                unescape = strchr (dirnamestr, '\\');
                   if (unescape)
                     end_name = strchr (unescape, '\0');
                 }
               else
-                unescape = memchr (dirname, '\\', end_name - dirname);
+              unescape = memchr (dirnamestr, '\\', end_name - dirnamestr);
             }
           if (end_name == NULL)
-            user_name = dirname + 1;
+            user_name = dirnamestr + 1;
           else
             {
               char *newp;
-              if (glob_use_alloca (alloca_used, end_name - dirname))
-                newp = alloca_account (end_name - dirname, alloca_used);
+              if (glob_use_alloca (alloca_used, end_name - dirnamestr))
+                newp = alloca_account (end_name - dirnamestr, alloca_used);
               else
                 {
-                  newp = malloc (end_name - dirname);
+                  newp = malloc (end_name - dirnamestr);
                   if (newp == NULL)
                     {
                       retval = GLOB_NOSPACE;
@@ -778,8 +755,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 }
               if (unescape != NULL)
                 {
-                  char *p = mempcpy (newp, dirname + 1,
-                                     unescape - dirname - 1);
+                  char *p = mempcpy (newp, dirnamestr + 1,
+                                     unescape - dirnamestr - 1);
                   char *q = unescape;
                   while (q != end_name)
                     {
@@ -801,8 +778,9 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   *p = '\0';
                 }
               else
-                *((char *) mempcpy (newp, dirname + 1, end_name - dirname - 1))
-                  = '\0';
+                *((char *) mempcpy (newp, dirnamestr + 1,
+                                    end_name - dirnamestr - 1))
+                   = '\0';
               user_name = newp;
             }
 
@@ -835,39 +813,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             /* If we found a home directory use this.  */
             if (p != NULL)
               {
-                size_t home_len = strlen (p->pw_dir);
-                size_t rest_len = end_name == NULL ? 0 : strlen (end_name);
-                /* dirname contains end_name; we can't free it now.  */
-                char *prev_dirname =
-                  (__glibc_unlikely (malloc_dirname) ? dirname : NULL);
-                char *d;
-
-                malloc_dirname = 0;
-
-                if (glob_use_alloca (alloca_used, home_len + rest_len + 1))
-                  dirname = alloca_account (home_len + rest_len + 1,
-                                            alloca_used);
-                else
+                if (!char_array_set_str (&dirname, p->pw_dir))
                   {
-                    dirname = malloc (home_len + rest_len + 1);
-                    if (dirname == NULL)
-                      {
-                        free (prev_dirname);
-                        scratch_buffer_free (&pwtmpbuf);
-                        retval = GLOB_NOSPACE;
-                        goto out;
-                      }
-                    malloc_dirname = 1;
+                    scratch_buffer_free (&pwtmpbuf);
+                    goto err_nospace;
                   }
-                d = mempcpy (dirname, p->pw_dir, home_len);
-                if (end_name != NULL)
-                  d = mempcpy (d, end_name, rest_len);
-                *d = '\0';
-
-                free (prev_dirname);
 
-                dirlen = home_len + rest_len;
-                dirname_modified = 1;
+                dirlen = strlen (p->pw_dir);
+                dirname_modified = true;
               }
             else
               {
@@ -908,37 +861,33 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         goto nospace;
       pglob->gl_pathv = new_gl_pathv;
 
-      if (flags & GLOB_MARK && is_dir (dirname, flags, pglob))
+      if (flags & GLOB_MARK
+          && is_dir (char_array_str (&dirname), flags, pglob))
         {
           char *p;
           pglob->gl_pathv[newcount] = malloc (dirlen + 2);
           if (pglob->gl_pathv[newcount] == NULL)
             goto nospace;
-          p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
+          p = mempcpy (pglob->gl_pathv[newcount],
+                       char_array_str (&dirname), dirlen);
           p[0] = '/';
           p[1] = '\0';
-          if (__glibc_unlikely (malloc_dirname))
-            free (dirname);
         }
       else
         {
-          if (__glibc_unlikely (malloc_dirname))
-            pglob->gl_pathv[newcount] = dirname;
-          else
-            {
-              pglob->gl_pathv[newcount] = strdup (dirname);
-              if (pglob->gl_pathv[newcount] == NULL)
-                goto nospace;
-            }
+          pglob->gl_pathv[newcount] = strdup (char_array_str (&dirname));
+          if (pglob->gl_pathv[newcount] == NULL)
+            goto nospace;
         }
       pglob->gl_pathv[++newcount] = NULL;
       ++pglob->gl_pathc;
       pglob->gl_flags = flags;
 
-      return 0;
+      goto out;
     }
 
-  meta = __glob_pattern_type (dirname, !(flags & GLOB_NOESCAPE));
+  meta = __glob_pattern_type (char_array_str (&dirname),
+                              !(flags & GLOB_NOESCAPE));
   /* meta is 1 if correct glob pattern containing metacharacters.
      If meta has bit (1 << 2) set, it means there was an unterminated
      [ which we handle the same, using fnmatch.  Broken unterminated
@@ -951,15 +900,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
          the pattern in each directory found.  */
       size_t i;
 
-      if (!(flags & GLOB_NOESCAPE) && dirlen > 0 && dirname[dirlen - 1] == '\\')
+      if (!(flags & GLOB_NOESCAPE) && dirlen > 0
+          && char_array_pos (&dirname, dirlen - 1) == '\\')
         {
           /* "foo\\/bar".  Remove the final backslash from dirname
              if it has not been quoted.  */
-          char *p = (char *) &dirname[dirlen - 1];
-
-          while (p > dirname && p[-1] == '\\') --p;
-          if ((&dirname[dirlen] - p) & 1)
-            *(char *) &dirname[--dirlen] = '\0';
+          size_t p = dirlen - 1;
+          while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
+          if ((dirlen - p) & 1)
+            char_array_crop (&dirname, --dirlen);
         }
 
       if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) != 0))
@@ -973,7 +922,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           dirs.gl_lstat = pglob->gl_lstat;
         }
 
-      status = __glob (dirname,
+      status = __glob (char_array_str (&dirname),
                        ((flags & (GLOB_ERR | GLOB_NOESCAPE | GLOB_ALTDIRFUNC))
                         | GLOB_NOSORT | GLOB_ONLYDIR),
                        errfunc, &dirs);
@@ -1020,8 +969,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               globfree (&dirs);
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              goto err_nospace;
             }
         }
 
@@ -1043,8 +991,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 {
                 nospace2:
                   globfree (&dirs);
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  goto err_nospace;
                 }
 
               new_gl_pathv = realloc (pglob->gl_pathv,
@@ -1059,8 +1006,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   globfree (&dirs);
                   globfree (pglob);
                   pglob->gl_pathc = 0;
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  goto err_nospace;
                 }
 
               ++pglob->gl_pathc;
@@ -1086,7 +1032,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
       if (meta & GLOBPAT_BACKSLASH)
         {
-          char *p = strchr (dirname, '\\'), *q;
+          char *p = strchr (char_array_str (&dirname), '\\'), *q;
           /* We need to unescape the dirname string.  It is certainly
              allocated by alloca, as otherwise filename would be NULL
              or dirname wouldn't contain backslashes.  */
@@ -1103,12 +1049,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               ++q;
             }
           while (*p++ != '\0');
-          dirname_modified = 1;
+          dirname_modified = true;
         }
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
-      status = glob_in_dir (filename, dirname, flags, errfunc, pglob,
-                            alloca_used);
+      status = glob_in_dir (filename, char_array_str (&dirname), flags,
+                            errfunc, pglob, alloca_used);
       if (status != 0)
         {
           if (status == GLOB_NOMATCH && flags != orig_flags
@@ -1126,14 +1072,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       if (dirlen > 0)
         {
           /* Stick the directory on the front of each name.  */
-          if (prefix_array (dirname,
+          if (prefix_array (char_array_str (&dirname),
                             &pglob->gl_pathv[old_pathc + pglob->gl_offs],
                             pglob->gl_pathc - old_pathc))
             {
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              goto err_nospace;
             }
         }
     }
@@ -1152,8 +1097,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               {
                 globfree (pglob);
                 pglob->gl_pathc = 0;
-                retval = GLOB_NOSPACE;
-                goto out;
+                goto err_nospace;
               }
             strcpy (&new[len - 2], "/");
             pglob->gl_pathv[i] = new;
@@ -1169,10 +1113,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
     }
 
  out:
-  if (__glibc_unlikely (malloc_dirname))
-    free (dirname);
 
+  char_array_free (&dirname);
   return retval;
+
+ err_nospace:
+  char_array_free (&dirname);
+  return GLOB_NOSPACE;
 }
 #if defined _LIBC && !defined __glob
 versioned_symbol (libc, __glob, glob, GLIBC_2_27);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/8] posix: Remove alloca usage for GLOB_BRACE on glob
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 1/8] malloc: Add specialized dynarray for C strings Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 2/8] posix: Use char_array for internal glob dirname Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 4/8] posix: Remove alloca usage on glob dirname Adhemerval Zanella
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

GNU GLOB_BRACE internal implementation constructs a new expression and
calls glob recursively.  It then requires a possible large temporary
buffer place the new pattern.

This patch removes the alloca/malloc usage and replaces it with
char_array.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 53 ++++++++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index 8c6e1914c5..41e1aeacad 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -382,25 +382,23 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           /* Allocate working buffer large enough for our work.  Note that
              we have at least an opening and closing brace.  */
           size_t firstc;
-          char *alt_start;
           const char *p;
           const char *next;
           const char *rest;
           size_t rest_len;
-          char *onealt;
-          size_t pattern_len = strlen (pattern) - 1;
-          int alloca_onealt = glob_use_alloca (alloca_used, pattern_len);
-          if (alloca_onealt)
-            onealt = alloca_account (pattern_len, alloca_used);
-          else
-            {
-              onealt = malloc (pattern_len);
-              if (onealt == NULL)
-                goto err_nospace;
-            }
+          struct char_array onealt;
 
           /* We know the prefix for all sub-patterns.  */
-          alt_start = mempcpy (onealt, pattern, begin - pattern);
+          ptrdiff_t onealtlen = begin - pattern;
+          if (!char_array_init_str_size (&onealt, pattern, onealtlen))
+            {
+              if (!(flags & GLOB_APPEND))
+                {
+                  pglob->gl_pathc = 0;
+                  pglob->gl_pathv = NULL;
+                }
+              goto err_nospace;
+            }
 
           /* Find the first sub-pattern and at the same time find the
              rest after the closing brace.  */
@@ -408,9 +406,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           if (next == NULL)
             {
               /* It is an invalid expression.  */
-            illegal_brace:
-              if (__glibc_unlikely (!alloca_onealt))
-                free (onealt);
+              char_array_free (&onealt);
               flags &= ~GLOB_BRACE;
               goto no_brace;
             }
@@ -421,8 +417,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               rest = next_brace_sub (rest + 1, flags);
               if (rest == NULL)
-                /* It is an illegal expression.  */
-                goto illegal_brace;
+                {
+                  /* It is an illegal expression.  */
+                  char_array_free (&onealt);
+                  flags &= ~GLOB_BRACE;
+                  goto no_brace;
+                }
             }
           /* Please note that we now can be sure the brace expression
              is well-formed.  */
@@ -441,9 +441,16 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               int result;
 
               /* Construct the new glob expression.  */
-              mempcpy (mempcpy (alt_start, p, next - p), rest, rest_len);
+              ptrdiff_t nextlen = next - p;
+              if (!char_array_replace_str_pos (&onealt, onealtlen, p, nextlen)
+                  || !char_array_replace_str_pos (&onealt, onealtlen + nextlen,
+                                                  rest, rest_len))
+                {
+                  char_array_free (&onealt);
+                  goto err_nospace;
+                }
 
-              result = __glob (onealt,
+              result = __glob (char_array_str (&onealt),
                                ((flags & ~(GLOB_NOCHECK | GLOB_NOMAGIC))
                                 | GLOB_APPEND),
                                errfunc, pglob);
@@ -451,8 +458,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               /* If we got an error, return it.  */
               if (result && result != GLOB_NOMATCH)
                 {
-                  if (__glibc_unlikely (!alloca_onealt))
-                    free (onealt);
+                  char_array_free (&onealt);
                   if (!(flags & GLOB_APPEND))
                     {
                       globfree (pglob);
@@ -471,8 +477,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               assert (next != NULL);
             }
 
-          if (__glibc_unlikely (!alloca_onealt))
-            free (onealt);
+          char_array_free (&onealt);
 
           if (pglob->gl_pathc != firstc)
             /* We found some entries.  */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/8] posix: Remove alloca usage on glob dirname
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (2 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 3/8] posix: Remove alloca usage for GLOB_BRACE on glob Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 5/8] posix: Use dynarray for globname in glob Adhemerval Zanella
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This patch replaces the alloca/malloc usage for dirname creation
by the char_array struct.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 28 +++++++++-------------------
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index 41e1aeacad..d734ca977c 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -1212,7 +1212,6 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
              int (*errfunc) (const char *, int),
              glob_t *pglob, size_t alloca_used)
 {
-  size_t dirlen = strlen (directory);
   void *stream = NULL;
 # define GLOBNAMES_MEMBERS(nnames) \
     struct globnames *next; size_t count; char *name[nnames];
@@ -1244,32 +1243,23 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
     }
   else if (meta == GLOBPAT_NONE)
     {
-      size_t patlen = strlen (pattern);
-      size_t fullsize;
-      bool alloca_fullname
-        = (! size_add_wrapv (dirlen + 1, patlen + 1, &fullsize)
-           && glob_use_alloca (alloca_used, fullsize));
-      char *fullname;
-      if (alloca_fullname)
-        fullname = alloca_account (fullsize, alloca_used);
-      else
+      struct char_array fullname;
+
+      if (!char_array_init_str (&fullname, directory)
+          || !char_array_append_str (&fullname, "/")
+          || !char_array_append_str (&fullname, pattern))
         {
-          fullname = malloc (fullsize);
-          if (fullname == NULL)
-            return GLOB_NOSPACE;
+          char_array_free (&fullname);
+          return GLOB_NOSPACE;
         }
 
-      mempcpy (mempcpy (mempcpy (fullname, directory, dirlen),
-                        "/", 1),
-               pattern, patlen + 1);
-      if (glob_lstat (pglob, flags, fullname) == 0
+      if (glob_lstat (pglob, flags, char_array_str (&fullname)) == 0
           || errno == EOVERFLOW)
         /* We found this file to be existing.  Now tell the rest
            of the function to copy this name into the result.  */
         flags |= GLOB_NOCHECK;
 
-      if (__glibc_unlikely (!alloca_fullname))
-        free (fullname);
+      char_array_free (&fullname);
     }
   else
     {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/8] posix: Use dynarray for globname in glob
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (3 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 4/8] posix: Remove alloca usage on glob dirname Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 6/8] posix: Remove alloca usage on glob user_name Adhemerval Zanella
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This patch uses dynarray at glob internal glob_in_dir function to manage
the various matched patterns.  It simplify and removes all the boilerplate
buffer managements required.  It also removes the glob_use_alloca, since
it is not used anymore.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 126 +++++++++++++++------------------------------------
 1 file changed, 37 insertions(+), 89 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index d734ca977c..d199204931 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -1203,6 +1203,21 @@ prefix_array (const char *dirname, char **array, size_t n)
   return 0;
 }
 
+struct globnames_result
+{
+  char **names;
+  size_t length;
+};
+
+/* Create a dynamic array for C string representing the glob name found.  */
+#define DYNARRAY_STRUCT            globnames_array
+#define DYNARRAY_ELEMENT_FREE(ptr) free (*ptr)
+#define DYNARRAY_ELEMENT           char *
+#define DYNARRAY_PREFIX            globnames_array_
+#define DYNARRAY_FINAL_TYPE        struct globnames_result
+#define DYNARRAY_INITIAL_SIZE      64
+#include <malloc/dynarray-skeleton.c>
+
 /* Like 'glob', but PATTERN is a final pathname component,
    and matches are searched for in DIRECTORY.
    The GLOB_NOSORT bit in FLAGS is ignored.  No sorting is ever done.
@@ -1213,25 +1228,13 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
              glob_t *pglob, size_t alloca_used)
 {
   void *stream = NULL;
-# define GLOBNAMES_MEMBERS(nnames) \
-    struct globnames *next; size_t count; char *name[nnames];
-  struct globnames { GLOBNAMES_MEMBERS (FLEXIBLE_ARRAY_MEMBER) };
-  struct { GLOBNAMES_MEMBERS (64) } init_names_buf;
-  struct globnames *init_names = (struct globnames *) &init_names_buf;
-  struct globnames *names = init_names;
-  struct globnames *names_alloca = init_names;
+  struct globnames_array globnames;
   size_t nfound = 0;
-  size_t cur = 0;
   int meta;
   int save;
   int result;
 
-  alloca_used += sizeof init_names_buf;
-
-  init_names->next = NULL;
-  init_names->count = ((sizeof init_names_buf
-                        - offsetof (struct globnames, name))
-                       / sizeof init_names->name[0]);
+  globnames_array_init (&globnames);
 
   meta = __glob_pattern_type (pattern, !(flags & GLOB_NOESCAPE));
   if (meta == GLOBPAT_NONE && (flags & (GLOB_NOCHECK|GLOB_NOMAGIC)))
@@ -1308,34 +1311,10 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
 
               if (fnmatch (pattern, d.name, fnm_flags) == 0)
                 {
-                  if (cur == names->count)
-                    {
-                      struct globnames *newnames;
-                      size_t count = names->count * 2;
-                      size_t nameoff = offsetof (struct globnames, name);
-                      size_t size = FLEXSIZEOF (struct globnames, name,
-                                                count * sizeof (char *));
-                      if ((SIZE_MAX - nameoff) / 2 / sizeof (char *)
-                          < names->count)
-                        goto memory_error;
-                      if (glob_use_alloca (alloca_used, size))
-                        newnames = names_alloca
-                          = alloca_account (size, alloca_used);
-                      else if ((newnames = malloc (size))
-                               == NULL)
-                        goto memory_error;
-                      newnames->count = count;
-                      newnames->next = names;
-                      names = newnames;
-                      cur = 0;
-                    }
-                  names->name[cur] = strdup (d.name);
-                  if (names->name[cur] == NULL)
-                    goto memory_error;
-                  ++cur;
-                  ++nfound;
-                  if (SIZE_MAX - pglob->gl_offs <= nfound)
+                  globnames_array_add (&globnames, strdup (d.name));
+                  if (globnames_array_has_failed (&globnames))
                     goto memory_error;
+                  nfound++;
                 }
             }
         }
@@ -1345,10 +1324,13 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
     {
       size_t len = strlen (pattern);
       nfound = 1;
-      names->name[cur] = malloc (len + 1);
-      if (names->name[cur] == NULL)
+      char *newp = malloc (len + 1);
+      if (newp == NULL)
         goto memory_error;
-      *((char *) mempcpy (names->name[cur++], pattern, len)) = '\0';
+      *((char *) mempcpy (newp, pattern, len)) = '\0';
+      globnames_array_add (&globnames, newp);
+      if (globnames_array_has_failed (&globnames))
+       goto memory_error;
     }
 
   result = GLOB_NOMATCH;
@@ -1369,59 +1351,25 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
       if (new_gl_pathv == NULL)
         {
         memory_error:
-          while (1)
-            {
-              struct globnames *old = names;
-              for (size_t i = 0; i < cur; ++i)
-                free (names->name[i]);
-              names = names->next;
-              /* NB: we will not leak memory here if we exit without
-                 freeing the current block assigned to OLD.  At least
-                 the very first block is always allocated on the stack
-                 and this is the block assigned to OLD here.  */
-              if (names == NULL)
-                {
-                  assert (old == init_names);
-                  break;
-                }
-              cur = names->count;
-              if (old == names_alloca)
-                names_alloca = names;
-              else
-                free (old);
-            }
+          globnames_array_free (&globnames);
           result = GLOB_NOSPACE;
         }
       else
         {
-          while (1)
+          struct globnames_result ret = { .names = 0, .length = -1 };
+          if (!globnames_array_finalize (&globnames, &ret))
+            result = GLOB_NOSPACE;
+          else
             {
-              struct globnames *old = names;
-              for (size_t i = 0; i < cur; ++i)
+              for (size_t i = 0; i < ret.length; ++i)
                 new_gl_pathv[pglob->gl_offs + pglob->gl_pathc++]
-                  = names->name[i];
-              names = names->next;
-              /* NB: we will not leak memory here if we exit without
-                 freeing the current block assigned to OLD.  At least
-                 the very first block is always allocated on the stack
-                 and this is the block assigned to OLD here.  */
-              if (names == NULL)
-                {
-                  assert (old == init_names);
-                  break;
-                }
-              cur = names->count;
-              if (old == names_alloca)
-                names_alloca = names;
-              else
-                free (old);
+                  = ret.names[i];
+              pglob->gl_pathv = new_gl_pathv;
+              pglob->gl_pathv[pglob->gl_offs + pglob->gl_pathc] = NULL;
+              pglob->gl_flags = flags;
             }
 
-          pglob->gl_pathv = new_gl_pathv;
-
-          pglob->gl_pathv[pglob->gl_offs + pglob->gl_pathc] = NULL;
-
-          pglob->gl_flags = flags;
+          free (ret.names);
         }
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] posix: Remove alloca usage on glob user_name
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (4 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 5/8] posix: Use dynarray for globname in glob Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 7/8] posix: Use char_array for home_dir in glob Adhemerval Zanella
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This patch uses dynarray at glob internal user name manipulation for
GLOB_TILDE.  It simplifies it and removes all the boilerplate buffer
managements required.  It also removes the glob_use_alloca, since
it is not used anymore.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 86 ++++++++++++++++------------------------------------
 1 file changed, 26 insertions(+), 60 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index d199204931..c126b4501d 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -42,13 +42,13 @@
 
 #ifndef WINDOWS32
 # include <pwd.h>
+# include <alloca.h>
 #endif
 
 #include <errno.h>
 #include <dirent.h>
 #include <stdlib.h>
 #include <string.h>
-#include <alloca.h>
 
 #ifdef _LIBC
 # undef strdup
@@ -215,29 +215,6 @@ glob_lstat (glob_t *pglob, int flags, const char *fullname)
           : LSTAT64 (fullname, &ust.st64));
 }
 
-/* Set *R = A + B.  Return true if the answer is mathematically
-   incorrect due to overflow; in this case, *R is the low order
-   bits of the correct answer.  */
-
-static bool
-size_add_wrapv (size_t a, size_t b, size_t *r)
-{
-#if 7 <= __GNUC__ && !defined __ICC
-  return __builtin_add_overflow (a, b, r);
-#else
-  *r = a + b;
-  return *r < a;
-#endif
-}
-
-static bool
-glob_use_alloca (size_t alloca_used, size_t len)
-{
-  size_t size;
-  return (!size_add_wrapv (alloca_used, len, &size)
-          && __libc_use_alloca (size));
-}
-
 static int glob_in_dir (const char *pattern, const char *directory,
                         int flags, int (*errfunc) (const char *, int),
                         glob_t *pglob, size_t alloca_used);
@@ -724,11 +701,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       else
         {
 #ifndef WINDOWS32
-          char *dirnamestr = char_array_at (&dirname, 0);
-          char *end_name = strchr (dirnamestr, '/');
-          char *user_name;
-          int malloc_user_name = 0;
-          char *unescape = NULL;
+          const char *dirnamestr = char_array_str (&dirname);
+          const char *end_name = strchr (dirnamestr, '/');
+          struct char_array user_name;
+          const char *unescape = NULL;
 
           if (!(flags & GLOB_NOESCAPE))
             {
@@ -742,27 +718,19 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               unescape = memchr (dirnamestr, '\\', end_name - dirnamestr);
             }
           if (end_name == NULL)
-            user_name = dirnamestr + 1;
+            {
+              if (!char_array_init_str (&user_name, dirnamestr + 1))
+                goto err_nospace;
+            }
           else
             {
-              char *newp;
-              if (glob_use_alloca (alloca_used, end_name - dirnamestr))
-                newp = alloca_account (end_name - dirnamestr, alloca_used);
-              else
-                {
-                  newp = malloc (end_name - dirnamestr);
-                  if (newp == NULL)
-                    {
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                  malloc_user_name = 1;
-                }
               if (unescape != NULL)
                 {
-                  char *p = mempcpy (newp, dirnamestr + 1,
-                                     unescape - dirnamestr - 1);
-                  char *q = unescape;
+                  ptrdiff_t name_len = unescape - dirnamestr - 1;
+                  if (!char_array_init_str_size (&user_name, dirnamestr + 1,
+                                                 name_len))
+                    goto err_nospace;
+                  const char *q = unescape;
                   while (q != end_name)
                     {
                       if (*q == '\\')
@@ -773,20 +741,21 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                                  but "~fo\\o\\/" unescape to user_name
                                  "foo".  */
                               if (filename == NULL)
-                                *p++ = '\\';
+                                char_array_append_char (&user_name, '\\');
                               break;
                             }
                           ++q;
                         }
-                      *p++ = *q++;
+                      char_array_append_char (&user_name, *q++);
                     }
-                  *p = '\0';
                 }
               else
-                *((char *) mempcpy (newp, dirnamestr + 1,
-                                    end_name - dirnamestr - 1))
-                   = '\0';
-              user_name = newp;
+                {
+                  ptrdiff_t name_len = end_name - dirnamestr - 1;
+                  if (!char_array_init_str_size (&user_name, dirnamestr + 1,
+                                                 name_len))
+                    goto err_nospace;
+                }
             }
 
           /* Look up specific user's home directory.  */
@@ -798,7 +767,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 #  if defined HAVE_GETPWNAM_R || defined _LIBC
             struct passwd pwbuf;
 
-            while (getpwnam_r (user_name, &pwbuf,
+            while (getpwnam_r (char_array_str (&user_name), &pwbuf,
                                pwtmpbuf.data, pwtmpbuf.length, &p)
                    == ERANGE)
               {
@@ -809,11 +778,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   }
               }
 #  else
-            p = getpwnam (user_name);
+            p = getpwnam (char_array_str (&user_name));
 #  endif
 
-            if (__glibc_unlikely (malloc_user_name))
-              free (user_name);
+            char_array_free (&user_name);
 
             /* If we found a home directory use this.  */
             if (p != NULL)
@@ -1038,9 +1006,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       if (meta & GLOBPAT_BACKSLASH)
         {
           char *p = strchr (char_array_str (&dirname), '\\'), *q;
-          /* We need to unescape the dirname string.  It is certainly
-             allocated by alloca, as otherwise filename would be NULL
-             or dirname wouldn't contain backslashes.  */
+          /* We need to unescape the dirname string.  */
           q = p;
           do
             {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] posix: Use char_array for home_dir in glob
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (5 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 6/8] posix: Remove alloca usage on glob user_name Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-05 18:58 ` [PATCH 8/8] posix: Remove all alloca usage " Adhemerval Zanella
  2021-01-13 19:36 ` [PATCH 0/8] Remove alloca usage from glob Paul Eggert
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

This patch uses dynarray at glob internal home directory ame
manipulation for GLOB_TILDE.  It simplifies it and removes all the
boilerplate buffer managements required.

Checked x86_64-linux-gnu.
---
 posix/glob.c | 69 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index c126b4501d..b6727ee884 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -606,10 +606,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
              && (char_array_pos (&dirname, 2) == '\0'
                 || char_array_pos (&dirname, 2) == '/')))
         {
+          struct char_array home_dir;
+
+          const char *home_env = getenv ("HOME");
+          home_env = home_env == NULL ? "" : home_env;
+          if (!char_array_init_str (&home_dir, home_env))
+            goto err_nospace;
+
+          if (char_array_is_empty (&home_dir))
           /* Look up home directory.  */
-          char *home_dir = getenv ("HOME");
-          int malloc_home_dir = 0;
-          if (home_dir == NULL || home_dir[0] == '\0')
             {
 #ifdef WINDOWS32
               /* Windows NT defines HOMEDRIVE and HOMEPATH.  But give
@@ -619,16 +624,21 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
               if (home_drive != NULL && home_path != NULL)
                 {
-                  size_t home_drive_len = strlen (home_drive);
-                  size_t home_path_len = strlen (home_path);
-                  char *mem = alloca (home_drive_len + home_path_len + 1);
-
-                  memcpy (mem, home_drive, home_drive_len);
-                  memcpy (mem + home_drive_len, home_path, home_path_len + 1);
-                  home_dir = mem;
+                  if (!char_array_set_str (&home_dir, home_drive)
+                      || !char_array_append_str (&home_dir, home_path))
+                   {
+                     char_array_free (&home_dir);
+                     goto err_nospace;
+                   }
                 }
               else
-                home_dir = "c:/users/default"; /* poor default */
+                {
+                  if (!char_array_set_str (&home_dir, "c:/users/default"))
+                    {
+                      char_array_free (&home_dir);
+                      goto err_nospace;
+                    }
+                }
 #else
               int err;
               struct passwd *p;
@@ -657,44 +667,49 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   if (!scratch_buffer_grow (&s))
                   goto err_nospace;
                 }
-              if (err == 0)
-                {
-                  home_dir = strdup (p->pw_dir);
-                  malloc_home_dir = 1;
-                }
+              if (err == 0 && !char_array_set_str (&home_dir, p->pw_dir))
+                err = 1;
               scratch_buffer_free (&s);
-              if (err == 0 && home_dir == NULL)
-              goto err_nospace;
+              if (err == 0)
+                goto err_nospace;
 #endif /* WINDOWS32 */
             }
-          if (home_dir == NULL || home_dir[0] == '\0')
+          if (char_array_is_empty (&home_dir))
             {
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
               if (flags & GLOB_TILDE_CHECK)
                 {
+                  char_array_free (&home_dir);
                   retval = GLOB_NOMATCH;
                   goto out;
                 }
               else
                 {
-                  home_dir = (char *) "~"; /* No luck.  */
-                  malloc_home_dir = 0;
+                  if (!char_array_set_str (&home_dir, "~"))
+                    {
+                      char_array_free (&home_dir);
+                      goto err_nospace;
+                    }
                 }
             }
           /* Now construct the full directory.  */
+          bool e = true;
           if (char_array_pos (&dirname, 1) == '\0')
             {
-              if (!char_array_set_str (&dirname, home_dir))
-                goto err_nospace;
+              e = char_array_set_str (&dirname, char_array_str (&home_dir));
               dirlen = char_array_size (&dirname) - 1;
             }
           else
             {
               /* Replaces '~' by the obtained HOME dir.  */
               char_array_erase (&dirname, 0);
-              if (!char_array_prepend_str (&dirname, home_dir))
-               goto err_nospace;
+              e = char_array_prepend_str (&dirname,
+                                          char_array_str (&home_dir));
+            }
+          if (e == false)
+            {
+              char_array_free (&dirname);
+              char_array_free (&home_dir);
+              goto err_nospace;
             }
           dirname_modified = true;
         }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] posix: Remove all alloca usage in glob
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (6 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 7/8] posix: Use char_array for home_dir in glob Adhemerval Zanella
@ 2021-01-05 18:58 ` Adhemerval Zanella
  2021-01-13 19:36 ` [PATCH 0/8] Remove alloca usage from glob Paul Eggert
  8 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-01-05 18:58 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert, bug-gnulib

With alloca usage removal from glob this patch wraps it up by removing
all the alloca defines and macros usage.

Checked on x86_64-linux-gnu.
---
 posix/glob.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/posix/glob.c b/posix/glob.c
index b6727ee884..c7a89a5298 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -42,7 +42,6 @@
 
 #ifndef WINDOWS32
 # include <pwd.h>
-# include <alloca.h>
 #endif
 
 #include <errno.h>
@@ -74,7 +73,6 @@
 # define __fxstatat64(_, d, f, st, flag) fstatat (d, f, st, flag)
 # define struct_stat64          struct stat
 # ifndef __MVS__
-#  define __alloca              alloca
 # endif
 # define __readdir              readdir
 # define COMPILE_GLOB64
@@ -184,12 +182,6 @@ convert_dirent64 (const struct dirent64 *source)
 # ifdef GNULIB_defined_closedir
 #  undef closedir
 # endif
-
-/* Just use malloc.  */
-# define __libc_use_alloca(n) false
-# define alloca_account(len, avar) ((void) (len), (void) (avar), (void *) 0)
-# define extend_alloca_account(buf, len, newlen, avar) \
-    ((void) (buf), (void) (len), (void) (newlen), (void) (avar), (void *) 0)
 #endif
 
 static int
@@ -217,7 +209,7 @@ glob_lstat (glob_t *pglob, int flags, const char *fullname)
 
 static int glob_in_dir (const char *pattern, const char *directory,
                         int flags, int (*errfunc) (const char *, int),
-                        glob_t *pglob, size_t alloca_used);
+                        glob_t *pglob);
 static int prefix_array (const char *prefix, char **array, size_t n) __THROWNL;
 static int collated_compare (const void *, const void *) __THROWNL;
 
@@ -282,7 +274,6 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
   int meta;
   glob_t dirs;
   int retval = 0;
-  size_t alloca_used = 0;
   struct char_array dirname;
   bool dirname_modified;
 
@@ -530,11 +521,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           char *drive_spec;
 
           ++dirlen;
-          drive_spec = __alloca (dirlen + 1);
+          drive_spec = malloca (dirlen + 1);
           *((char *) mempcpy (drive_spec, pattern, dirlen)) = '\0';
           /* For now, disallow wildcards in the drive spec, to
              prevent infinite recursion in glob.  */
-          if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
+          int r = __glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE));
+          free (drive_spec);
+          if (r != 0)
            {
              retval = GLOB_NOMATCH;
              goto out;
@@ -935,7 +928,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           status = glob_in_dir (filename, dirs.gl_pathv[i],
                                 ((flags | GLOB_APPEND)
                                  & ~(GLOB_NOCHECK | GLOB_NOMAGIC)),
-                                errfunc, pglob, alloca_used);
+                                errfunc, pglob);
           if (status == GLOB_NOMATCH)
             /* No matches in this directory.  Try the next.  */
             continue;
@@ -1040,7 +1033,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
       status = glob_in_dir (filename, char_array_str (&dirname), flags,
-                            errfunc, pglob, alloca_used);
+                            errfunc, pglob);
       if (status != 0)
         {
           if (status == GLOB_NOMATCH && flags != orig_flags
@@ -1206,7 +1199,7 @@ struct globnames_result
 static int
 glob_in_dir (const char *pattern, const char *directory, int flags,
              int (*errfunc) (const char *, int),
-             glob_t *pglob, size_t alloca_used)
+             glob_t *pglob)
 {
   void *stream = NULL;
   struct globnames_array globnames;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/8] Remove alloca usage from glob
  2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
                   ` (7 preceding siblings ...)
  2021-01-05 18:58 ` [PATCH 8/8] posix: Remove all alloca usage " Adhemerval Zanella
@ 2021-01-13 19:36 ` Paul Eggert
  8 siblings, 0 replies; 12+ messages in thread
From: Paul Eggert @ 2021-01-13 19:36 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: bug-gnulib, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]

On 1/5/21 10:58 AM, Adhemerval Zanella wrote:
> The idea of removing the alloca allows a slight better code generation,
> simplifies the boilerplate code to avoid the unbounded alloca usage,
> and it plays better with security compiler mitigation tools (such as the
> ones for stack clash).

Instead of complicating dynarray by adding char_array stuff, I suggest 
adding any primitives just to glob for now, as it's not clear that 
they're generally useful.

I do see a problem with the proposed patch set, in that it creates 
several different dynarrays when glob really needs only one or two. The 
existing code is full of memory-allocation gotchas and would be 
simplified if it treated the memory it allocates as a first-class part 
of the problem rather than as some sort of cranky subsidiary that needs 
to be babied and its diaper changed at random intervals.

Attached is a draft set of patches against Gnulib commit 
6a00fdb4bb105697aa27ba97ef7ec33287790ad3 which gives a hint about the 
sort of thing that I mean here. This patch set is not complete (it does 
only the equivalent of the first four of the patches you proposed) and 
it's not exactly the form that I want, so I haven't installed it into 
Gnulib. However, I hope it shows the sort of thing I have in mind. So 
far, it's needed only one scratch buffer.

In this patch set, glob.c continues to use scratch_buffer.h because 
scratch buffers are good enough for all the changes needed so far. I 
suppose dynarrays will be helpful for later patches and that we can 
switch to them as needed, but I wanted to focus on the actual problem 
first rather than worrying about scratch buffers vs dynarrays.

[-- Attachment #2: 0001-glob-use-scratch_buffer-for-internal-glob-dirname.patch --]
[-- Type: text/x-patch, Size: 29786 bytes --]

From 7bd5987cf0ed6668d34a921866c94b112594e714 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 17:24:17 -0800
Subject: [PATCH 1/3] glob: use scratch_buffer for internal glob dirname

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html
* lib/glob.c (scratch_string): New static function.
(__glob): Just init and free a scratch buffer, and let glob_buf
do the real work.  This simplifies memory allocation cleanup.
(glob_buf): New  static function, like the old __glob but with
a new struct scratch_buffer arg.  Use the scratch buffer instead
of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only].
home_dir, user_name.  Use bool for internal booleans.
---
 ChangeLog  |  14 ++
 lib/glob.c | 432 +++++++++++++++++++----------------------------------
 2 files changed, 171 insertions(+), 275 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1e589aac3..bf235ce49 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
+
+	glob: use scratch_buffer for internal glob dirname
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html
+	* lib/glob.c (scratch_string): New static function.
+	(__glob): Just init and free a scratch buffer, and let glob_buf
+	do the real work.  This simplifies memory allocation cleanup.
+	(glob_buf): New  static function, like the old __glob but with
+	a new struct scratch_buffer arg.  Use the scratch buffer instead
+	of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only].
+	home_dir, user_name.  Use bool for internal booleans.
+
 2021-01-08  Paul Eggert  <eggert@cs.ucla.edu>
 
 	dynarray: work even if ‘free’ is replaced
diff --git a/lib/glob.c b/lib/glob.c
index 32c88e5d1..44d2fdd5f 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -237,6 +237,8 @@ glob_use_alloca (size_t alloca_used, size_t len)
           && __libc_use_alloca (size));
 }
 
+static int glob_buf (char const *, int, int (*) (char const *, int), glob_t *,
+                     struct scratch_buffer *);
 static int glob_in_dir (const char *pattern, const char *directory,
                         int flags, int (*errfunc) (const char *, int),
                         glob_t *pglob, size_t alloca_used);
@@ -244,6 +246,20 @@ static int prefix_array (const char *prefix, char **array, size_t n) __THROWNL;
 static int collated_compare (const void *, const void *) __THROWNL;
 
 
+/* Using GLOBBUF for storage, return a string with contents equal to
+   BUF with length BUFLEN.  Terminate the string with a null byte.
+   Return NULL on memory allocation failure.  */
+static char *
+scratch_string (struct scratch_buffer *globbuf, char const *buf, size_t buflen)
+{
+  if (!scratch_buffer_set_array_size (globbuf, buflen + 1, 1))
+    return NULL;
+  char *copy = globbuf->data;
+  char *copy_end = mempcpy (copy, buf, buflen);
+  *copy_end = '\0';
+  return copy;
+}
+
 /* Return true if FILENAME is a directory or a symbolic link to a directory.
    Use FLAGS and PGLOB to resolve the filename.  */
 static bool
@@ -290,23 +306,32 @@ next_brace_sub (const char *cp, int flags)
    it is called with the pathname that caused the error, and the
    'errno' value from the failing call; if it returns non-zero
    'glob' returns GLOB_ABORTED; if it returns zero, the error is ignored.
-   If memory cannot be allocated for PGLOB, GLOB_NOSPACE is returned.
+   If memory cannot be allocated, return GLOB_NOSPACE.
    Otherwise, 'glob' returns zero.  */
 int
 GLOB_ATTRIBUTE
 __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         glob_t *pglob)
+{
+  struct scratch_buffer globbuf;
+  scratch_buffer_init (&globbuf);
+  int result = glob_buf (pattern, flags, errfunc, pglob, &globbuf);
+  scratch_buffer_free (&globbuf);
+  return result;
+}
+
+/* Like 'glob', but with an additional scratch buffer arg GLOBBUF.  */
+static int
+glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
+          glob_t *pglob, struct scratch_buffer *globbuf)
 {
   const char *filename;
-  char *dirname = NULL;
+  char *dirname;
   size_t dirlen;
   int status;
   size_t oldcount;
   int meta;
-  int dirname_modified;
-  int malloc_dirname = 0;
   glob_t dirs;
-  int retval = 0;
   size_t alloca_used = 0;
 
   if (pattern == NULL || pglob == NULL || (flags & ~__GLOB_FLAGS) != 0)
@@ -492,20 +517,21 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
     filename = strchr (pattern, ':');
 #endif /* __MSDOS__ || WINDOWS32 */
 
-  dirname_modified = 0;
+  bool dirname_modified = false;
   if (filename == NULL)
     {
       /* This can mean two things: a simple name or "~name".  The latter
          case is nothing but a notation for a directory.  */
       if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
         {
-          dirname = (char *) pattern;
           dirlen = strlen (pattern);
+          dirname = scratch_string (globbuf, pattern, dirlen);
+          if (dirname == NULL)
+            return GLOB_NOSPACE;
 
-          /* Set FILENAME to NULL as a special flag.  This is ugly but
+          /* Keep FILENAME == NULL as a special flag.  This is ugly but
              other solutions would require much more code.  We test for
              this special case below.  */
-          filename = NULL;
         }
       else
         {
@@ -516,7 +542,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             }
 
           filename = pattern;
-          dirname = (char *) ".";
+          dirname = scratch_string (globbuf, ".", 1);
           dirlen = 0;
         }
     }
@@ -525,47 +551,34 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                && (flags & GLOB_NOESCAPE) == 0))
     {
       /* "/pattern" or "\\/pattern".  */
-      dirname = (char *) "/";
+      dirname = scratch_string (globbuf, "/", 1);
       dirlen = 1;
       ++filename;
     }
   else
     {
-      char *newp;
       dirlen = filename - pattern;
+      dirname = scratch_string (globbuf, pattern, dirlen);
+      if (dirname == NULL)
+        return GLOB_NOSPACE;
+      ++filename;
+
 #if defined __MSDOS__ || defined WINDOWS32
-      if (*filename == ':'
-          || (filename > pattern + 1 && filename[-1] == ':'))
+      if (filename[-1] == ':' || (2 < dirlen && filename[-2] == ':'))
         {
-          char *drive_spec;
-
           ++dirlen;
-          drive_spec = __alloca (dirlen + 1);
-          *((char *) mempcpy (drive_spec, pattern, dirlen)) = '\0';
+          dirname = scratch_string (globbuf, pattern, dirlen);
+          if (dirname == NULL)
+            return GLOB_NOSPACE;
           /* For now, disallow wildcards in the drive spec, to
              prevent infinite recursion in glob.  */
-          if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
+          if (__glob_pattern_p (dirname, !(flags & GLOB_NOESCAPE)))
             return GLOB_NOMATCH;
           /* If this is "d:pattern", we need to copy ':' to DIRNAME
              as well.  If it's "d:/pattern", don't remove the slash
              from "d:/", since "d:" and "d:/" are not the same.*/
         }
-#endif
 
-      if (glob_use_alloca (alloca_used, dirlen + 1))
-        newp = alloca_account (dirlen + 1, alloca_used);
-      else
-        {
-          newp = malloc (dirlen + 1);
-          if (newp == NULL)
-            return GLOB_NOSPACE;
-          malloc_dirname = 1;
-        }
-      *((char *) mempcpy (newp, pattern, dirlen)) = '\0';
-      dirname = newp;
-      ++filename;
-
-#if defined __MSDOS__ || defined WINDOWS32
       bool drive_root = (dirlen > 1
                          && (dirname[dirlen - 1] == ':'
                              || (dirlen > 2 && dirname[dirlen - 2] == ':'
@@ -582,12 +595,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               /* "pattern\\/".  Remove the final backslash if it hasn't
                  been quoted.  */
-              char *p = (char *) &dirname[dirlen - 1];
+              char *p = &dirname[dirlen - 1];
 
               while (p > dirname && p[-1] == '\\') --p;
               if ((&dirname[dirlen] - p) & 1)
                 {
-                  *(char *) &dirname[--dirlen] = '\0';
+                  dirname[--dirlen] = '\0';
                   flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
                 }
             }
@@ -603,8 +616,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               oldcount = pglob->gl_pathc + pglob->gl_offs;
               goto no_matches;
             }
-          retval = val;
-          goto out;
+          return val;
         }
     }
 
@@ -615,8 +627,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               && (dirname[2] == '\0' || dirname[2] == '/')))
         {
           /* Look up home directory.  */
-          char *home_dir = getenv ("HOME");
-          int malloc_home_dir = 0;
+          char const *home_dir = getenv ("HOME");
           if (home_dir == NULL || home_dir[0] == '\0')
             {
 #ifdef WINDOWS32
@@ -629,8 +640,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 {
                   size_t home_drive_len = strlen (home_drive);
                   size_t home_path_len = strlen (home_path);
-                  char *mem = alloca (home_drive_len + home_path_len + 1);
-
+                  size_t needed = home_drive_len + home_path_len + 1;
+                  if (!scratch_buffer_set_array_size (globbuf, needed, 1))
+                    return GLOB_NOSPACE;
+                  char *mem = globbuf->data;
                   memcpy (mem, home_drive, home_drive_len);
                   memcpy (mem + home_drive_len, home_path, home_path_len + 1);
                   home_dir = mem;
@@ -638,236 +651,128 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               else
                 home_dir = "c:/users/default"; /* poor default */
 #else
-              int err;
-              struct passwd *p;
-              struct passwd pwbuf;
-              struct scratch_buffer s;
-              scratch_buffer_init (&s);
               while (true)
                 {
-                  p = NULL;
-                  err = __getlogin_r (s.data, s.length);
+                  struct passwd *p = NULL;
+                  char *sdata = globbuf->data;
+                  size_t pwsize = globbuf->length / 2;
+                  size_t loginsize = globbuf->length - pwsize;
+                  int err = __getlogin_r (sdata, loginsize);
                   if (err == 0)
                     {
 # if defined HAVE_GETPWNAM_R || defined _LIBC
-                      size_t ssize = strlen (s.data) + 1;
-                      char *sdata = s.data;
-                      err = getpwnam_r (sdata, &pwbuf, sdata + ssize,
-                                        s.length - ssize, &p);
+                      struct passwd pwbuf;
+                      err = getpwnam_r (sdata, &pwbuf,
+                                        sdata + loginsize, pwsize, &p);
 # else
-                      p = getpwnam (s.data);
+                      errno = 0;
+                      p = getpwnam (sdata);
                       if (p == NULL)
                         err = errno;
+                      else if (globbuf->length <= strlen (p->pw_dir))
+                        {
+                          p = NULL;
+                          err = ERANGE;
+                        }
 # endif
+                      if (p != NULL)
+                        home_dir = strcpy (sdata, p->pw_dir);
                     }
                   if (err != ERANGE)
                     break;
-                  if (!scratch_buffer_grow (&s))
-                    {
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                }
-              if (err == 0)
-                {
-                  home_dir = strdup (p->pw_dir);
-                  malloc_home_dir = 1;
-                }
-              scratch_buffer_free (&s);
-              if (err == 0 && home_dir == NULL)
-                {
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  if (!scratch_buffer_grow (globbuf))
+                    return GLOB_NOSPACE;
                 }
 #endif /* WINDOWS32 */
             }
           if (home_dir == NULL || home_dir[0] == '\0')
             {
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
               if (flags & GLOB_TILDE_CHECK)
-                {
-                  retval = GLOB_NOMATCH;
-                  goto out;
-                }
-              else
-                {
-                  home_dir = (char *) "~"; /* No luck.  */
-                  malloc_home_dir = 0;
-                }
-            }
-          /* Now construct the full directory.  */
-          if (dirname[1] == '\0')
-            {
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
+                return GLOB_NOMATCH;
 
-              dirname = home_dir;
-              dirlen = strlen (dirname);
-              malloc_dirname = malloc_home_dir;
+              home_dir = "~"; /* No luck.  */
             }
-          else
-            {
-              char *newp;
-              size_t home_len = strlen (home_dir);
-              int use_alloca = glob_use_alloca (alloca_used, home_len + dirlen);
-              if (use_alloca)
-                newp = alloca_account (home_len + dirlen, alloca_used);
-              else
-                {
-                  newp = malloc (home_len + dirlen);
-                  if (newp == NULL)
-                    {
-                      if (__glibc_unlikely (malloc_home_dir))
-                        free (home_dir);
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                }
 
-              mempcpy (mempcpy (newp, home_dir, home_len),
-                       &dirname[1], dirlen);
-
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
-
-              dirname = newp;
-              dirlen += home_len - 1;
-              malloc_dirname = !use_alloca;
-
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
-            }
-          dirname_modified = 1;
+          /* Now construct the full directory.  */
+          size_t home_len = strlen (home_dir);
+          if (home_dir == globbuf->data)
+            home_dir = NULL;
+          while (globbuf->length < home_len + dirlen)
+            if (!scratch_buffer_grow_preserve (globbuf))
+              return GLOB_NOSPACE;
+          dirname = globbuf->data;
+          if (home_dir != NULL)
+            memcpy (dirname, home_dir, home_len);
+          char *dirend = mempcpy (dirname + home_len, &pattern[1], dirlen - 1);
+          *dirend = '\0';
+          dirlen += home_len - 1;
+          dirname_modified = true;
         }
       else
         {
 #ifndef WINDOWS32
           char *end_name = strchr (dirname, '/');
-          char *user_name;
-          int malloc_user_name = 0;
-          char *unescape = NULL;
-
-          if (!(flags & GLOB_NOESCAPE))
-            {
-              if (end_name == NULL)
-                {
-                  unescape = strchr (dirname, '\\');
-                  if (unescape)
-                    end_name = strchr (unescape, '\0');
-                }
-              else
-                unescape = memchr (dirname, '\\', end_name - dirname);
-            }
           if (end_name == NULL)
-            user_name = dirname + 1;
-          else
+            end_name = dirname + dirlen;
+          size_t end_offset = end_name - dirname;
+          size_t user_size = 0;
+          for (size_t i = 1; i < end_offset;
+               dirname[user_size++] = dirname[i++])
             {
-              char *newp;
-              if (glob_use_alloca (alloca_used, end_name - dirname))
-                newp = alloca_account (end_name - dirname, alloca_used);
-              else
+              if (! (flags & GLOB_NOESCAPE) && dirname[i] == '\\')
                 {
-                  newp = malloc (end_name - dirname);
-                  if (newp == NULL)
+                  i++;
+                  if (i == end_offset)
                     {
-                      retval = GLOB_NOSPACE;
-                      goto out;
+                      /* "~fo\\o\\" unescapes to user name "foo\\",
+                         but "~fo\\o\\/" unescapes to user name "foo".  */
+                      if (filename == NULL)
+                        dirname[user_size++] = '\\';
+                      break;
                     }
-                  malloc_user_name = 1;
                 }
-              if (unescape != NULL)
-                {
-                  char *p = mempcpy (newp, dirname + 1,
-                                     unescape - dirname - 1);
-                  char *q = unescape;
-                  while (q != end_name)
-                    {
-                      if (*q == '\\')
-                        {
-                          if (q + 1 == end_name)
-                            {
-                              /* "~fo\\o\\" unescape to user_name "foo\\",
-                                 but "~fo\\o\\/" unescape to user_name
-                                 "foo".  */
-                              if (filename == NULL)
-                                *p++ = '\\';
-                              break;
-                            }
-                          ++q;
-                        }
-                      *p++ = *q++;
-                    }
-                  *p = '\0';
-                }
-              else
-                *((char *) mempcpy (newp, dirname + 1, end_name - dirname - 1))
-                  = '\0';
-              user_name = newp;
             }
+          dirname[user_size++] = '\0';
 
           /* Look up specific user's home directory.  */
           {
             struct passwd *p;
-            struct scratch_buffer pwtmpbuf;
-            scratch_buffer_init (&pwtmpbuf);
+            ptrdiff_t home_offset = -1;
 
 #  if defined HAVE_GETPWNAM_R || defined _LIBC
             struct passwd pwbuf;
 
-            while (getpwnam_r (user_name, &pwbuf,
-                               pwtmpbuf.data, pwtmpbuf.length, &p)
+            while (getpwnam_r (dirname, &pwbuf, dirname + user_size,
+                               globbuf->length - user_size, &p)
                    == ERANGE)
               {
-                if (!scratch_buffer_grow (&pwtmpbuf))
-                  {
-                    retval = GLOB_NOSPACE;
-                    goto out;
-                  }
+                if (!scratch_buffer_grow_preserve (globbuf))
+                  return GLOB_NOSPACE;
+                dirname = globbuf->data;
               }
+            if (p != NULL)
+              home_offset = p->pw_dir - dirname;
 #  else
-            p = getpwnam (user_name);
+            p = getpwnam (dirname);
 #  endif
 
-            if (__glibc_unlikely (malloc_user_name))
-              free (user_name);
-
-            /* If we found a home directory use this.  */
             if (p != NULL)
               {
+                /* Use the home directory that we found.  */
                 size_t home_len = strlen (p->pw_dir);
-                size_t rest_len = end_name == NULL ? 0 : strlen (end_name);
-                /* dirname contains end_name; we can't free it now.  */
-                char *prev_dirname =
-                  (__glibc_unlikely (malloc_dirname) ? dirname : NULL);
-                char *d;
-
-                malloc_dirname = 0;
-
-                if (glob_use_alloca (alloca_used, home_len + rest_len + 1))
-                  dirname = alloca_account (home_len + rest_len + 1,
-                                            alloca_used);
-                else
-                  {
-                    dirname = malloc (home_len + rest_len + 1);
-                    if (dirname == NULL)
-                      {
-                        free (prev_dirname);
-                        scratch_buffer_free (&pwtmpbuf);
-                        retval = GLOB_NOSPACE;
-                        goto out;
-                      }
-                    malloc_dirname = 1;
-                  }
-                d = mempcpy (dirname, p->pw_dir, home_len);
-                if (end_name != NULL)
-                  d = mempcpy (d, end_name, rest_len);
-                *d = '\0';
-
-                free (prev_dirname);
-
+                size_t rest_len = dirlen - end_offset;
                 dirlen = home_len + rest_len;
-                dirname_modified = 1;
+                while (globbuf->length <= dirlen)
+                  if (!scratch_buffer_grow_preserve (globbuf))
+                    return GLOB_NOSPACE;
+                dirname = globbuf->data;
+                char *home = (home_offset < 0 ? p->pw_dir
+                              : dirname + home_offset);
+                memmove (dirname, home, home_len);
+                char *dirend = mempcpy (dirname + home_len,
+                                        pattern + end_offset, rest_len);
+                *dirend = '\0';
+                dirname_modified = true;
               }
             else
               {
@@ -875,11 +780,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   {
                     /* We have to regard it as an error if we cannot find the
                        home directory.  */
-                    retval = GLOB_NOMATCH;
-                    goto out;
+                    return GLOB_NOMATCH;
                   }
+
+                /* Restore the original tilde pattern.  */
+                dirname = scratch_string (globbuf, pattern, dirlen);
               }
-            scratch_buffer_free (&pwtmpbuf);
           }
 #endif /* !WINDOWS32 */
         }
@@ -898,8 +804,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           free (pglob->gl_pathv);
           pglob->gl_pathv = NULL;
           pglob->gl_pathc = 0;
-          retval = GLOB_NOSPACE;
-          goto out;
+          return GLOB_NOSPACE;
         }
 
       new_gl_pathv = realloc (pglob->gl_pathv,
@@ -910,27 +815,20 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
       if (flags & GLOB_MARK && is_dir (dirname, flags, pglob))
         {
-          char *p;
-          pglob->gl_pathv[newcount] = malloc (dirlen + 2);
-          if (pglob->gl_pathv[newcount] == NULL)
-            goto nospace;
-          p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
-          p[0] = '/';
-          p[1] = '\0';
-          if (__glibc_unlikely (malloc_dirname))
-            free (dirname);
-        }
-      else
-        {
-          if (__glibc_unlikely (malloc_dirname))
-            pglob->gl_pathv[newcount] = dirname;
-          else
+          if (dirlen + 1 == globbuf->length)
             {
-              pglob->gl_pathv[newcount] = strdup (dirname);
-              if (pglob->gl_pathv[newcount] == NULL)
+              if (!scratch_buffer_grow_preserve (globbuf))
                 goto nospace;
+              dirname = globbuf->data;
             }
+          dirname[dirlen++] = '/';
+          dirname[dirlen] = '\0';
         }
+      pglob->gl_pathv[newcount] = scratch_buffer_dupfree (globbuf, dirlen + 1);
+      if (pglob->gl_pathv[newcount] == NULL)
+        goto nospace;
+      scratch_buffer_init (globbuf);
+
       pglob->gl_pathv[++newcount] = NULL;
       ++pglob->gl_pathc;
       pglob->gl_flags = flags;
@@ -955,11 +853,11 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         {
           /* "foo\\/bar".  Remove the final backslash from dirname
              if it has not been quoted.  */
-          char *p = (char *) &dirname[dirlen - 1];
+          char *p = &dirname[dirlen - 1];
 
           while (p > dirname && p[-1] == '\\') --p;
-          if ((&dirname[dirlen] - p) & 1)
-            *(char *) &dirname[--dirlen] = '\0';
+          dirlen -= (&dirname[dirlen] - p) & 1;
+          dirname[dirlen] = '\0';
         }
 
       if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) != 0))
@@ -980,10 +878,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       if (status != 0)
         {
           if ((flags & GLOB_NOCHECK) == 0 || status != GLOB_NOMATCH)
-            {
-              retval = status;
-              goto out;
-            }
+            return status;
           goto no_matches;
         }
 
@@ -1008,8 +903,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               globfree (&dirs);
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = status;
-              goto out;
+              return status;
             }
 
           /* Stick the directory on the front of each name.  */
@@ -1020,8 +914,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               globfree (&dirs);
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              return GLOB_NOSPACE;
             }
         }
 
@@ -1043,8 +936,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 {
                 nospace2:
                   globfree (&dirs);
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  return GLOB_NOSPACE;
                 }
 
               new_gl_pathv = realloc (pglob->gl_pathv,
@@ -1059,8 +951,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   globfree (&dirs);
                   globfree (pglob);
                   pglob->gl_pathc = 0;
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  return GLOB_NOSPACE;
                 }
 
               ++pglob->gl_pathc;
@@ -1072,8 +963,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           else
             {
               globfree (&dirs);
-              retval = GLOB_NOMATCH;
-              goto out;
+              return GLOB_NOMATCH;
             }
         }
 
@@ -1086,10 +976,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
       if (meta & GLOBPAT_BACKSLASH)
         {
+          /* Unescape DIRNAME.  */
           char *p = strchr (dirname, '\\'), *q;
-          /* We need to unescape the dirname string.  It is certainly
-             allocated by alloca, as otherwise filename would be NULL
-             or dirname wouldn't contain backslashes.  */
           q = p;
           do
             {
@@ -1103,7 +991,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               ++q;
             }
           while (*p++ != '\0');
-          dirname_modified = 1;
+          dirname_modified = true;
         }
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
@@ -1119,8 +1007,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               flags = orig_flags;
               goto no_matches;
             }
-          retval = status;
-          goto out;
+          return status;
         }
 
       if (dirlen > 0)
@@ -1132,8 +1019,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              return GLOB_NOSPACE;
             }
         }
     }
@@ -1152,8 +1038,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               {
                 globfree (pglob);
                 pglob->gl_pathc = 0;
-                retval = GLOB_NOSPACE;
-                goto out;
+                return GLOB_NOSPACE;
               }
             strcpy (&new[len - 2], "/");
             pglob->gl_pathv[i] = new;
@@ -1168,12 +1053,9 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
              sizeof (char *), collated_compare);
     }
 
- out:
-  if (__glibc_unlikely (malloc_dirname))
-    free (dirname);
-
-  return retval;
+  return 0;
 }
+
 #if defined _LIBC && !defined __glob
 versioned_symbol (libc, __glob, glob, GLIBC_2_27);
 libc_hidden_ver (__glob, glob)
-- 
2.27.0


[-- Attachment #3: 0002-glob-use-scratch_buffer-for-GLOB_BRACE.patch --]
[-- Type: text/x-patch, Size: 5491 bytes --]

From eaaf408c6dce7108a4ab47c9fad5009dda64bc04 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 20:59:38 -0800
Subject: [PATCH 2/3] glob: use scratch_buffer for GLOB_BRACE

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html
* lib/glob.c (glob_buf): Calculate pattern length at start.
Use GLOBBUF for temporaries when analyzing brace expressions.
---
 ChangeLog  |  7 +++++++
 lib/glob.c | 40 +++++++++++++++++-----------------------
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bf235ce49..e0a168e7b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,12 @@
 2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
 
+	glob: use scratch_buffer for GLOB_BRACE
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html
+	* lib/glob.c (glob_buf): Calculate pattern length at start.
+	Use GLOBBUF for temporaries when analyzing brace expressions.
+
 	glob: use scratch_buffer for internal glob dirname
 	This is an alternative implementation of a patch for glibc
 	proposed by Adhemerval Zanella in:
diff --git a/lib/glob.c b/lib/glob.c
index 44d2fdd5f..304baebf6 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -342,7 +342,8 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
   /* POSIX requires all slashes to be matched.  This means that with
      a trailing slash we must match only directories.  */
-  if (pattern[0] && pattern[strlen (pattern) - 1] == '/')
+  size_t pattern_len = strlen (pattern);
+  if (pattern_len != 0 && pattern[pattern_len - 1] == '/')
     flags |= GLOB_ONLYDIR;
 
   if (!(flags & GLOB_DOOFFS))
@@ -409,16 +410,13 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           const char *rest;
           size_t rest_len;
           char *onealt;
-          size_t pattern_len = strlen (pattern) - 1;
-          int alloca_onealt = glob_use_alloca (alloca_used, pattern_len);
-          if (alloca_onealt)
-            onealt = alloca_account (pattern_len, alloca_used);
-          else
-            {
-              onealt = malloc (pattern_len);
-              if (onealt == NULL)
-                return GLOB_NOSPACE;
-            }
+
+          /* Allocate room for the pattern with any sub-pattern
+             subtituted for the brace expression.  For simplicity, use
+             the pattern length; this is more than enough room.  */
+          if (!scratch_buffer_set_array_size (globbuf, pattern_len, 1))
+            return GLOB_NOSPACE;
+          onealt = globbuf->data;
 
           /* We know the prefix for all sub-patterns.  */
           alt_start = mempcpy (onealt, pattern, begin - pattern);
@@ -429,9 +427,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           if (next == NULL)
             {
               /* It is an invalid expression.  */
-            illegal_brace:
-              if (__glibc_unlikely (!alloca_onealt))
-                free (onealt);
               flags &= ~GLOB_BRACE;
               goto no_brace;
             }
@@ -442,12 +437,16 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               rest = next_brace_sub (rest + 1, flags);
               if (rest == NULL)
-                /* It is an illegal expression.  */
-                goto illegal_brace;
+                {
+                  /* It is an invalid expression.  */
+                  flags &= ~GLOB_BRACE;
+                  goto no_brace;
+                }
             }
           /* Please note that we now can be sure the brace expression
              is well-formed.  */
-          rest_len = strlen (++rest) + 1;
+          rest_len = pattern + pattern_len - rest;
+          rest++;
 
           /* We have a brace expression.  BEGIN points to the opening {,
              NEXT points past the terminator of the first element, and END
@@ -472,8 +471,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
               /* If we got an error, return it.  */
               if (result && result != GLOB_NOMATCH)
                 {
-                  if (__glibc_unlikely (!alloca_onealt))
-                    free (onealt);
                   if (!(flags & GLOB_APPEND))
                     {
                       globfree (pglob);
@@ -491,9 +488,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
               assert (next != NULL);
             }
 
-          if (__glibc_unlikely (!alloca_onealt))
-            free (onealt);
-
           if (pglob->gl_pathc != firstc)
             /* We found some entries.  */
             return 0;
@@ -524,7 +518,7 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
          case is nothing but a notation for a directory.  */
       if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
         {
-          dirlen = strlen (pattern);
+          dirlen = pattern_len;
           dirname = scratch_string (globbuf, pattern, dirlen);
           if (dirname == NULL)
             return GLOB_NOSPACE;
-- 
2.27.0


[-- Attachment #4: 0003-glob-use-scratch_buffer-for-glob_in_dir-GLOBPAT_NONE.patch --]
[-- Type: text/x-patch, Size: 5498 bytes --]

From 2a6bd9769d8eca95d63365bc8131b2ebc88294e6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 22:31:35 -0800
Subject: [PATCH 3/3] glob: use scratch_buffer for glob_in_dir GLOBPAT_NONE

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html
* lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG +
DIRLEN args, to make it easy to append to it.  All callers
changed.
---
 ChangeLog  |  8 ++++++++
 lib/glob.c | 41 +++++++++++++++++------------------------
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e0a168e7b..2cab81b60 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,13 @@
 2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
 
+	glob: use scratch_buffer for glob_in_dir GLOBPAT_NONE
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html
+	* lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG +
+	DIRLEN args, to make it easy to append to it.  All callers
+	changed.
+
 	glob: use scratch_buffer for GLOB_BRACE
 	This is an alternative implementation of a patch for glibc
 	proposed by Adhemerval Zanella in:
diff --git a/lib/glob.c b/lib/glob.c
index 304baebf6..f3c229035 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -239,7 +239,8 @@ glob_use_alloca (size_t alloca_used, size_t len)
 
 static int glob_buf (char const *, int, int (*) (char const *, int), glob_t *,
                      struct scratch_buffer *);
-static int glob_in_dir (const char *pattern, const char *directory,
+static int glob_in_dir (const char *pattern,
+                        struct scratch_buffer *globbuf, size_t dirlen,
                         int flags, int (*errfunc) (const char *, int),
                         glob_t *pglob, size_t alloca_used);
 static int prefix_array (const char *prefix, char **array, size_t n) __THROWNL;
@@ -884,7 +885,9 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           size_t old_pathc;
 
           old_pathc = pglob->gl_pathc;
-          status = glob_in_dir (filename, dirs.gl_pathv[i],
+          size_t dirs_i_len = strlen (dirs.gl_pathv[i]);
+          scratch_string (globbuf, dirs.gl_pathv[i], dirs_i_len);
+          status = glob_in_dir (filename, globbuf, dirs_i_len,
                                 ((flags | GLOB_APPEND)
                                  & ~(GLOB_NOCHECK | GLOB_NOMAGIC)),
                                 errfunc, pglob, alloca_used);
@@ -989,7 +992,7 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
         }
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
-      status = glob_in_dir (filename, dirname, flags, errfunc, pglob,
+      status = glob_in_dir (filename, globbuf, dirlen, flags, errfunc, pglob,
                             alloca_used);
       if (status != 0)
         {
@@ -1128,15 +1131,16 @@ prefix_array (const char *dirname, char **array, size_t n)
 }
 
 /* Like 'glob', but PATTERN is a final pathname component,
-   and matches are searched for in DIRECTORY.
+   and matches are searched for in the directory name in GLOBBUF.
    The GLOB_NOSORT bit in FLAGS is ignored.  No sorting is ever done.
    The GLOB_APPEND flag is assumed to be set (always appends).  */
 static int
-glob_in_dir (const char *pattern, const char *directory, int flags,
+glob_in_dir (const char *pattern, struct scratch_buffer *globbuf,
+             size_t dirlen, int flags,
              int (*errfunc) (const char *, int),
              glob_t *pglob, size_t alloca_used)
 {
-  size_t dirlen = strlen (directory);
+  char *directory = globbuf->data;
   void *stream = NULL;
 # define GLOBNAMES_MEMBERS(nnames) \
     struct globnames *next; size_t count; char *name[nnames];
@@ -1169,31 +1173,20 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
   else if (meta == GLOBPAT_NONE)
     {
       size_t patlen = strlen (pattern);
-      size_t fullsize;
-      bool alloca_fullname
-        = (! size_add_wrapv (dirlen + 1, patlen + 1, &fullsize)
-           && glob_use_alloca (alloca_used, fullsize));
-      char *fullname;
-      if (alloca_fullname)
-        fullname = alloca_account (fullsize, alloca_used);
-      else
+      while (globbuf->length - dirlen < patlen + 2)
         {
-          fullname = malloc (fullsize);
-          if (fullname == NULL)
+          if (!scratch_buffer_grow_preserve (globbuf))
             return GLOB_NOSPACE;
+          directory = globbuf->data;
         }
-
-      mempcpy (mempcpy (mempcpy (fullname, directory, dirlen),
-                        "/", 1),
-               pattern, patlen + 1);
-      if (glob_lstat (pglob, flags, fullname) == 0
+      directory[dirlen] = '/';
+      strcpy (directory + dirlen + 1, pattern);
+      if (glob_lstat (pglob, flags, directory) == 0
           || errno == EOVERFLOW)
         /* We found this file to be existing.  Now tell the rest
            of the function to copy this name into the result.  */
         flags |= GLOB_NOCHECK;
-
-      if (__glibc_unlikely (!alloca_fullname))
-        free (fullname);
+      directory[dirlen] = '\0';
     }
   else
     {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/8] posix: Use char_array for internal glob dirname
  2021-01-05 18:58 ` [PATCH 2/8] posix: Use char_array for internal glob dirname Adhemerval Zanella
@ 2021-03-23 16:08   ` Arjun Shankar
  2021-03-24 17:39     ` Adhemerval Zanella
  0 siblings, 1 reply; 12+ messages in thread
From: Arjun Shankar @ 2021-03-23 16:08 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: bug-gnulib, Paul Eggert, libc-alpha

Hi Adhemerval,

> This is the first patch of the set to remove alloca usage on glob
> implementation.  Internal path to search for file might expand to a
> non static directory derived from pattern for some difference cases
> (GLOB_NOESCAPE, GNU GLOB_TILDE) and to allow a non-static dirname
> path glob uses a lot of boilerplate code to manage the buffer (which
> is either allocated using alloca or malloc depending both to size
> requested and the total alloca_used).
> 
> The patch changes to use the char_array struct with the default size
> (256 bytes).  It simplifies all the allocation code by using char_array
> one and every internal buffer access is done using char_array provided
> functions.  No functional changes are expected.

I've been going through this patch series. Here are my comments on the
first one. I have mostly been looking at this from the point of view of
making sure the logic remains equivalent, and not the way you tackled
the problem itself.

In summary, I found a comparison reversed, a missed concatenation, and
some minor indent issues. Other than that, this patch looks good to me.

Details:

> diff --git a/posix/glob.c b/posix/glob.c
> index 32c88e5d15..8c6e1914c5 100644
> --- a/posix/glob.c
> +++ b/posix/glob.c
> @@ -85,6 +85,7 @@
>  #include <flexmember.h>
>  #include <glob_internal.h>
>  #include <scratch_buffer.h>
> +#include <malloc/char_array-skeleton.c>

Include the new dynamic character array implementation. OK.

Note:
The patch that introduces char_array-skeleton.c will need a slight adjustment
after de0e1b45b0ab (malloc: Sync dynarray with gnulib) due to the removal
of the anonymous union.

>  static const char *next_brace_sub (const char *begin, int flags) __THROWNL;
>  
> @@ -298,16 +299,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>          glob_t *pglob)
>  {
>    const char *filename;
> -  char *dirname = NULL;
>    size_t dirlen;
>    int status;
>    size_t oldcount;
>    int meta;
> -  int dirname_modified;
> -  int malloc_dirname = 0;
>    glob_t dirs;
>    int retval = 0;
>    size_t alloca_used = 0;
> +  struct char_array dirname;
> +  bool dirname_modified;

dirname changes type, dirname_modified should be a boolean, and malloc_dirname
is now redundant.

OK.
 
>    if (pattern == NULL || pglob == NULL || (flags & ~__GLOB_FLAGS) != 0)
>      {
> @@ -315,6 +315,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>        return -1;
>      }
>  
> +  /* Default char array is stack allocated, so there is no need to check
> +     if setting the initial '\0' succeeds.  */
> +  char_array_init_empty (&dirname);
> +
>    /* POSIX requires all slashes to be matched.  This means that with
>       a trailing slash we must match only directories.  */
>    if (pattern[0] && pattern[strlen (pattern) - 1] == '/')

OK.

> @@ -335,12 +339,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>            size_t i;
>  
>            if (pglob->gl_offs >= ~((size_t) 0) / sizeof (char *))
> -            return GLOB_NOSPACE;
> +            goto err_nospace;
>  
>            pglob->gl_pathv = (char **) malloc ((pglob->gl_offs + 1)
>                                                * sizeof (char *));

err_nospace frees dirname and returns GLOB_NOSPACE.
So the code is equivalent.
OK.

>            if (pglob->gl_pathv == NULL)
> -            return GLOB_NOSPACE;
> +            goto err_nospace;
>  
>            for (i = 0; i <= pglob->gl_offs; ++i)
>              pglob->gl_pathv[i] = NULL;
> @@ -392,7 +396,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>              {
>                onealt = malloc (pattern_len);
>                if (onealt == NULL)
> -                return GLOB_NOSPACE;
> +                goto err_nospace;
>              }
>  
>            /* We know the prefix for all sub-patterns.  */

OK. Same.

> @@ -454,7 +458,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                        globfree (pglob);
>                        pglob->gl_pathc = 0;
>                      }
> -                  return result;
> +                  retval = result;
> +                  goto out;
>                  }
>  
>                if (*next == '}')

out frees dirname and returns retval.
So the code is equivalent.
OK.

> @@ -471,9 +476,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>  
>            if (pglob->gl_pathc != firstc)
>              /* We found some entries.  */
> -            return 0;
> +            retval = 0;

We will continue at 'out', which will also return after freeing dirname.
So the code remains equivalent.
OK.

>            else if (!(flags & (GLOB_NOCHECK|GLOB_NOMAGIC)))
> -            return GLOB_NOMATCH;
> +            retval = GLOB_NOMATCH;
> +          goto out;
>          }
>      }
>  

OK. Same here.
 
> @@ -492,14 +498,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>      filename = strchr (pattern, ':');
>  #endif /* __MSDOS__ || WINDOWS32 */
>  
> -  dirname_modified = 0;
> +  dirname_modified = false;
>    if (filename == NULL)
>      {

OK. It's declared as a boolean now.

>        /* This can mean two things: a simple name or "~name".  The latter
>           case is nothing but a notation for a directory.  */
>        if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
>          {
> -          dirname = (char *) pattern;
> +         if (!char_array_set_str (&dirname, pattern))
> +           goto err_nospace;
>            dirlen = strlen (pattern);
>  
>            /* Set FILENAME to NULL as a special flag.  This is ugly but

OK. It's still equivalent. Since char_array_set_str can lead to a failed
allocation, we add a check and exit with error if that happens.
Indent is a bit off, though.

> @@ -516,7 +523,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>              }
>  
>            filename = pattern;
> -          dirname = (char *) ".";
> +         if (!char_array_set_str (&dirname, "."))
> +           goto err_nospace;
>            dirlen = 0;
>          }
>      }

OK. Same as last. Return an error if the allocation fails.
Indent is a bit off.

> @@ -525,13 +533,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                 && (flags & GLOB_NOESCAPE) == 0))
>      {
>        /* "/pattern" or "\\/pattern".  */
> -      dirname = (char *) "/";
> +      if (!char_array_set_str (&dirname, "/"))
> +       goto err_nospace;
>        dirlen = 1;
>        ++filename;
>      }

OK.

>    else
>      {
> -      char *newp;
>        dirlen = filename - pattern;
>  #if defined __MSDOS__ || defined WINDOWS32
>        if (*filename == ':'

OK. newp was used for malloc/alloca allocations.

> @@ -545,31 +553,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>            /* For now, disallow wildcards in the drive spec, to
>               prevent infinite recursion in glob.  */
>            if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
> -            return GLOB_NOMATCH;
> +           {
> +             retval = GLOB_NOMATCH;
> +             goto out;
> +           }

We need to deallocate before every return now. This does that.
OK.

>            /* If this is "d:pattern", we need to copy ':' to DIRNAME
>               as well.  If it's "d:/pattern", don't remove the slash
>               from "d:/", since "d:" and "d:/" are not the same.*/
>          }
>  #endif
>  
> -      if (glob_use_alloca (alloca_used, dirlen + 1))
> -        newp = alloca_account (dirlen + 1, alloca_used);
> -      else
> -        {
> -          newp = malloc (dirlen + 1);
> -          if (newp == NULL)
> -            return GLOB_NOSPACE;
> -          malloc_dirname = 1;
> -        }
> -      *((char *) mempcpy (newp, pattern, dirlen)) = '\0';
> -      dirname = newp;
> +      if (!char_array_set_str_size (&dirname, pattern, dirlen))
> +       goto err_nospace;
>        ++filename;

We used to alloca/malloc some space and copy pattern into it. We still do
the same but using a dynarray. So don't need malloc_dirname any more.
OK.

>  #if defined __MSDOS__ || defined WINDOWS32
>        bool drive_root = (dirlen > 1
> -                         && (dirname[dirlen - 1] == ':'
> -                             || (dirlen > 2 && dirname[dirlen - 2] == ':'
> -                                 && dirname[dirlen - 1] == '/')));
> +                         && (char_array_pos (&dirname, dirlen - 1) != ':'
> +                             || (dirlen > 2
> +                             && char_array_pos (&dirname, dirlen - 2) != ':'
> +                                 && char_array_pos (&dirname, dirlen - 1) != '/')));
>  #else
>        bool drive_root = false;
>  #endif

Looks like the comparisons have been reversed. I think they should be `=='
instead of `!='.

> @@ -578,20 +581,24 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>          /* "pattern/".  Expand "pattern", appending slashes.  */
>          {
>            int orig_flags = flags;
> -          if (!(flags & GLOB_NOESCAPE) && dirname[dirlen - 1] == '\\')
> +          if (!(flags & GLOB_NOESCAPE)
> +              && char_array_pos (&dirname, dirlen - 1) == '\\')
>              {

OK.

>                /* "pattern\\/".  Remove the final backslash if it hasn't
>                   been quoted.  */
> -              char *p = (char *) &dirname[dirlen - 1];
> -
> -              while (p > dirname && p[-1] == '\\') --p;
> -              if ((&dirname[dirlen] - p) & 1)
> +              size_t p = dirlen - 1;
> +              while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
> +              if ((dirlen - p) & 1)
>                  {
> -                  *(char *) &dirname[--dirlen] = '\0';
> +                  /* Since we are shrinking the array, there is no need to
> +                     check the function return.  */
> +                  dirlen -= 1;
> +                  char_array_crop (&dirname, dirlen);
>                    flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
>                  }
>              }

p was a pointer to the last character in dirname, and we looped on it going
backwards towards the start of dirname looking for a '\'.

Now, p is an index into dirname and we do the same thing.

Looks equivalent, and more readable.
OK.

> -          int val = __glob (dirname, flags | GLOB_MARK, errfunc, pglob);
> +          int val = __glob (char_array_str (&dirname), flags | GLOB_MARK,
> +                            errfunc, pglob);
>            if (val == 0)
>              pglob->gl_flags = ((pglob->gl_flags & ~GLOB_MARK)
>                                 | (flags & GLOB_MARK));

OK.

> @@ -608,11 +615,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>          }
>      }
>  
> -  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && dirname[0] == '~')
> +  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK))
> +      && char_array_pos (&dirname, 0) == '~')
>      {

OK.

> -      if (dirname[1] == '\0' || dirname[1] == '/'
> -          || (!(flags & GLOB_NOESCAPE) && dirname[1] == '\\'
> -              && (dirname[2] == '\0' || dirname[2] == '/')))
> +      if (char_array_pos (&dirname, 1) == '\0'
> +         || char_array_pos (&dirname, 1) == '/'
> +         || (!(flags & GLOB_NOESCAPE) && char_array_pos (&dirname, 1) == '\\'
> +             && (char_array_pos (&dirname, 2) == '\0'
> +                || char_array_pos (&dirname, 2) == '/')))

OK. They do the same thing.
Indent needs a fix.

>          {
>            /* Look up home directory.  */
>            char *home_dir = getenv ("HOME");
> @@ -663,10 +673,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                    if (err != ERANGE)
>                      break;
>                    if (!scratch_buffer_grow (&s))
> -                    {
> -                      retval = GLOB_NOSPACE;
> -                      goto out;
> -                    }
> +                  goto err_nospace;
>                  }
>                if (err == 0)
>                  {

OK.

> @@ -675,10 +682,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                  }
>                scratch_buffer_free (&s);
>                if (err == 0 && home_dir == NULL)
> -                {
> -                  retval = GLOB_NOSPACE;
> -                  goto out;
> -                }
> +              goto err_nospace;
>  #endif /* WINDOWS32 */
>              }
>            if (home_dir == NULL || home_dir[0] == '\0')

OK.

> @@ -697,53 +701,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                  }
>              }
>            /* Now construct the full directory.  */
> -          if (dirname[1] == '\0')
> +          if (char_array_pos (&dirname, 1) == '\0')
>              {

OK.

> -              if (__glibc_unlikely (malloc_dirname))
> -                free (dirname);
> -

OK. We don't use malloc for dirname any more.

> -              dirname = home_dir;
> -              dirlen = strlen (dirname);
> -              malloc_dirname = malloc_home_dir;
> +              if (!char_array_set_str (&dirname, home_dir))
> +                goto err_nospace;
> +              dirlen = char_array_size (&dirname) - 1;

Equivalent, and we don't use malloc_dirname any more so we throw away that
assignment.
OK.

>              }
>            else
>              {
> -              char *newp;
> -              size_t home_len = strlen (home_dir);
> -              int use_alloca = glob_use_alloca (alloca_used, home_len + dirlen);
> -              if (use_alloca)
> -                newp = alloca_account (home_len + dirlen, alloca_used);
> -              else
> -                {
> -                  newp = malloc (home_len + dirlen);
> -                  if (newp == NULL)
> -                    {
> -                      if (__glibc_unlikely (malloc_home_dir))
> -                        free (home_dir);
> -                      retval = GLOB_NOSPACE;
> -                      goto out;
> -                    }
> -                }

This code was allocating enough memory to concatenate home_dir and dirname.

> -              mempcpy (mempcpy (newp, home_dir, home_len),
> -                       &dirname[1], dirlen);

...Then concatenating it.

> -              if (__glibc_unlikely (malloc_dirname))
> -                free (dirname);
> -
> -              dirname = newp;
> -              dirlen += home_len - 1;
> -              malloc_dirname = !use_alloca;
> -
> -              if (__glibc_unlikely (malloc_home_dir))
> -                free (home_dir);

...And freeing any previously allocated memory.

> +              /* Replaces '~' by the obtained HOME dir.  */
> +              char_array_erase (&dirname, 0);
> +              if (!char_array_prepend_str (&dirname, home_dir))
> +               goto err_nospace;

Now we do it using a dynarray.
OK. Indent needs a fix.

>              }
> -          dirname_modified = 1;
> +          dirname_modified = true;
>          }

OK.

>        else
>          {
>  #ifndef WINDOWS32
> -          char *end_name = strchr (dirname, '/');
> +          char *dirnamestr = char_array_at (&dirname, 0);
> +          char *end_name = strchr (dirnamestr, '/');

Equivalent. OK.

>            char *user_name;
>            int malloc_user_name = 0;
>            char *unescape = NULL;
> @@ -752,23 +729,23 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>              {
>                if (end_name == NULL)
>                  {
> -                  unescape = strchr (dirname, '\\');
> +                unescape = strchr (dirnamestr, '\\');
>                    if (unescape)
>                      end_name = strchr (unescape, '\0');
>                  }

OK. Indent needs a fix, and further down as well.

>                else
> -                unescape = memchr (dirname, '\\', end_name - dirname);
> +              unescape = memchr (dirnamestr, '\\', end_name - dirnamestr);
>              }

OK.

>            if (end_name == NULL)
> -            user_name = dirname + 1;
> +            user_name = dirnamestr + 1;

OK.

>            else
>              {
>                char *newp;
> -              if (glob_use_alloca (alloca_used, end_name - dirname))
> -                newp = alloca_account (end_name - dirname, alloca_used);
> +              if (glob_use_alloca (alloca_used, end_name - dirnamestr))
> +                newp = alloca_account (end_name - dirnamestr, alloca_used);
>                else
>                  {
> -                  newp = malloc (end_name - dirname);
> +                  newp = malloc (end_name - dirnamestr);
>                    if (newp == NULL)
>                      {
>                        retval = GLOB_NOSPACE;

OK. This newp is for user_name which is tackled in a separate patch.

> @@ -778,8 +755,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                  }
>                if (unescape != NULL)
>                  {
> -                  char *p = mempcpy (newp, dirname + 1,
> -                                     unescape - dirname - 1);
> +                  char *p = mempcpy (newp, dirnamestr + 1,
> +                                     unescape - dirnamestr - 1);
>                    char *q = unescape;
>                    while (q != end_name)
>                      {
> @@ -801,8 +778,9 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                    *p = '\0';
>                  }
>                else
> -                *((char *) mempcpy (newp, dirname + 1, end_name - dirname - 1))
> -                  = '\0';
> +                *((char *) mempcpy (newp, dirnamestr + 1,
> +                                    end_name - dirnamestr - 1))
> +                   = '\0';
>                user_name = newp;
>              }

Same. OK. There appears to be a changed line due to a stray whitespace.

> @@ -835,39 +813,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>              /* If we found a home directory use this.  */
>              if (p != NULL)
>                {
> -                size_t home_len = strlen (p->pw_dir);
> -                size_t rest_len = end_name == NULL ? 0 : strlen (end_name);
> -                /* dirname contains end_name; we can't free it now.  */
> -                char *prev_dirname =
> -                  (__glibc_unlikely (malloc_dirname) ? dirname : NULL);
> -                char *d;
> -
> -                malloc_dirname = 0;
> -
> -                if (glob_use_alloca (alloca_used, home_len + rest_len + 1))
> -                  dirname = alloca_account (home_len + rest_len + 1,
> -                                            alloca_used);
> -                else
> +                if (!char_array_set_str (&dirname, p->pw_dir))
>                    {
> -                    dirname = malloc (home_len + rest_len + 1);
> -                    if (dirname == NULL)
> -                      {
> -                        free (prev_dirname);
> -                        scratch_buffer_free (&pwtmpbuf);
> -                        retval = GLOB_NOSPACE;
> -                        goto out;
> -                      }
> -                    malloc_dirname = 1;
> +                    scratch_buffer_free (&pwtmpbuf);
> +                    goto err_nospace;
>                    }
> -                d = mempcpy (dirname, p->pw_dir, home_len);
> -                if (end_name != NULL)
> -                  d = mempcpy (d, end_name, rest_len);
> -                *d = '\0';
> -
> -                free (prev_dirname);
>  
> -                dirlen = home_len + rest_len;
> -                dirname_modified = 1;
> +                dirlen = strlen (p->pw_dir);
> +                dirname_modified = true;
>                }

It appears that a concatenation (p->pw_dir and end_name) got missed here.

>              else
>                {
> @@ -908,37 +861,33 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>          goto nospace;
>        pglob->gl_pathv = new_gl_pathv;
>  
> -      if (flags & GLOB_MARK && is_dir (dirname, flags, pglob))
> +      if (flags & GLOB_MARK
> +          && is_dir (char_array_str (&dirname), flags, pglob))

OK.

>          {
>            char *p;
>            pglob->gl_pathv[newcount] = malloc (dirlen + 2);
>            if (pglob->gl_pathv[newcount] == NULL)
>              goto nospace;
> -          p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
> +          p = mempcpy (pglob->gl_pathv[newcount],
> +                       char_array_str (&dirname), dirlen);

OK.

>            p[0] = '/';
>            p[1] = '\0';
> -          if (__glibc_unlikely (malloc_dirname))
> -            free (dirname);
>          }

OK.

>        else
>          {
> -          if (__glibc_unlikely (malloc_dirname))
> -            pglob->gl_pathv[newcount] = dirname;
> -          else
> -            {
> -              pglob->gl_pathv[newcount] = strdup (dirname);
> -              if (pglob->gl_pathv[newcount] == NULL)
> -                goto nospace;
> -            }
> +          pglob->gl_pathv[newcount] = strdup (char_array_str (&dirname));
> +          if (pglob->gl_pathv[newcount] == NULL)
> +            goto nospace;
>          }

OK.

>        pglob->gl_pathv[++newcount] = NULL;
>        ++pglob->gl_pathc;
>        pglob->gl_flags = flags;
>  
> -      return 0;
> +      goto out;
>      }

OK. 'out' so we can deallocate before returning.

>  
> -  meta = __glob_pattern_type (dirname, !(flags & GLOB_NOESCAPE));
> +  meta = __glob_pattern_type (char_array_str (&dirname),
> +                              !(flags & GLOB_NOESCAPE));

OK.

>    /* meta is 1 if correct glob pattern containing metacharacters.
>       If meta has bit (1 << 2) set, it means there was an unterminated
>       [ which we handle the same, using fnmatch.  Broken unterminated
> @@ -951,15 +900,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>           the pattern in each directory found.  */
>        size_t i;
>  
> -      if (!(flags & GLOB_NOESCAPE) && dirlen > 0 && dirname[dirlen - 1] == '\\')
> +      if (!(flags & GLOB_NOESCAPE) && dirlen > 0
> +          && char_array_pos (&dirname, dirlen - 1) == '\\')

OK.

>          {
>            /* "foo\\/bar".  Remove the final backslash from dirname
>               if it has not been quoted.  */
> -          char *p = (char *) &dirname[dirlen - 1];
> -
> -          while (p > dirname && p[-1] == '\\') --p;
> -          if ((&dirname[dirlen] - p) & 1)
> -            *(char *) &dirname[--dirlen] = '\0';
> +          size_t p = dirlen - 1;
> +          while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
> +          if ((dirlen - p) & 1)
> +            char_array_crop (&dirname, --dirlen);

Equivalent. We use an index instead of a pointer, and step back from the
end. OK.

>          }
>  
>        if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) != 0))
> @@ -973,7 +922,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>            dirs.gl_lstat = pglob->gl_lstat;
>          }
>  
> -      status = __glob (dirname,
> +      status = __glob (char_array_str (&dirname),
>                         ((flags & (GLOB_ERR | GLOB_NOESCAPE | GLOB_ALTDIRFUNC))
>                          | GLOB_NOSORT | GLOB_ONLYDIR),
>                         errfunc, &dirs);

OK.

> @@ -1020,8 +969,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                globfree (&dirs);
>                globfree (pglob);
>                pglob->gl_pathc = 0;
> -              retval = GLOB_NOSPACE;
> -              goto out;
> +              goto err_nospace;

OK.

>              }
>          }
>  
> @@ -1043,8 +991,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                  {
>                  nospace2:
>                    globfree (&dirs);
> -                  retval = GLOB_NOSPACE;
> -                  goto out;
> +                  goto err_nospace;
>                  }
>  
>                new_gl_pathv = realloc (pglob->gl_pathv,
> @@ -1059,8 +1006,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                    globfree (&dirs);
>                    globfree (pglob);
>                    pglob->gl_pathc = 0;
> -                  retval = GLOB_NOSPACE;
> -                  goto out;
> +                  goto err_nospace;
>                  }

Same.

>  
>                ++pglob->gl_pathc;
> @@ -1086,7 +1032,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>  
>        if (meta & GLOBPAT_BACKSLASH)
>          {
> -          char *p = strchr (dirname, '\\'), *q;
> +          char *p = strchr (char_array_str (&dirname), '\\'), *q;

Okay.

>            /* We need to unescape the dirname string.  It is certainly
>               allocated by alloca, as otherwise filename would be NULL
>               or dirname wouldn't contain backslashes.  */
> @@ -1103,12 +1049,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                ++q;
>              }
>            while (*p++ != '\0');
> -          dirname_modified = 1;
> +          dirname_modified = true;
>          }
>        if (dirname_modified)
>          flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
> -      status = glob_in_dir (filename, dirname, flags, errfunc, pglob,
> -                            alloca_used);
> +      status = glob_in_dir (filename, char_array_str (&dirname), flags,
> +                            errfunc, pglob, alloca_used);

OK.

>        if (status != 0)
>          {
>            if (status == GLOB_NOMATCH && flags != orig_flags
> @@ -1126,14 +1072,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>        if (dirlen > 0)
>          {
>            /* Stick the directory on the front of each name.  */
> -          if (prefix_array (dirname,
> +          if (prefix_array (char_array_str (&dirname),
>                              &pglob->gl_pathv[old_pathc + pglob->gl_offs],
>                              pglob->gl_pathc - old_pathc))

OK.

>              {
>                globfree (pglob);
>                pglob->gl_pathc = 0;
> -              retval = GLOB_NOSPACE;
> -              goto out;
> +              goto err_nospace;

OK.

>              }
>          }
>      }
> @@ -1152,8 +1097,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                {
>                  globfree (pglob);
>                  pglob->gl_pathc = 0;
> -                retval = GLOB_NOSPACE;
> -                goto out;
> +                goto err_nospace;

Same.

>                }
>              strcpy (&new[len - 2], "/");
>              pglob->gl_pathv[i] = new;
> @@ -1169,10 +1113,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>      }
>  
>   out:
> -  if (__glibc_unlikely (malloc_dirname))
> -    free (dirname);
>  
> +  char_array_free (&dirname);
>    return retval;
> +
> + err_nospace:
> +  char_array_free (&dirname);
> +  return GLOB_NOSPACE;
>  }

OK. out and err_nospace, both deallocate before returning.

Cheers,
Arjun


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/8] posix: Use char_array for internal glob dirname
  2021-03-23 16:08   ` Arjun Shankar
@ 2021-03-24 17:39     ` Adhemerval Zanella
  0 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2021-03-24 17:39 UTC (permalink / raw)
  To: Arjun Shankar; +Cc: bug-gnulib, Paul Eggert, libc-alpha



On 23/03/2021 13:08, Arjun Shankar wrote:
> Hi Adhemerval,
> 
>> This is the first patch of the set to remove alloca usage on glob
>> implementation.  Internal path to search for file might expand to a
>> non static directory derived from pattern for some difference cases
>> (GLOB_NOESCAPE, GNU GLOB_TILDE) and to allow a non-static dirname
>> path glob uses a lot of boilerplate code to manage the buffer (which
>> is either allocated using alloca or malloc depending both to size
>> requested and the total alloca_used).
>>
>> The patch changes to use the char_array struct with the default size
>> (256 bytes).  It simplifies all the allocation code by using char_array
>> one and every internal buffer access is done using char_array provided
>> functions.  No functional changes are expected.
> 
> I've been going through this patch series. Here are my comments on the
> first one. I have mostly been looking at this from the point of view of
> making sure the logic remains equivalent, and not the way you tackled
> the problem itself.
> 
> In summary, I found a comparison reversed, a missed concatenation, and
> some minor indent issues. Other than that, this patch looks good to me.

Thanks for the review, however I would like to check if this approach is
what he want to move the glob implementation since it might diverge from
what Paul suggested for gnulib [1].

I tried to model the change to use something similar to a C++ string,
while Paul is caving out the required allocation from a scratch_buffer.
My view is since we expanding multiple different objects (the directory
name, the home directory, the username) it would make sense to make
then different dynarray, Paul in the other hand see that we can cave out
this multiple objects from a single scratch_buffer to simplify the
memory management. Maybe modelling like a C++ string is not the right
approach, since we need manual delocation anyway.

I don't have a strong opinion here, I will take a second look on his
patch to check if the outcome is simpler than my initial patchset.
My only goal here is get rid of the multuiple alloca and simplify
the memory management.

[1] https://sourceware.org/pipermail/libc-alpha/2021-January/121605.html

> 
> Details:
> 
>> diff --git a/posix/glob.c b/posix/glob.c
>> index 32c88e5d15..8c6e1914c5 100644
>> --- a/posix/glob.c
>> +++ b/posix/glob.c
>> @@ -85,6 +85,7 @@
>>  #include <flexmember.h>
>>  #include <glob_internal.h>
>>  #include <scratch_buffer.h>
>> +#include <malloc/char_array-skeleton.c>
> 
> Include the new dynamic character array implementation. OK.
> 
> Note:
> The patch that introduces char_array-skeleton.c will need a slight adjustment
> after de0e1b45b0ab (malloc: Sync dynarray with gnulib) due to the removal
> of the anonymous union.

Ack.

> 
>>  static const char *next_brace_sub (const char *begin, int flags) __THROWNL;
>>  
>> @@ -298,16 +299,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>          glob_t *pglob)
>>  {
>>    const char *filename;
>> -  char *dirname = NULL;
>>    size_t dirlen;
>>    int status;
>>    size_t oldcount;
>>    int meta;
>> -  int dirname_modified;
>> -  int malloc_dirname = 0;
>>    glob_t dirs;
>>    int retval = 0;
>>    size_t alloca_used = 0;
>> +  struct char_array dirname;
>> +  bool dirname_modified;
> 
> dirname changes type, dirname_modified should be a boolean, and malloc_dirname
> is now redundant.
> 
> OK.
>  
>>    if (pattern == NULL || pglob == NULL || (flags & ~__GLOB_FLAGS) != 0)
>>      {
>> @@ -315,6 +315,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>        return -1;
>>      }
>>  
>> +  /* Default char array is stack allocated, so there is no need to check
>> +     if setting the initial '\0' succeeds.  */
>> +  char_array_init_empty (&dirname);
>> +
>>    /* POSIX requires all slashes to be matched.  This means that with
>>       a trailing slash we must match only directories.  */
>>    if (pattern[0] && pattern[strlen (pattern) - 1] == '/')
> 
> OK.
> 
>> @@ -335,12 +339,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>            size_t i;
>>  
>>            if (pglob->gl_offs >= ~((size_t) 0) / sizeof (char *))
>> -            return GLOB_NOSPACE;
>> +            goto err_nospace;
>>  
>>            pglob->gl_pathv = (char **) malloc ((pglob->gl_offs + 1)
>>                                                * sizeof (char *));
> 
> err_nospace frees dirname and returns GLOB_NOSPACE.
> So the code is equivalent.
> OK.
> 
>>            if (pglob->gl_pathv == NULL)
>> -            return GLOB_NOSPACE;
>> +            goto err_nospace;
>>  
>>            for (i = 0; i <= pglob->gl_offs; ++i)
>>              pglob->gl_pathv[i] = NULL;
>> @@ -392,7 +396,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>              {
>>                onealt = malloc (pattern_len);
>>                if (onealt == NULL)
>> -                return GLOB_NOSPACE;
>> +                goto err_nospace;
>>              }
>>  
>>            /* We know the prefix for all sub-patterns.  */
> 
> OK. Same.
> 
>> @@ -454,7 +458,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                        globfree (pglob);
>>                        pglob->gl_pathc = 0;
>>                      }
>> -                  return result;
>> +                  retval = result;
>> +                  goto out;
>>                  }
>>  
>>                if (*next == '}')
> 
> out frees dirname and returns retval.
> So the code is equivalent.
> OK.
> 
>> @@ -471,9 +476,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>  
>>            if (pglob->gl_pathc != firstc)
>>              /* We found some entries.  */
>> -            return 0;
>> +            retval = 0;
> 
> We will continue at 'out', which will also return after freeing dirname.
> So the code remains equivalent.
> OK.
> 
>>            else if (!(flags & (GLOB_NOCHECK|GLOB_NOMAGIC)))
>> -            return GLOB_NOMATCH;
>> +            retval = GLOB_NOMATCH;
>> +          goto out;
>>          }
>>      }
>>  
> 
> OK. Same here.
>  
>> @@ -492,14 +498,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>      filename = strchr (pattern, ':');
>>  #endif /* __MSDOS__ || WINDOWS32 */
>>  
>> -  dirname_modified = 0;
>> +  dirname_modified = false;
>>    if (filename == NULL)
>>      {
> 
> OK. It's declared as a boolean now.
> 
>>        /* This can mean two things: a simple name or "~name".  The latter
>>           case is nothing but a notation for a directory.  */
>>        if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
>>          {
>> -          dirname = (char *) pattern;
>> +         if (!char_array_set_str (&dirname, pattern))
>> +           goto err_nospace;
>>            dirlen = strlen (pattern);
>>  
>>            /* Set FILENAME to NULL as a special flag.  This is ugly but
> 
> OK. It's still equivalent. Since char_array_set_str can lead to a failed
> allocation, we add a check and exit with error if that happens.
> Indent is a bit off, though.

Ack.

> 
>> @@ -516,7 +523,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>              }
>>  
>>            filename = pattern;
>> -          dirname = (char *) ".";
>> +         if (!char_array_set_str (&dirname, "."))
>> +           goto err_nospace;
>>            dirlen = 0;
>>          }
>>      }
> 
> OK. Same as last. Return an error if the allocation fails.
> Indent is a bit off.
> 
>> @@ -525,13 +533,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                 && (flags & GLOB_NOESCAPE) == 0))
>>      {
>>        /* "/pattern" or "\\/pattern".  */
>> -      dirname = (char *) "/";
>> +      if (!char_array_set_str (&dirname, "/"))
>> +       goto err_nospace;
>>        dirlen = 1;
>>        ++filename;
>>      }
> 
> OK.
> 
>>    else
>>      {
>> -      char *newp;
>>        dirlen = filename - pattern;
>>  #if defined __MSDOS__ || defined WINDOWS32
>>        if (*filename == ':'
> 
> OK. newp was used for malloc/alloca allocations.
> 
>> @@ -545,31 +553,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>            /* For now, disallow wildcards in the drive spec, to
>>               prevent infinite recursion in glob.  */
>>            if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
>> -            return GLOB_NOMATCH;
>> +           {
>> +             retval = GLOB_NOMATCH;
>> +             goto out;
>> +           }
> 
> We need to deallocate before every return now. This does that.
> OK.
> 
>>            /* If this is "d:pattern", we need to copy ':' to DIRNAME
>>               as well.  If it's "d:/pattern", don't remove the slash
>>               from "d:/", since "d:" and "d:/" are not the same.*/
>>          }
>>  #endif
>>  
>> -      if (glob_use_alloca (alloca_used, dirlen + 1))
>> -        newp = alloca_account (dirlen + 1, alloca_used);
>> -      else
>> -        {
>> -          newp = malloc (dirlen + 1);
>> -          if (newp == NULL)
>> -            return GLOB_NOSPACE;
>> -          malloc_dirname = 1;
>> -        }
>> -      *((char *) mempcpy (newp, pattern, dirlen)) = '\0';
>> -      dirname = newp;
>> +      if (!char_array_set_str_size (&dirname, pattern, dirlen))
>> +       goto err_nospace;
>>        ++filename;
> 
> We used to alloca/malloc some space and copy pattern into it. We still do
> the same but using a dynarray. So don't need malloc_dirname any more.
> OK.
> 
>>  #if defined __MSDOS__ || defined WINDOWS32
>>        bool drive_root = (dirlen > 1
>> -                         && (dirname[dirlen - 1] == ':'
>> -                             || (dirlen > 2 && dirname[dirlen - 2] == ':'
>> -                                 && dirname[dirlen - 1] == '/')));
>> +                         && (char_array_pos (&dirname, dirlen - 1) != ':'
>> +                             || (dirlen > 2
>> +                             && char_array_pos (&dirname, dirlen - 2) != ':'
>> +                                 && char_array_pos (&dirname, dirlen - 1) != '/')));
>>  #else
>>        bool drive_root = false;
>>  #endif
> 
> Looks like the comparisons have been reversed. I think they should be `=='
> instead of `!='.
> 
>> @@ -578,20 +581,24 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>          /* "pattern/".  Expand "pattern", appending slashes.  */
>>          {
>>            int orig_flags = flags;
>> -          if (!(flags & GLOB_NOESCAPE) && dirname[dirlen - 1] == '\\')
>> +          if (!(flags & GLOB_NOESCAPE)
>> +              && char_array_pos (&dirname, dirlen - 1) == '\\')
>>              {
> 
> OK.
> 
>>                /* "pattern\\/".  Remove the final backslash if it hasn't
>>                   been quoted.  */
>> -              char *p = (char *) &dirname[dirlen - 1];
>> -
>> -              while (p > dirname && p[-1] == '\\') --p;
>> -              if ((&dirname[dirlen] - p) & 1)
>> +              size_t p = dirlen - 1;
>> +              while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
>> +              if ((dirlen - p) & 1)
>>                  {
>> -                  *(char *) &dirname[--dirlen] = '\0';
>> +                  /* Since we are shrinking the array, there is no need to
>> +                     check the function return.  */
>> +                  dirlen -= 1;
>> +                  char_array_crop (&dirname, dirlen);
>>                    flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
>>                  }
>>              }
> 
> p was a pointer to the last character in dirname, and we looped on it going
> backwards towards the start of dirname looking for a '\'.
> 
> Now, p is an index into dirname and we do the same thing.
> 
> Looks equivalent, and more readable.
> OK.
> 
>> -          int val = __glob (dirname, flags | GLOB_MARK, errfunc, pglob);
>> +          int val = __glob (char_array_str (&dirname), flags | GLOB_MARK,
>> +                            errfunc, pglob);
>>            if (val == 0)
>>              pglob->gl_flags = ((pglob->gl_flags & ~GLOB_MARK)
>>                                 | (flags & GLOB_MARK));
> 
> OK.
> 
>> @@ -608,11 +615,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>          }
>>      }
>>  
>> -  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && dirname[0] == '~')
>> +  if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK))
>> +      && char_array_pos (&dirname, 0) == '~')
>>      {
> 
> OK.
> 
>> -      if (dirname[1] == '\0' || dirname[1] == '/'
>> -          || (!(flags & GLOB_NOESCAPE) && dirname[1] == '\\'
>> -              && (dirname[2] == '\0' || dirname[2] == '/')))
>> +      if (char_array_pos (&dirname, 1) == '\0'
>> +         || char_array_pos (&dirname, 1) == '/'
>> +         || (!(flags & GLOB_NOESCAPE) && char_array_pos (&dirname, 1) == '\\'
>> +             && (char_array_pos (&dirname, 2) == '\0'
>> +                || char_array_pos (&dirname, 2) == '/')))
> 
> OK. They do the same thing.
> Indent needs a fix.
> 
>>          {
>>            /* Look up home directory.  */
>>            char *home_dir = getenv ("HOME");
>> @@ -663,10 +673,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                    if (err != ERANGE)
>>                      break;
>>                    if (!scratch_buffer_grow (&s))
>> -                    {
>> -                      retval = GLOB_NOSPACE;
>> -                      goto out;
>> -                    }
>> +                  goto err_nospace;
>>                  }
>>                if (err == 0)
>>                  {
> 
> OK.
> 
>> @@ -675,10 +682,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                  }
>>                scratch_buffer_free (&s);
>>                if (err == 0 && home_dir == NULL)
>> -                {
>> -                  retval = GLOB_NOSPACE;
>> -                  goto out;
>> -                }
>> +              goto err_nospace;
>>  #endif /* WINDOWS32 */
>>              }
>>            if (home_dir == NULL || home_dir[0] == '\0')
> 
> OK.
> 
>> @@ -697,53 +701,26 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                  }
>>              }
>>            /* Now construct the full directory.  */
>> -          if (dirname[1] == '\0')
>> +          if (char_array_pos (&dirname, 1) == '\0')
>>              {
> 
> OK.
> 
>> -              if (__glibc_unlikely (malloc_dirname))
>> -                free (dirname);
>> -
> 
> OK. We don't use malloc for dirname any more.
> 
>> -              dirname = home_dir;
>> -              dirlen = strlen (dirname);
>> -              malloc_dirname = malloc_home_dir;
>> +              if (!char_array_set_str (&dirname, home_dir))
>> +                goto err_nospace;
>> +              dirlen = char_array_size (&dirname) - 1;
> 
> Equivalent, and we don't use malloc_dirname any more so we throw away that
> assignment.
> OK.
> 
>>              }
>>            else
>>              {
>> -              char *newp;
>> -              size_t home_len = strlen (home_dir);
>> -              int use_alloca = glob_use_alloca (alloca_used, home_len + dirlen);
>> -              if (use_alloca)
>> -                newp = alloca_account (home_len + dirlen, alloca_used);
>> -              else
>> -                {
>> -                  newp = malloc (home_len + dirlen);
>> -                  if (newp == NULL)
>> -                    {
>> -                      if (__glibc_unlikely (malloc_home_dir))
>> -                        free (home_dir);
>> -                      retval = GLOB_NOSPACE;
>> -                      goto out;
>> -                    }
>> -                }
> 
> This code was allocating enough memory to concatenate home_dir and dirname.
> 
>> -              mempcpy (mempcpy (newp, home_dir, home_len),
>> -                       &dirname[1], dirlen);
> 
> ...Then concatenating it.
> 
>> -              if (__glibc_unlikely (malloc_dirname))
>> -                free (dirname);
>> -
>> -              dirname = newp;
>> -              dirlen += home_len - 1;
>> -              malloc_dirname = !use_alloca;
>> -
>> -              if (__glibc_unlikely (malloc_home_dir))
>> -                free (home_dir);
> 
> ...And freeing any previously allocated memory.
> 
>> +              /* Replaces '~' by the obtained HOME dir.  */
>> +              char_array_erase (&dirname, 0);
>> +              if (!char_array_prepend_str (&dirname, home_dir))
>> +               goto err_nospace;
> 
> Now we do it using a dynarray.
> OK. Indent needs a fix.
> 
>>              }
>> -          dirname_modified = 1;
>> +          dirname_modified = true;
>>          }
> 
> OK.
> 
>>        else
>>          {
>>  #ifndef WINDOWS32
>> -          char *end_name = strchr (dirname, '/');
>> +          char *dirnamestr = char_array_at (&dirname, 0);
>> +          char *end_name = strchr (dirnamestr, '/');
> 
> Equivalent. OK.
> 
>>            char *user_name;
>>            int malloc_user_name = 0;
>>            char *unescape = NULL;
>> @@ -752,23 +729,23 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>              {
>>                if (end_name == NULL)
>>                  {
>> -                  unescape = strchr (dirname, '\\');
>> +                unescape = strchr (dirnamestr, '\\');
>>                    if (unescape)
>>                      end_name = strchr (unescape, '\0');
>>                  }
> 
> OK. Indent needs a fix, and further down as well.
> 
>>                else
>> -                unescape = memchr (dirname, '\\', end_name - dirname);
>> +              unescape = memchr (dirnamestr, '\\', end_name - dirnamestr);
>>              }
> 
> OK.
> 
>>            if (end_name == NULL)
>> -            user_name = dirname + 1;
>> +            user_name = dirnamestr + 1;
> 
> OK.
> 
>>            else
>>              {
>>                char *newp;
>> -              if (glob_use_alloca (alloca_used, end_name - dirname))
>> -                newp = alloca_account (end_name - dirname, alloca_used);
>> +              if (glob_use_alloca (alloca_used, end_name - dirnamestr))
>> +                newp = alloca_account (end_name - dirnamestr, alloca_used);
>>                else
>>                  {
>> -                  newp = malloc (end_name - dirname);
>> +                  newp = malloc (end_name - dirnamestr);
>>                    if (newp == NULL)
>>                      {
>>                        retval = GLOB_NOSPACE;
> 
> OK. This newp is for user_name which is tackled in a separate patch.
> 
>> @@ -778,8 +755,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                  }
>>                if (unescape != NULL)
>>                  {
>> -                  char *p = mempcpy (newp, dirname + 1,
>> -                                     unescape - dirname - 1);
>> +                  char *p = mempcpy (newp, dirnamestr + 1,
>> +                                     unescape - dirnamestr - 1);
>>                    char *q = unescape;
>>                    while (q != end_name)
>>                      {
>> @@ -801,8 +778,9 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                    *p = '\0';
>>                  }
>>                else
>> -                *((char *) mempcpy (newp, dirname + 1, end_name - dirname - 1))
>> -                  = '\0';
>> +                *((char *) mempcpy (newp, dirnamestr + 1,
>> +                                    end_name - dirnamestr - 1))
>> +                   = '\0';
>>                user_name = newp;
>>              }
> 
> Same. OK. There appears to be a changed line due to a stray whitespace.
> 
>> @@ -835,39 +813,14 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>              /* If we found a home directory use this.  */
>>              if (p != NULL)
>>                {
>> -                size_t home_len = strlen (p->pw_dir);
>> -                size_t rest_len = end_name == NULL ? 0 : strlen (end_name);
>> -                /* dirname contains end_name; we can't free it now.  */
>> -                char *prev_dirname =
>> -                  (__glibc_unlikely (malloc_dirname) ? dirname : NULL);
>> -                char *d;
>> -
>> -                malloc_dirname = 0;
>> -
>> -                if (glob_use_alloca (alloca_used, home_len + rest_len + 1))
>> -                  dirname = alloca_account (home_len + rest_len + 1,
>> -                                            alloca_used);
>> -                else
>> +                if (!char_array_set_str (&dirname, p->pw_dir))
>>                    {
>> -                    dirname = malloc (home_len + rest_len + 1);
>> -                    if (dirname == NULL)
>> -                      {
>> -                        free (prev_dirname);
>> -                        scratch_buffer_free (&pwtmpbuf);
>> -                        retval = GLOB_NOSPACE;
>> -                        goto out;
>> -                      }
>> -                    malloc_dirname = 1;
>> +                    scratch_buffer_free (&pwtmpbuf);
>> +                    goto err_nospace;
>>                    }
>> -                d = mempcpy (dirname, p->pw_dir, home_len);
>> -                if (end_name != NULL)
>> -                  d = mempcpy (d, end_name, rest_len);
>> -                *d = '\0';
>> -
>> -                free (prev_dirname);
>>  
>> -                dirlen = home_len + rest_len;
>> -                dirname_modified = 1;
>> +                dirlen = strlen (p->pw_dir);
>> +                dirname_modified = true;
>>                }
> 
> It appears that a concatenation (p->pw_dir and end_name) got missed here.
> 
>>              else
>>                {
>> @@ -908,37 +861,33 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>          goto nospace;
>>        pglob->gl_pathv = new_gl_pathv;
>>  
>> -      if (flags & GLOB_MARK && is_dir (dirname, flags, pglob))
>> +      if (flags & GLOB_MARK
>> +          && is_dir (char_array_str (&dirname), flags, pglob))
> 
> OK.
> 
>>          {
>>            char *p;
>>            pglob->gl_pathv[newcount] = malloc (dirlen + 2);
>>            if (pglob->gl_pathv[newcount] == NULL)
>>              goto nospace;
>> -          p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
>> +          p = mempcpy (pglob->gl_pathv[newcount],
>> +                       char_array_str (&dirname), dirlen);
> 
> OK.
> 
>>            p[0] = '/';
>>            p[1] = '\0';
>> -          if (__glibc_unlikely (malloc_dirname))
>> -            free (dirname);
>>          }
> 
> OK.
> 
>>        else
>>          {
>> -          if (__glibc_unlikely (malloc_dirname))
>> -            pglob->gl_pathv[newcount] = dirname;
>> -          else
>> -            {
>> -              pglob->gl_pathv[newcount] = strdup (dirname);
>> -              if (pglob->gl_pathv[newcount] == NULL)
>> -                goto nospace;
>> -            }
>> +          pglob->gl_pathv[newcount] = strdup (char_array_str (&dirname));
>> +          if (pglob->gl_pathv[newcount] == NULL)
>> +            goto nospace;
>>          }
> 
> OK.
> 
>>        pglob->gl_pathv[++newcount] = NULL;
>>        ++pglob->gl_pathc;
>>        pglob->gl_flags = flags;
>>  
>> -      return 0;
>> +      goto out;
>>      }
> 
> OK. 'out' so we can deallocate before returning.
> 
>>  
>> -  meta = __glob_pattern_type (dirname, !(flags & GLOB_NOESCAPE));
>> +  meta = __glob_pattern_type (char_array_str (&dirname),
>> +                              !(flags & GLOB_NOESCAPE));
> 
> OK.
> 
>>    /* meta is 1 if correct glob pattern containing metacharacters.
>>       If meta has bit (1 << 2) set, it means there was an unterminated
>>       [ which we handle the same, using fnmatch.  Broken unterminated
>> @@ -951,15 +900,15 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>           the pattern in each directory found.  */
>>        size_t i;
>>  
>> -      if (!(flags & GLOB_NOESCAPE) && dirlen > 0 && dirname[dirlen - 1] == '\\')
>> +      if (!(flags & GLOB_NOESCAPE) && dirlen > 0
>> +          && char_array_pos (&dirname, dirlen - 1) == '\\')
> 
> OK.
> 
>>          {
>>            /* "foo\\/bar".  Remove the final backslash from dirname
>>               if it has not been quoted.  */
>> -          char *p = (char *) &dirname[dirlen - 1];
>> -
>> -          while (p > dirname && p[-1] == '\\') --p;
>> -          if ((&dirname[dirlen] - p) & 1)
>> -            *(char *) &dirname[--dirlen] = '\0';
>> +          size_t p = dirlen - 1;
>> +          while (p > 0 && char_array_pos (&dirname, p - 1) == '\\') --p;
>> +          if ((dirlen - p) & 1)
>> +            char_array_crop (&dirname, --dirlen);
> 
> Equivalent. We use an index instead of a pointer, and step back from the
> end. OK.
> 
>>          }
>>  
>>        if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) != 0))
>> @@ -973,7 +922,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>            dirs.gl_lstat = pglob->gl_lstat;
>>          }
>>  
>> -      status = __glob (dirname,
>> +      status = __glob (char_array_str (&dirname),
>>                         ((flags & (GLOB_ERR | GLOB_NOESCAPE | GLOB_ALTDIRFUNC))
>>                          | GLOB_NOSORT | GLOB_ONLYDIR),
>>                         errfunc, &dirs);
> 
> OK.
> 
>> @@ -1020,8 +969,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                globfree (&dirs);
>>                globfree (pglob);
>>                pglob->gl_pathc = 0;
>> -              retval = GLOB_NOSPACE;
>> -              goto out;
>> +              goto err_nospace;
> 
> OK.
> 
>>              }
>>          }
>>  
>> @@ -1043,8 +991,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                  {
>>                  nospace2:
>>                    globfree (&dirs);
>> -                  retval = GLOB_NOSPACE;
>> -                  goto out;
>> +                  goto err_nospace;
>>                  }
>>  
>>                new_gl_pathv = realloc (pglob->gl_pathv,
>> @@ -1059,8 +1006,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                    globfree (&dirs);
>>                    globfree (pglob);
>>                    pglob->gl_pathc = 0;
>> -                  retval = GLOB_NOSPACE;
>> -                  goto out;
>> +                  goto err_nospace;
>>                  }
> 
> Same.
> 
>>  
>>                ++pglob->gl_pathc;
>> @@ -1086,7 +1032,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>  
>>        if (meta & GLOBPAT_BACKSLASH)
>>          {
>> -          char *p = strchr (dirname, '\\'), *q;
>> +          char *p = strchr (char_array_str (&dirname), '\\'), *q;
> 
> Okay.
> 
>>            /* We need to unescape the dirname string.  It is certainly
>>               allocated by alloca, as otherwise filename would be NULL
>>               or dirname wouldn't contain backslashes.  */
>> @@ -1103,12 +1049,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                ++q;
>>              }
>>            while (*p++ != '\0');
>> -          dirname_modified = 1;
>> +          dirname_modified = true;
>>          }
>>        if (dirname_modified)
>>          flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
>> -      status = glob_in_dir (filename, dirname, flags, errfunc, pglob,
>> -                            alloca_used);
>> +      status = glob_in_dir (filename, char_array_str (&dirname), flags,
>> +                            errfunc, pglob, alloca_used);
> 
> OK.
> 
>>        if (status != 0)
>>          {
>>            if (status == GLOB_NOMATCH && flags != orig_flags
>> @@ -1126,14 +1072,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>        if (dirlen > 0)
>>          {
>>            /* Stick the directory on the front of each name.  */
>> -          if (prefix_array (dirname,
>> +          if (prefix_array (char_array_str (&dirname),
>>                              &pglob->gl_pathv[old_pathc + pglob->gl_offs],
>>                              pglob->gl_pathc - old_pathc))
> 
> OK.
> 
>>              {
>>                globfree (pglob);
>>                pglob->gl_pathc = 0;
>> -              retval = GLOB_NOSPACE;
>> -              goto out;
>> +              goto err_nospace;
> 
> OK.
> 
>>              }
>>          }
>>      }
>> @@ -1152,8 +1097,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>                {
>>                  globfree (pglob);
>>                  pglob->gl_pathc = 0;
>> -                retval = GLOB_NOSPACE;
>> -                goto out;
>> +                goto err_nospace;
> 
> Same.
> 
>>                }
>>              strcpy (&new[len - 2], "/");
>>              pglob->gl_pathv[i] = new;
>> @@ -1169,10 +1113,13 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>>      }
>>  
>>   out:
>> -  if (__glibc_unlikely (malloc_dirname))
>> -    free (dirname);
>>  
>> +  char_array_free (&dirname);
>>    return retval;
>> +
>> + err_nospace:
>> +  char_array_free (&dirname);
>> +  return GLOB_NOSPACE;
>>  }
> 
> OK. out and err_nospace, both deallocate before returning.
> 
> Cheers,
> Arjun
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-03-24 17:43 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-05 18:58 [PATCH 0/8] Remove alloca usage from glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 1/8] malloc: Add specialized dynarray for C strings Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 2/8] posix: Use char_array for internal glob dirname Adhemerval Zanella
2021-03-23 16:08   ` Arjun Shankar
2021-03-24 17:39     ` Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 3/8] posix: Remove alloca usage for GLOB_BRACE on glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 4/8] posix: Remove alloca usage on glob dirname Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 5/8] posix: Use dynarray for globname in glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 6/8] posix: Remove alloca usage on glob user_name Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 7/8] posix: Use char_array for home_dir in glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 8/8] posix: Remove all alloca usage " Adhemerval Zanella
2021-01-13 19:36 ` [PATCH 0/8] Remove alloca usage from glob Paul Eggert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).