unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2 0/9] Fix getdents{64} regression on some FS
@ 2020-10-02 17:06 Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 1/9] linux: Move posix dir implementations to Linux Adhemerval Zanella via Libc-alpha
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

This is updated version [1], rebased against master add with some
suggestion from previous review.  I am re-sending because BZ #23960
report keep being updated with users that see the BZ #23960 issue in
different scenarios.

I have reports from both Debian and Gentoo developers that this
patchset fixes the issues they saw with recent glibc on some
scenarios (schroot plus qemu) and on some filesystem (ext4).  They
can correct me with their testing status.

[1] https://sourceware.org/pipermail/libc-alpha/2020-April/112866.html

Adhemerval Zanella (9):
  linux: Move posix dir implementations to Linux
  linux: Simplify opendir buffer allocation
  linux: Add __readdir_unlocked
  linux: Use getdents64 on non-LFS readdir
  linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050]
  linux: Add __readdir64_unlocked
  linux: Add __old_readdir64_unlocked
  linux: Use getdents64 on readdir64 compat implementation
  dirent: Deprecate getdirentries

 NEWS                                          |   3 +
 dirent/Makefile                               |   2 +-
 dirent/dirent.h                               |  11 +-
 dirent/tst-seekdir2.c                         | 156 +++++++++++++++++
 include/dirent.h                              |   5 +-
 support/temp_file.c                           |  12 +-
 support/temp_file.h                           |   7 +
 sysdeps/posix/readdir.c                       | 127 --------------
 sysdeps/posix/readdir_r.c                     | 159 ------------------
 sysdeps/posix/telldir.c                       |  33 ----
 sysdeps/unix/sysv/linux/Makefile              |   3 +
 sysdeps/{posix => unix/sysv/linux}/closedir.c |   4 +
 sysdeps/{posix => unix/sysv/linux}/dirfd.c    |   0
 .../{posix => unix/sysv/linux}/dirstream.h    |   7 +-
 .../{posix => unix/sysv/linux}/fdopendir.c    |   0
 sysdeps/unix/sysv/linux/getdents64.c          |  95 +----------
 sysdeps/unix/sysv/linux/olddirent.h           |   4 +-
 sysdeps/{posix => unix/sysv/linux}/opendir.c  |  56 +++---
 sysdeps/unix/sysv/linux/readdir.c             |  69 +++++++-
 sysdeps/unix/sysv/linux/readdir.h             | 159 ++++++++++++++++++
 sysdeps/unix/sysv/linux/readdir64.c           | 149 ++++++++++++++--
 sysdeps/unix/sysv/linux/readdir64_r.c         | 103 ++++++++++--
 sysdeps/unix/sysv/linux/readdir_r.c           |  50 +++++-
 .../{posix => unix/sysv/linux}/rewinddir.c    |   5 +
 sysdeps/{posix => unix/sysv/linux}/seekdir.c  |  36 +++-
 sysdeps/unix/sysv/linux/telldir.c             |  76 +++++++++
 sysdeps/unix/sysv/linux/telldir.h             |  64 +++++++
 27 files changed, 907 insertions(+), 488 deletions(-)
 create mode 100644 dirent/tst-seekdir2.c
 delete mode 100644 sysdeps/posix/readdir.c
 delete mode 100644 sysdeps/posix/readdir_r.c
 delete mode 100644 sysdeps/posix/telldir.c
 rename sysdeps/{posix => unix/sysv/linux}/closedir.c (95%)
 rename sysdeps/{posix => unix/sysv/linux}/dirfd.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/dirstream.h (93%)
 rename sysdeps/{posix => unix/sysv/linux}/fdopendir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/opendir.c (74%)
 create mode 100644 sysdeps/unix/sysv/linux/readdir.h
 rename sysdeps/{posix => unix/sysv/linux}/rewinddir.c (95%)
 rename sysdeps/{posix => unix/sysv/linux}/seekdir.c (65%)
 create mode 100644 sysdeps/unix/sysv/linux/telldir.c
 create mode 100644 sysdeps/unix/sysv/linux/telldir.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/9] linux: Move posix dir implementations to Linux
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-13 15:33   ` Florian Weimer via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 2/9] linux: Simplify opendir buffer allocation Adhemerval Zanella via Libc-alpha
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

This generic implementation already expects a getdents API which
is Linux specific.  It also allows simplify it by assuming
_DIRENT_HAVE_D_RECLEN and _DIRENT_HAVE_D_OFF support.

The readdir are also expanded on each required implementation,
futher fixes and improvements will make parametrize the
implementation more complex.

Checked on x86_64-linux-gnu, i686-linux-gnu, and with a build
for all affected ABIs.
---
 sysdeps/posix/readdir.c                       | 127 ------------
 sysdeps/posix/readdir_r.c                     | 159 --------------
 sysdeps/{posix => unix/sysv/linux}/closedir.c |   0
 sysdeps/{posix => unix/sysv/linux}/dirfd.c    |   0
 .../{posix => unix/sysv/linux}/dirstream.h    |   0
 .../{posix => unix/sysv/linux}/fdopendir.c    |   0
 sysdeps/{posix => unix/sysv/linux}/opendir.c  |   0
 sysdeps/unix/sysv/linux/readdir.c             |  65 +++++-
 sysdeps/unix/sysv/linux/readdir64.c           | 131 +++++++++++-
 sysdeps/unix/sysv/linux/readdir64_r.c         | 194 +++++++++++++++++-
 sysdeps/unix/sysv/linux/readdir_r.c           |  95 ++++++++-
 .../{posix => unix/sysv/linux}/rewinddir.c    |   0
 sysdeps/{posix => unix/sysv/linux}/seekdir.c  |   0
 sysdeps/{posix => unix/sysv/linux}/telldir.c  |   0
 14 files changed, 462 insertions(+), 309 deletions(-)
 delete mode 100644 sysdeps/posix/readdir.c
 delete mode 100644 sysdeps/posix/readdir_r.c
 rename sysdeps/{posix => unix/sysv/linux}/closedir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/dirfd.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/dirstream.h (100%)
 rename sysdeps/{posix => unix/sysv/linux}/fdopendir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/opendir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/rewinddir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/seekdir.c (100%)
 rename sysdeps/{posix => unix/sysv/linux}/telldir.c (100%)

diff --git a/sysdeps/posix/readdir.c b/sysdeps/posix/readdir.c
deleted file mode 100644
index b36278b5f4..0000000000
--- a/sysdeps/posix/readdir.c
+++ /dev/null
@@ -1,127 +0,0 @@
-/* Copyright (C) 1991-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <errno.h>
-#include <limits.h>
-#include <stddef.h>
-#include <string.h>
-#include <dirent.h>
-#include <unistd.h>
-#include <sys/types.h>
-#include <assert.h>
-
-#include <dirstream.h>
-
-#ifndef __READDIR
-# define __READDIR __readdir
-# define __GETDENTS __getdents
-# define DIRENT_TYPE struct dirent
-# define __READDIR_ALIAS
-#endif
-
-/* Read a directory entry from DIRP.  */
-DIRENT_TYPE *
-__READDIR (DIR *dirp)
-{
-  DIRENT_TYPE *dp;
-  int saved_errno = errno;
-
-#if IS_IN (libc)
-  __libc_lock_lock (dirp->lock);
-#endif
-
-  do
-    {
-      size_t reclen;
-
-      if (dirp->offset >= dirp->size)
-	{
-	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread;
-	  ssize_t bytes;
-
-#ifndef _DIRENT_HAVE_D_RECLEN
-	  /* Fixed-size struct; must read one at a time (see below).  */
-	  maxread = sizeof *dp;
-#else
-	  maxread = dirp->allocation;
-#endif
-
-	  bytes = __GETDENTS (dirp->fd, dirp->data, maxread);
-	  if (bytes <= 0)
-	    {
-	      /* On some systems getdents fails with ENOENT when the
-		 open directory has been rmdir'd already.  POSIX.1
-		 requires that we treat this condition like normal EOF.  */
-	      if (bytes < 0 && errno == ENOENT)
-		bytes = 0;
-
-	      /* Don't modifiy errno when reaching EOF.  */
-	      if (bytes == 0)
-		__set_errno (saved_errno);
-	      dp = NULL;
-	      break;
-	    }
-	  dirp->size = (size_t) bytes;
-
-	  /* Reset the offset into the buffer.  */
-	  dirp->offset = 0;
-	}
-
-      dp = (DIRENT_TYPE *) &dirp->data[dirp->offset];
-
-#ifdef _DIRENT_HAVE_D_RECLEN
-      reclen = dp->d_reclen;
-#else
-      /* The only version of `struct dirent*' that lacks `d_reclen'
-	 is fixed-size.  */
-      assert (sizeof dp->d_name > 1);
-      reclen = sizeof *dp;
-      /* The name is not terminated if it is the largest possible size.
-	 Clobber the following byte to ensure proper null termination.  We
-	 read jst one entry at a time above so we know that byte will not
-	 be used later.  */
-      dp->d_name[sizeof dp->d_name] = '\0';
-#endif
-
-      dirp->offset += reclen;
-
-#ifdef _DIRENT_HAVE_D_OFF
-      dirp->filepos = dp->d_off;
-#else
-      dirp->filepos += reclen;
-#endif
-
-      /* Skip deleted files.  */
-    } while (dp->d_ino == 0);
-
-#if IS_IN (libc)
-  __libc_lock_unlock (dirp->lock);
-#endif
-
-  return dp;
-}
-
-#ifdef __READDIR_ALIAS
-weak_alias (__readdir, readdir)
-#endif
-
-#undef __READDIR
-#undef __GETDENTS
-#undef DIRENT_TYPE
-#undef __READDIR_ALIAS
diff --git a/sysdeps/posix/readdir_r.c b/sysdeps/posix/readdir_r.c
deleted file mode 100644
index 9079abc2ff..0000000000
--- a/sysdeps/posix/readdir_r.c
+++ /dev/null
@@ -1,159 +0,0 @@
-/* Copyright (C) 1991-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <errno.h>
-#include <limits.h>
-#include <stddef.h>
-#include <string.h>
-#include <dirent.h>
-#include <unistd.h>
-#include <sys/types.h>
-#include <assert.h>
-
-#include <dirstream.h>
-
-#ifndef __READDIR_R
-# define __READDIR_R __readdir_r
-# define __GETDENTS __getdents
-# define DIRENT_TYPE struct dirent
-# define __READDIR_R_ALIAS
-#endif
-
-/* Read a directory entry from DIRP.  */
-int
-__READDIR_R (DIR *dirp, DIRENT_TYPE *entry, DIRENT_TYPE **result)
-{
-  DIRENT_TYPE *dp;
-  size_t reclen;
-  const int saved_errno = errno;
-  int ret;
-
-  __libc_lock_lock (dirp->lock);
-
-  do
-    {
-      if (dirp->offset >= dirp->size)
-	{
-	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread;
-	  ssize_t bytes;
-
-#ifndef _DIRENT_HAVE_D_RECLEN
-	  /* Fixed-size struct; must read one at a time (see below).  */
-	  maxread = sizeof *dp;
-#else
-	  maxread = dirp->allocation;
-#endif
-
-	  bytes = __GETDENTS (dirp->fd, dirp->data, maxread);
-	  if (bytes <= 0)
-	    {
-	      /* On some systems getdents fails with ENOENT when the
-		 open directory has been rmdir'd already.  POSIX.1
-		 requires that we treat this condition like normal EOF.  */
-	      if (bytes < 0 && errno == ENOENT)
-		{
-		  bytes = 0;
-		  __set_errno (saved_errno);
-		}
-	      if (bytes < 0)
-		dirp->errcode = errno;
-
-	      dp = NULL;
-	      break;
-	    }
-	  dirp->size = (size_t) bytes;
-
-	  /* Reset the offset into the buffer.  */
-	  dirp->offset = 0;
-	}
-
-      dp = (DIRENT_TYPE *) &dirp->data[dirp->offset];
-
-#ifdef _DIRENT_HAVE_D_RECLEN
-      reclen = dp->d_reclen;
-#else
-      /* The only version of `struct dirent*' that lacks `d_reclen'
-	 is fixed-size.  */
-      assert (sizeof dp->d_name > 1);
-      reclen = sizeof *dp;
-      /* The name is not terminated if it is the largest possible size.
-	 Clobber the following byte to ensure proper null termination.  We
-	 read just one entry at a time above so we know that byte will not
-	 be used later.  */
-      dp->d_name[sizeof dp->d_name] = '\0';
-#endif
-
-      dirp->offset += reclen;
-
-#ifdef _DIRENT_HAVE_D_OFF
-      dirp->filepos = dp->d_off;
-#else
-      dirp->filepos += reclen;
-#endif
-
-#ifdef NAME_MAX
-      if (reclen > offsetof (DIRENT_TYPE, d_name) + NAME_MAX + 1)
-	{
-	  /* The record is very long.  It could still fit into the
-	     caller-supplied buffer if we can skip padding at the
-	     end.  */
-	  size_t namelen = _D_EXACT_NAMLEN (dp);
-	  if (namelen <= NAME_MAX)
-	    reclen = offsetof (DIRENT_TYPE, d_name) + namelen + 1;
-	  else
-	    {
-	      /* The name is too long.  Ignore this file.  */
-	      dirp->errcode = ENAMETOOLONG;
-	      dp->d_ino = 0;
-	      continue;
-	    }
-	}
-#endif
-
-      /* Skip deleted and ignored files.  */
-    }
-  while (dp->d_ino == 0);
-
-  if (dp != NULL)
-    {
-      *result = memcpy (entry, dp, reclen);
-#ifdef _DIRENT_HAVE_D_RECLEN
-      entry->d_reclen = reclen;
-#endif
-      ret = 0;
-    }
-  else
-    {
-      *result = NULL;
-      ret = dirp->errcode;
-    }
-
-  __libc_lock_unlock (dirp->lock);
-
-  return ret;
-}
-
-#ifdef __READDIR_R_ALIAS
-weak_alias (__readdir_r, readdir_r)
-#endif
-
-#undef __READDIR_R
-#undef __GETDENTS
-#undef DIRENT_TYPE
-#undef __READDIR_R_ALIAS
diff --git a/sysdeps/posix/closedir.c b/sysdeps/unix/sysv/linux/closedir.c
similarity index 100%
rename from sysdeps/posix/closedir.c
rename to sysdeps/unix/sysv/linux/closedir.c
diff --git a/sysdeps/posix/dirfd.c b/sysdeps/unix/sysv/linux/dirfd.c
similarity index 100%
rename from sysdeps/posix/dirfd.c
rename to sysdeps/unix/sysv/linux/dirfd.c
diff --git a/sysdeps/posix/dirstream.h b/sysdeps/unix/sysv/linux/dirstream.h
similarity index 100%
rename from sysdeps/posix/dirstream.h
rename to sysdeps/unix/sysv/linux/dirstream.h
diff --git a/sysdeps/posix/fdopendir.c b/sysdeps/unix/sysv/linux/fdopendir.c
similarity index 100%
rename from sysdeps/posix/fdopendir.c
rename to sysdeps/unix/sysv/linux/fdopendir.c
diff --git a/sysdeps/posix/opendir.c b/sysdeps/unix/sysv/linux/opendir.c
similarity index 100%
rename from sysdeps/posix/opendir.c
rename to sysdeps/unix/sysv/linux/opendir.c
diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
index df7a92aa78..2e03e66e69 100644
--- a/sysdeps/unix/sysv/linux/readdir.c
+++ b/sysdeps/unix/sysv/linux/readdir.c
@@ -19,5 +19,68 @@
 #include <dirent.h>
 
 #if !_DIRENT_MATCHES_DIRENT64
-# include <sysdeps/posix/readdir.c>
+#include <dirstream.h>
+
+/* Read a directory entry from DIRP.  */
+struct dirent *
+__readdir (DIR *dirp)
+{
+  struct dirent *dp;
+  int saved_errno = errno;
+
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
+
+  do
+    {
+      size_t reclen;
+
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  bytes = __getdents (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		bytes = 0;
+
+	      /* Don't modifiy errno when reaching EOF.  */
+	      if (bytes == 0)
+		__set_errno (saved_errno);
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
+
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct dirent *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      /* Skip deleted files.  */
+    } while (dp->d_ino == 0);
+
+#if IS_IN (libc)
+  __libc_lock_unlock (dirp->lock);
+#endif
+
+  return dp;
+}
+weak_alias (__readdir, readdir)
+
 #endif
diff --git a/sysdeps/unix/sysv/linux/readdir64.c b/sysdeps/unix/sysv/linux/readdir64.c
index 170a889c51..1aa6e2664f 100644
--- a/sysdeps/unix/sysv/linux/readdir64.c
+++ b/sysdeps/unix/sysv/linux/readdir64.c
@@ -23,17 +23,71 @@
 #define readdir   __no_readdir_decl
 #define __readdir __no___readdir_decl
 #include <dirent.h>
+#undef __readdir
+#undef readdir
 
-#define __READDIR   __readdir64
-#define __GETDENTS  __getdents64
-#define DIRENT_TYPE struct dirent64
+/* Read a directory entry from DIRP.  */
+struct dirent64 *
+__readdir64 (DIR *dirp)
+{
+  struct dirent64 *dp;
+  int saved_errno = errno;
 
-#include <sysdeps/posix/readdir.c>
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
 
-#undef __readdir
-#undef readdir
+  do
+    {
+      size_t reclen;
+
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  bytes = __getdents64 (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		bytes = 0;
+
+	      /* Don't modifiy errno when reaching EOF.  */
+	      if (bytes == 0)
+		__set_errno (saved_errno);
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
 
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct dirent64 *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      /* Skip deleted files.  */
+    } while (dp->d_ino == 0);
+
+#if IS_IN (libc)
+  __libc_lock_unlock (dirp->lock);
+#endif
+
+  return dp;
+}
 libc_hidden_def (__readdir64)
+
 #if _DIRENT_MATCHES_DIRENT64
 strong_alias (__readdir64, __readdir)
 weak_alias (__readdir64, readdir64)
@@ -49,10 +103,67 @@ versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
 # endif
 # if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
 #  include <olddirent.h>
-#  define __READDIR   attribute_compat_text_section __old_readdir64
-#  define __GETDENTS  __old_getdents64
-#  define DIRENT_TYPE struct __old_dirent64
-#  include <sysdeps/posix/readdir.c>
+
+attribute_compat_text_section
+struct __old_dirent64 *
+__old_readdir64 (DIR *dirp)
+{
+  struct __old_dirent64 *dp;
+  int saved_errno = errno;
+
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
+
+  do
+    {
+      size_t reclen;
+
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  bytes = __old_getdents64 (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		bytes = 0;
+
+	      /* Don't modifiy errno when reaching EOF.  */
+	      if (bytes == 0)
+		__set_errno (saved_errno);
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
+
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct __old_dirent64 *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      /* Skip deleted files.  */
+    } while (dp->d_ino == 0);
+
+#if IS_IN (libc)
+  __libc_lock_unlock (dirp->lock);
+#endif
+
+  return dp;
+}
 libc_hidden_def (__old_readdir64)
 compat_symbol (libc, __old_readdir64, readdir64, GLIBC_2_1);
 # endif /* SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)  */
diff --git a/sysdeps/unix/sysv/linux/readdir64_r.c b/sysdeps/unix/sysv/linux/readdir64_r.c
index 6d589f36f5..c587787417 100644
--- a/sysdeps/unix/sysv/linux/readdir64_r.c
+++ b/sysdeps/unix/sysv/linux/readdir64_r.c
@@ -23,15 +23,100 @@
 #define readdir_r   __no_readdir_r_decl
 #define __readdir_r __no___readdir_r_decl
 #include <dirent.h>
+#undef __readdir_r
+#undef readdir_r
 
-#define __READDIR_R __readdir64_r
-#define __GETDENTS  __getdents64
-#define DIRENT_TYPE struct dirent64
+/* Read a directory entry from DIRP.  */
+int
+__readdir64_r (DIR *dirp, struct dirent64 *entry, struct dirent64 **result)
+{
+  struct dirent64 *dp;
+  size_t reclen;
+  const int saved_errno = errno;
+  int ret;
 
-#include <sysdeps/posix/readdir_r.c>
+  __libc_lock_lock (dirp->lock);
+
+  do
+    {
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  maxread = dirp->allocation;
+
+	  bytes = __getdents64 (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		{
+		  bytes = 0;
+		  __set_errno (saved_errno);
+		}
+	      if (bytes < 0)
+		dirp->errcode = errno;
+
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
+
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct dirent64 *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      if (reclen > offsetof (struct dirent64, d_name) + NAME_MAX + 1)
+	{
+	  /* The record is very long.  It could still fit into the
+	     caller-supplied buffer if we can skip padding at the
+	     end.  */
+	  size_t namelen = _D_EXACT_NAMLEN (dp);
+	  if (namelen <= NAME_MAX)
+	    reclen = offsetof (struct dirent64, d_name) + namelen + 1;
+	  else
+	    {
+	      /* The name is too long.  Ignore this file.  */
+	      dirp->errcode = ENAMETOOLONG;
+	      dp->d_ino = 0;
+	      continue;
+	    }
+	}
+
+      /* Skip deleted and ignored files.  */
+    }
+  while (dp->d_ino == 0);
+
+  if (dp != NULL)
+    {
+      *result = memcpy (entry, dp, reclen);
+      entry->d_reclen = reclen;
+      ret = 0;
+    }
+  else
+    {
+      *result = NULL;
+      ret = dirp->errcode;
+    }
+
+  __libc_lock_unlock (dirp->lock);
+
+  return ret;
+}
 
-#undef __readdir_r
-#undef readdir_r
 
 #if _DIRENT_MATCHES_DIRENT64
 strong_alias (__readdir64_r, __readdir_r)
@@ -44,10 +129,99 @@ weak_alias (__readdir64_r, readdir64_r)
 versioned_symbol (libc, __readdir64_r, readdir64_r, GLIBC_2_2);
 # if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
 #  include <olddirent.h>
-#  define __READDIR_R attribute_compat_text_section __old_readdir64_r
-#  define __GETDENTS  __old_getdents64
-#  define DIRENT_TYPE struct __old_dirent64
-#  include <sysdeps/posix/readdir_r.c>
+
+int
+attribute_compat_text_section
+__old_readdir64_r (DIR *dirp, struct __old_dirent64 *entry,
+		   struct __old_dirent64 **result)
+{
+  struct __old_dirent64 *dp;
+  size_t reclen;
+  const int saved_errno = errno;
+  int ret;
+
+  __libc_lock_lock (dirp->lock);
+
+  do
+    {
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  maxread = dirp->allocation;
+
+	  bytes = __old_getdents64 (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		{
+		  bytes = 0;
+		  __set_errno (saved_errno);
+		}
+	      if (bytes < 0)
+		dirp->errcode = errno;
+
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
+
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct __old_dirent64 *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      if (reclen > offsetof (struct __old_dirent64, d_name) + NAME_MAX + 1)
+	{
+	  /* The record is very long.  It could still fit into the
+	     caller-supplied buffer if we can skip padding at the
+	     end.  */
+	  size_t namelen = _D_EXACT_NAMLEN (dp);
+	  if (namelen <= NAME_MAX)
+	    reclen = offsetof (struct __old_dirent64, d_name) + namelen + 1;
+	  else
+	    {
+	      /* The name is too long.  Ignore this file.  */
+	      dirp->errcode = ENAMETOOLONG;
+	      dp->d_ino = 0;
+	      continue;
+	    }
+	}
+
+      /* Skip deleted and ignored files.  */
+    }
+  while (dp->d_ino == 0);
+
+  if (dp != NULL)
+    {
+      *result = memcpy (entry, dp, reclen);
+      entry->d_reclen = reclen;
+      ret = 0;
+    }
+  else
+    {
+      *result = NULL;
+      ret = dirp->errcode;
+    }
+
+  __libc_lock_unlock (dirp->lock);
+
+  return ret;
+}
+
 compat_symbol (libc, __old_readdir64_r, readdir64_r, GLIBC_2_1);
 # endif /* SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)  */
 #endif /* _DIRENT_MATCHES_DIRENT64  */
diff --git a/sysdeps/unix/sysv/linux/readdir_r.c b/sysdeps/unix/sysv/linux/readdir_r.c
index 30f237dbcc..0069041394 100644
--- a/sysdeps/unix/sysv/linux/readdir_r.c
+++ b/sysdeps/unix/sysv/linux/readdir_r.c
@@ -19,5 +19,96 @@
 #include <dirent.h>
 
 #if !_DIRENT_MATCHES_DIRENT64
-# include <sysdeps/posix/readdir_r.c>
-#endif
+/* Read a directory entry from DIRP.  */
+int
+__readdir_r (DIR *dirp, struct dirent *entry, struct dirent **result)
+{
+  struct dirent *dp;
+  size_t reclen;
+  const int saved_errno = errno;
+  int ret;
+
+  __libc_lock_lock (dirp->lock);
+
+  do
+    {
+      if (dirp->offset >= dirp->size)
+	{
+	  /* We've emptied out our buffer.  Refill it.  */
+
+	  size_t maxread = dirp->allocation;
+	  ssize_t bytes;
+
+	  maxread = dirp->allocation;
+
+	  bytes = __getdents (dirp->fd, dirp->data, maxread);
+	  if (bytes <= 0)
+	    {
+	      /* On some systems getdents fails with ENOENT when the
+		 open directory has been rmdir'd already.  POSIX.1
+		 requires that we treat this condition like normal EOF.  */
+	      if (bytes < 0 && errno == ENOENT)
+		{
+		  bytes = 0;
+		  __set_errno (saved_errno);
+		}
+	      if (bytes < 0)
+		dirp->errcode = errno;
+
+	      dp = NULL;
+	      break;
+	    }
+	  dirp->size = (size_t) bytes;
+
+	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+	}
+
+      dp = (struct dirent *) &dirp->data[dirp->offset];
+
+      reclen = dp->d_reclen;
+
+      dirp->offset += reclen;
+
+      dirp->filepos = dp->d_off;
+
+      if (reclen > offsetof (struct dirent, d_name) + NAME_MAX + 1)
+	{
+	  /* The record is very long.  It could still fit into the
+	     caller-supplied buffer if we can skip padding at the
+	     end.  */
+	  size_t namelen = _D_EXACT_NAMLEN (dp);
+	  if (namelen <= NAME_MAX)
+	    reclen = offsetof (struct dirent, d_name) + namelen + 1;
+	  else
+	    {
+	      /* The name is too long.  Ignore this file.  */
+	      dirp->errcode = ENAMETOOLONG;
+	      dp->d_ino = 0;
+	      continue;
+	    }
+	}
+
+      /* Skip deleted and ignored files.  */
+    }
+  while (dp->d_ino == 0);
+
+  if (dp != NULL)
+    {
+      *result = memcpy (entry, dp, reclen);
+      entry->d_reclen = reclen;
+      ret = 0;
+    }
+  else
+    {
+      *result = NULL;
+      ret = dirp->errcode;
+    }
+
+  __libc_lock_unlock (dirp->lock);
+
+  return ret;
+}
+
+weak_alias (__readdir_r, readdir_r)
+#endif /* _DIRENT_MATCHES_DIRENT64  */
diff --git a/sysdeps/posix/rewinddir.c b/sysdeps/unix/sysv/linux/rewinddir.c
similarity index 100%
rename from sysdeps/posix/rewinddir.c
rename to sysdeps/unix/sysv/linux/rewinddir.c
diff --git a/sysdeps/posix/seekdir.c b/sysdeps/unix/sysv/linux/seekdir.c
similarity index 100%
rename from sysdeps/posix/seekdir.c
rename to sysdeps/unix/sysv/linux/seekdir.c
diff --git a/sysdeps/posix/telldir.c b/sysdeps/unix/sysv/linux/telldir.c
similarity index 100%
rename from sysdeps/posix/telldir.c
rename to sysdeps/unix/sysv/linux/telldir.c
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/9] linux: Simplify opendir buffer allocation
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 1/9] linux: Move posix dir implementations to Linux Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-13 15:34   ` Florian Weimer via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 3/9] linux: Add __readdir_unlocked Adhemerval Zanella via Libc-alpha
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

The fallback allocation is removed, so the possible size constraint
should be analyzed just once; __alloc_dir assumes that 'statp'
argument is non-null, and the max_buffer_size move to close its
used.

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 include/dirent.h                  |  3 +-
 sysdeps/unix/sysv/linux/opendir.c | 51 +++++++++++--------------------
 2 files changed, 20 insertions(+), 34 deletions(-)

diff --git a/include/dirent.h b/include/dirent.h
index 2b1cdcf8bd..fdf4c4a2f1 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -48,7 +48,8 @@ extern int __versionsort64 (const struct dirent64 **a,
 			    const struct dirent64 **b)
      __attribute_pure__;
 extern DIR *__alloc_dir (int fd, bool close_fd, int flags,
-			 const struct stat64 *statp) attribute_hidden;
+			 const struct stat64 *statp)
+     __nonnull (4) attribute_hidden;
 extern __typeof (rewinddir) __rewinddir;
 extern __typeof (seekdir) __seekdir;
 extern __typeof (dirfd) __dirfd;
diff --git a/sysdeps/unix/sysv/linux/opendir.c b/sysdeps/unix/sysv/linux/opendir.c
index e89e09bfc7..2198224588 100644
--- a/sysdeps/unix/sysv/linux/opendir.c
+++ b/sysdeps/unix/sysv/linux/opendir.c
@@ -23,12 +23,6 @@
 
 #include <not-cancel.h>
 
-/* The st_blksize value of the directory is used as a hint for the
-   size of the buffer which receives struct dirent values from the
-   kernel.  st_blksize is limited to MAX_DIR_BUFFER_SIZE, in case the
-   file system provides a bogus value.  */
-#define MAX_DIR_BUFFER_SIZE 1048576U
-
 enum {
   opendir_oflags = O_RDONLY|O_NDELAY|O_DIRECTORY|O_LARGEFILE|O_CLOEXEC
 };
@@ -100,38 +94,29 @@ __alloc_dir (int fd, bool close_fd, int flags, const struct stat64 *statp)
      file descriptor.  */
   if (!close_fd
       && __glibc_unlikely (__fcntl64_nocancel (fd, F_SETFD, FD_CLOEXEC) < 0))
-	goto lose;
-
-  const size_t default_allocation = (4 * BUFSIZ < sizeof (struct dirent64)
-				     ? sizeof (struct dirent64) : 4 * BUFSIZ);
-  const size_t small_allocation = (BUFSIZ < sizeof (struct dirent64)
-				   ? sizeof (struct dirent64) : BUFSIZ);
-  size_t allocation = default_allocation;
-#ifdef _STATBUF_ST_BLKSIZE
+    return NULL;
+
+  /* The st_blksize value of the directory is used as a hint for the
+     size of the buffer which receives struct dirent values from the
+     kernel.  st_blksize is limited to max_buffer_size, in case the
+     file system provides a bogus value.  */
+  enum { max_buffer_size = 1048576 };
+
+  const size_t allocation_size = 32768;
+  _Static_assert (allocation_size >= sizeof (struct dirent64),
+		  "allocation_size < sizeof (struct dirent64)");
+
   /* Increase allocation if requested, but not if the value appears to
-     be bogus.  */
-  if (statp != NULL)
-    allocation = MIN (MAX ((size_t) statp->st_blksize, default_allocation),
-		      MAX_DIR_BUFFER_SIZE);
-#endif
+     be bogus.  It will be between 32Kb and 1Mb.  */
+  size_t allocation = MIN (MAX ((size_t) statp->st_blksize, allocation_size),
+			   max_buffer_size);
 
   DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
   if (dirp == NULL)
     {
-      allocation = small_allocation;
-      dirp = (DIR *) malloc (sizeof (DIR) + allocation);
-
-      if (dirp == NULL)
-      lose:
-	{
-	  if (close_fd)
-	    {
-	      int save_errno = errno;
-	      __close_nocancel_nostatus (fd);
-	      __set_errno (save_errno);
-	    }
-	  return NULL;
-	}
+      if (close_fd)
+	__close_nocancel_nostatus (fd);
+      return NULL;
     }
 
   dirp->fd = fd;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 3/9] linux: Add __readdir_unlocked
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 1/9] linux: Move posix dir implementations to Linux Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 2/9] linux: Simplify opendir buffer allocation Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-13 15:43   ` Florian Weimer via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir Adhemerval Zanella via Libc-alpha
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

And use it on readdir_r implementation.

Checked on i686-linux-gnu.
---
 include/dirent.h                    |  1 +
 sysdeps/unix/sysv/linux/readdir.c   | 18 +++++--
 sysdeps/unix/sysv/linux/readdir_r.c | 79 +++++++----------------------
 3 files changed, 31 insertions(+), 67 deletions(-)

diff --git a/include/dirent.h b/include/dirent.h
index fdf4c4a2f1..8325a19e5f 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -20,6 +20,7 @@ extern DIR *__opendirat (int dfd, const char *__name) attribute_hidden;
 extern DIR *__fdopendir (int __fd) attribute_hidden;
 extern int __closedir (DIR *__dirp) attribute_hidden;
 extern struct dirent *__readdir (DIR *__dirp) attribute_hidden;
+extern struct dirent *__readdir_unlocked (DIR *__dirp) attribute_hidden;
 extern struct dirent64 *__readdir64 (DIR *__dirp);
 libc_hidden_proto (__readdir64)
 extern int __readdir_r (DIR *__dirp, struct dirent *__entry,
diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
index 2e03e66e69..ca2a8964e9 100644
--- a/sysdeps/unix/sysv/linux/readdir.c
+++ b/sysdeps/unix/sysv/linux/readdir.c
@@ -23,15 +23,11 @@
 
 /* Read a directory entry from DIRP.  */
 struct dirent *
-__readdir (DIR *dirp)
+__readdir_unlocked (DIR *dirp)
 {
   struct dirent *dp;
   int saved_errno = errno;
 
-#if IS_IN (libc)
-  __libc_lock_lock (dirp->lock);
-#endif
-
   do
     {
       size_t reclen;
@@ -75,6 +71,18 @@ __readdir (DIR *dirp)
       /* Skip deleted files.  */
     } while (dp->d_ino == 0);
 
+  return dp;
+}
+
+struct dirent *
+__readdir (DIR *dirp)
+{
+  struct dirent *dp;
+
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
+  dp = __readdir_unlocked (dirp);
 #if IS_IN (libc)
   __libc_lock_unlock (dirp->lock);
 #endif
diff --git a/sysdeps/unix/sysv/linux/readdir_r.c b/sysdeps/unix/sysv/linux/readdir_r.c
index 0069041394..a01d2845a6 100644
--- a/sysdeps/unix/sysv/linux/readdir_r.c
+++ b/sysdeps/unix/sysv/linux/readdir_r.c
@@ -25,89 +25,44 @@ __readdir_r (DIR *dirp, struct dirent *entry, struct dirent **result)
 {
   struct dirent *dp;
   size_t reclen;
-  const int saved_errno = errno;
-  int ret;
 
   __libc_lock_lock (dirp->lock);
 
-  do
+  while (1)
     {
-      if (dirp->offset >= dirp->size)
-	{
-	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread = dirp->allocation;
-	  ssize_t bytes;
-
-	  maxread = dirp->allocation;
-
-	  bytes = __getdents (dirp->fd, dirp->data, maxread);
-	  if (bytes <= 0)
-	    {
-	      /* On some systems getdents fails with ENOENT when the
-		 open directory has been rmdir'd already.  POSIX.1
-		 requires that we treat this condition like normal EOF.  */
-	      if (bytes < 0 && errno == ENOENT)
-		{
-		  bytes = 0;
-		  __set_errno (saved_errno);
-		}
-	      if (bytes < 0)
-		dirp->errcode = errno;
-
-	      dp = NULL;
-	      break;
-	    }
-	  dirp->size = (size_t) bytes;
-
-	  /* Reset the offset into the buffer.  */
-	  dirp->offset = 0;
-	}
-
-      dp = (struct dirent *) &dirp->data[dirp->offset];
+      dp = __readdir_unlocked (dirp);
+      if (dp == NULL)
+	break;
 
       reclen = dp->d_reclen;
+      if (reclen <= offsetof (struct dirent, d_name) + NAME_MAX + 1)
+	break;
 
-      dirp->offset += reclen;
-
-      dirp->filepos = dp->d_off;
-
-      if (reclen > offsetof (struct dirent, d_name) + NAME_MAX + 1)
+      /* The record is very long.  It could still fit into the caller-supplied
+	 buffer if we can skip padding at the end.  */
+      size_t namelen = _D_EXACT_NAMLEN (dp);
+      if (namelen <= NAME_MAX)
 	{
-	  /* The record is very long.  It could still fit into the
-	     caller-supplied buffer if we can skip padding at the
-	     end.  */
-	  size_t namelen = _D_EXACT_NAMLEN (dp);
-	  if (namelen <= NAME_MAX)
-	    reclen = offsetof (struct dirent, d_name) + namelen + 1;
-	  else
-	    {
-	      /* The name is too long.  Ignore this file.  */
-	      dirp->errcode = ENAMETOOLONG;
-	      dp->d_ino = 0;
-	      continue;
-	    }
+	  reclen = offsetof (struct dirent, d_name) + namelen + 1;
+	  break;
 	}
 
-      /* Skip deleted and ignored files.  */
+      /* The name is too long.  Ignore this file.  */
+      dirp->errcode = ENAMETOOLONG;
+      dp->d_ino = 0;
     }
-  while (dp->d_ino == 0);
 
   if (dp != NULL)
     {
       *result = memcpy (entry, dp, reclen);
       entry->d_reclen = reclen;
-      ret = 0;
     }
   else
-    {
-      *result = NULL;
-      ret = dirp->errcode;
-    }
+    *result = NULL;
 
   __libc_lock_unlock (dirp->lock);
 
-  return ret;
+  return dp != NULL ? 0 : dirp->errcode;
 }
 
 weak_alias (__readdir_r, readdir_r)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (2 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 3/9] linux: Add __readdir_unlocked Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-13 15:59   ` Florian Weimer via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050] Adhemerval Zanella via Libc-alpha
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

It reserves some space on the allocated internal buffer to be
used as a the returned dirent struct.  The kernel obtained dirent64
struct are copied to the temporary buffer on each readdir call.

The overflow test is moved once the dirent64 entry is copied
to the temporary buffer, and a subsequent readdir will obtain the
next entry.  The idea is an overflow fails to return the entry on
readdir, but a next readdir might still obtain the next entry.
(for filesystem that does not have the concept of sequential d_off,
such as ext4).

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 sysdeps/unix/sysv/linux/opendir.c |   6 +-
 sysdeps/unix/sysv/linux/readdir.c |  25 +++----
 sysdeps/unix/sysv/linux/readdir.h | 117 ++++++++++++++++++++++++++++++
 3 files changed, 131 insertions(+), 17 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/readdir.h

diff --git a/sysdeps/unix/sysv/linux/opendir.c b/sysdeps/unix/sysv/linux/opendir.c
index 2198224588..24bd63d2ba 100644
--- a/sysdeps/unix/sysv/linux/opendir.c
+++ b/sysdeps/unix/sysv/linux/opendir.c
@@ -22,6 +22,7 @@
 #include <sys/param.h>	/* For MIN and MAX.  */
 
 #include <not-cancel.h>
+#include <readdir.h>    /* For return_buffer_size.  */
 
 enum {
   opendir_oflags = O_RDONLY|O_NDELAY|O_DIRECTORY|O_LARGEFILE|O_CLOEXEC
@@ -103,8 +104,9 @@ __alloc_dir (int fd, bool close_fd, int flags, const struct stat64 *statp)
   enum { max_buffer_size = 1048576 };
 
   const size_t allocation_size = 32768;
-  _Static_assert (allocation_size >= sizeof (struct dirent64),
-		  "allocation_size < sizeof (struct dirent64)");
+  _Static_assert (allocation_size >= sizeof (struct dirent64)
+				     + return_buffer_size,
+		  "opendir buffer size smaller than required");
 
   /* Increase allocation if requested, but not if the value appears to
      be bogus.  It will be between 32Kb and 1Mb.  */
diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
index ca2a8964e9..8eab0f4c9b 100644
--- a/sysdeps/unix/sysv/linux/readdir.c
+++ b/sysdeps/unix/sysv/linux/readdir.c
@@ -19,7 +19,7 @@
 #include <dirent.h>
 
 #if !_DIRENT_MATCHES_DIRENT64
-#include <dirstream.h>
+#include <readdir.h>
 
 /* Read a directory entry from DIRP.  */
 struct dirent *
@@ -30,16 +30,12 @@ __readdir_unlocked (DIR *dirp)
 
   do
     {
-      size_t reclen;
-
       if (dirp->offset >= dirp->size)
 	{
 	  /* We've emptied out our buffer.  Refill it.  */
 
-	  size_t maxread = dirp->allocation;
-	  ssize_t bytes;
-
-	  bytes = __getdents (dirp->fd, dirp->data, maxread);
+	  ssize_t bytes = __getdents64 (dirp->fd, dirstream_data (dirp),
+					dirstream_alloc_size (dirp));
 	  if (bytes <= 0)
 	    {
 	      /* On some systems getdents fails with ENOENT when the
@@ -54,19 +50,18 @@ __readdir_unlocked (DIR *dirp)
 	      dp = NULL;
 	      break;
 	    }
-	  dirp->size = (size_t) bytes;
+	  dirp->size = bytes;
 
 	  /* Reset the offset into the buffer.  */
 	  dirp->offset = 0;
 	}
 
-      dp = (struct dirent *) &dirp->data[dirp->offset];
-
-      reclen = dp->d_reclen;
-
-      dirp->offset += reclen;
-
-      dirp->filepos = dp->d_off;
+      dp = dirstream_ret_entry (dirp);
+      if (dp == NULL)
+	{
+	  __set_errno (EOVERFLOW);
+	  break;
+	}
 
       /* Skip deleted files.  */
     } while (dp->d_ino == 0);
diff --git a/sysdeps/unix/sysv/linux/readdir.h b/sysdeps/unix/sysv/linux/readdir.h
new file mode 100644
index 0000000000..4dc219e220
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/readdir.h
@@ -0,0 +1,117 @@
+/* Linux readdir internal implementation details.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _DIRSTREAM_NOLFS_H
+#define _DIRSTREAM_NOLFS_H
+
+#if !_DIRENT_MATCHES_DIRENT64
+# include <dirstream.h>
+
+/* getdents64 is used internally for both LFS and non-LFS implementations.
+   The non-LFS interface reserves part of the allocated buffer to return the
+   non-LFS 'struct dirent' entry.  */
+
+/* This defines the reserved space size on DIR internal buffer to use as the
+   returned 'struct dirent' from a 'readdir' call.
+
+   The largest possible practical length of the d_name member are 255
+   Unicode characters in UTF-8 encoding, so d_name is 766 bytes long, plus
+   10 bytes from header, for a total of 776 bytes total.
+
+   Also it should take in cosideration the alignment requirement for
+   getdents64 call.  */
+enum { return_buffer_size = 1024
+			    + sizeof (off64_t)
+			    - _Alignof (((struct __dirstream) {0}).data) };
+
+_Static_assert ((_Alignof (((struct __dirstream) {0}).data)
+		 + return_buffer_size) % sizeof (off64_t) == 0,
+		"return_buffer_size does not align the buffer properly");
+
+/* Return the avaliable buffer size to use with getdents64 calls.  */
+static inline size_t
+dirstream_alloc_size (struct __dirstream *ds)
+{
+  return ds->allocation - return_buffer_size;
+}
+
+/* Return the start of the allocated buffer minus the reserved part to use on
+   non-LFS readdir call.  */
+static inline void *
+dirstream_data (struct __dirstream *ds)
+{
+  return (char *) ds->data + return_buffer_size;
+}
+
+/* Return the allocated buffer used on non-LFS readdir call.  */
+static inline struct dirent *
+dirstream_ret (struct __dirstream *ds)
+{
+  return (struct dirent *) ds->data;
+}
+
+/* Return the current dirent64 entry from the reserved buffer used on
+   getdent64.  */
+static inline struct dirent64 *
+dirstream_entry (struct __dirstream *ds)
+{
+  size_t offset = return_buffer_size + ds->offset;
+  return (struct dirent64 *) ((char *) ds->data + offset);
+}
+
+/* Copy one obtained entry from 'getdents64' call to the reserved space
+   on DS allocated buffer and updated its internal state.  */
+static inline struct dirent *
+dirstream_ret_entry (struct __dirstream *ds)
+{
+  struct dirent64 *dp64 = dirstream_entry (ds);
+  struct dirent *dp = dirstream_ret (ds);
+
+  dp->d_ino = dp64->d_ino;
+
+  dp->d_off = dp64->d_off;
+  if (dp->d_off != dp64->d_off)
+    /* Overflow.  */
+    return NULL;
+
+  const size_t size_diff = (offsetof (struct dirent64, d_name)
+			    - offsetof (struct dirent, d_name));
+  const size_t alignment = _Alignof (struct dirent);
+  size_t new_reclen  = (dp64->d_reclen - size_diff + alignment - 1)
+			& ~(alignment - 1);
+  if (new_reclen > return_buffer_size)
+    /* Overflow.  */
+    return NULL;
+  dp->d_reclen = new_reclen;
+
+  dp->d_type = dp64->d_type;
+
+  memcpy (dp->d_name, dp64->d_name,
+	  dp64->d_reclen - offsetof (struct dirent64, d_name));
+
+  ds->offset += dp64->d_reclen;
+  ds->filepos = dp64->d_off;
+
+  return dp;
+}
+#else
+/* No need to reserve an buffer space if dirent has already LFS support.  */
+enum { return_buffer_size = 0 };
+#endif /* _DIRENT_MATCHES_DIRENT64  */
+
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050]
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (3 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-13 16:00   ` Florian Weimer via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 6/9] linux: Add __readdir64_unlocked Adhemerval Zanella via Libc-alpha
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

It allows to obtain the expected entry offset on telldir and set
it correctly on seekdir on platforms where long int is smaller
than off64_t.

On such cases telldir will mantain an internal list that maps the
DIR object off64_t offsets to the returned long int (the function
return value).  The seekdir will then set the correct offset from
the internal list using the telldir as the list key.

It also removes the overflow check on readdir and the returned value
will be truncated by the non-LFS off_t size.  As Joseph has noted
in BZ #23960 comment #22, d_off is an opaque value and since
telldir/seekdir works regardless of the returned dirent d_off value.

Finally it removed the requirement to check for overflow values on
telldir (BZ #24050).

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and arm-linux-gnueabihf.
---
 dirent/Makefile                     |   2 +-
 dirent/tst-seekdir2.c               | 156 ++++++++++++++++++++++++++++
 support/temp_file.c                 |  12 ++-
 support/temp_file.h                 |   7 ++
 sysdeps/unix/sysv/linux/closedir.c  |   4 +
 sysdeps/unix/sysv/linux/dirstream.h |   7 +-
 sysdeps/unix/sysv/linux/opendir.c   |   3 +
 sysdeps/unix/sysv/linux/readdir.c   |   1 +
 sysdeps/unix/sysv/linux/readdir.h   |   7 +-
 sysdeps/unix/sysv/linux/rewinddir.c |   5 +
 sysdeps/unix/sysv/linux/seekdir.c   |  36 ++++++-
 sysdeps/unix/sysv/linux/telldir.c   |  47 ++++++++-
 sysdeps/unix/sysv/linux/telldir.h   |  64 ++++++++++++
 13 files changed, 333 insertions(+), 18 deletions(-)
 create mode 100644 dirent/tst-seekdir2.c
 create mode 100644 sysdeps/unix/sysv/linux/telldir.h

diff --git a/dirent/Makefile b/dirent/Makefile
index e917d5ceab..ba9755da64 100644
--- a/dirent/Makefile
+++ b/dirent/Makefile
@@ -31,7 +31,7 @@ routines	:= opendir closedir readdir readdir_r rewinddir \
 		   scandir-cancel scandir-tail scandir64-tail
 
 tests	   := list tst-seekdir opendir-tst1 bug-readdir1 tst-fdopendir \
-	      tst-fdopendir2 tst-scandir tst-scandir64
+	      tst-fdopendir2 tst-scandir tst-scandir64 tst-seekdir2
 
 CFLAGS-scandir.c += $(uses-callbacks)
 CFLAGS-scandir64.c += $(uses-callbacks)
diff --git a/dirent/tst-seekdir2.c b/dirent/tst-seekdir2.c
new file mode 100644
index 0000000000..4bd2509f72
--- /dev/null
+++ b/dirent/tst-seekdir2.c
@@ -0,0 +1,156 @@
+/* Check multiple telldir and seekdir.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dirent.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/check.h>
+
+/* Some filesystems returns a arbitrary value for d_off direnty entry (ext4
+   for instance, where the value is an internal hash key).  The idea of
+   create a large number of file is to try trigger a overflow d_off value
+   in a entry to check if telldir/seekdir does work corretly in such
+   case.  */
+static const char *dirname;
+static const size_t nfiles = 10240;
+
+static void
+do_prepare (int argc, char *argv[])
+{
+  dirname = support_create_temp_directory ("tst-seekdir2-");
+
+  for (size_t i = 0; i < nfiles; i++)
+    {
+      int fd = create_temp_file_in_dir ("tempfile.", dirname, NULL);
+      TEST_VERIFY_EXIT (fd > 0);
+      close (fd);
+    }
+}
+#define PREPARE do_prepare
+
+/* Check for old non Large File Support (LFS).  */
+static int
+do_test_not_lfs (void)
+{
+  DIR *dirp;
+  struct dirent *dp;
+  size_t dirp_count;
+
+  dirp = opendir (dirname);
+  TEST_VERIFY_EXIT (dirp != NULL);
+
+  dirp_count = 0;
+  for (dp = readdir (dirp);
+       dp != NULL;
+       dp = readdir (dirp))
+    dirp_count++;
+
+  /* The 2 extra files are '.' and '..'.  */
+  TEST_COMPARE (dirp_count, nfiles + 2);
+
+  rewinddir (dirp);
+
+  long *tdirp = xmalloc (dirp_count * sizeof (long));
+  struct dirent *ddirp = xmalloc (dirp_count * sizeof (struct dirent));
+
+  size_t i = 0;
+  do
+    {
+      tdirp[i] = telldir (dirp);
+      dp = readdir (dirp);
+      TEST_VERIFY_EXIT (dp != NULL);
+      memcpy (&ddirp[i], dp, sizeof (struct dirent));
+    } while (++i < dirp_count);
+
+  for (i = 0; i < dirp_count - 1; i++)
+    {
+      seekdir (dirp, tdirp[i]);
+      dp = readdir (dirp);
+      TEST_COMPARE (strcmp (dp->d_name, ddirp[i].d_name), 0);
+      TEST_COMPARE (dp->d_ino, ddirp[i].d_ino);
+      TEST_COMPARE (dp->d_off, ddirp[i].d_off);
+    }
+
+  closedir (dirp);
+
+  return 0;
+}
+
+/* Same as before but with LFS support.  */
+static int
+do_test_lfs (void)
+{
+  DIR *dirp;
+  struct dirent64 *dp;
+  size_t dirp_count;
+
+  dirp = opendir (dirname);
+  TEST_VERIFY_EXIT (dirp != NULL);
+
+  dirp_count = 0;
+  for (dp = readdir64 (dirp);
+       dp != NULL;
+       dp = readdir64 (dirp))
+    dirp_count++;
+
+  /* The 2 extra files are '.' and '..'.  */
+  TEST_COMPARE (dirp_count, nfiles + 2);
+
+  rewinddir (dirp);
+
+  long *tdirp = xmalloc (dirp_count * sizeof (long));
+  struct dirent64 *ddirp = xmalloc (dirp_count * sizeof (struct dirent64));
+
+  size_t i = 0;
+  do
+    {
+      tdirp[i] = telldir (dirp);
+      dp = readdir64 (dirp);
+      TEST_VERIFY_EXIT (dp != NULL);
+      memcpy (&ddirp[i], dp, sizeof (struct dirent64));
+    } while (++i < dirp_count);
+
+  for (i = 0; i < dirp_count - 1; i++)
+    {
+      seekdir (dirp, tdirp[i]);
+      dp = readdir64 (dirp);
+      TEST_COMPARE (strcmp (dp->d_name, ddirp[i].d_name), 0);
+      TEST_COMPARE (dp->d_ino, ddirp[i].d_ino);
+      TEST_COMPARE (dp->d_off, ddirp[i].d_off);
+    }
+
+  closedir (dirp);
+
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  do_test_not_lfs ();
+  do_test_lfs ();
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/support/temp_file.c b/support/temp_file.c
index 277c5e0cf1..98bd235526 100644
--- a/support/temp_file.c
+++ b/support/temp_file.c
@@ -60,14 +60,12 @@ add_temp_file (const char *name)
 }
 
 int
-create_temp_file (const char *base, char **filename)
+create_temp_file_in_dir (const char *base, const char *dir, char **filename)
 {
   char *fname;
   int fd;
 
-  fname = (char *) xmalloc (strlen (test_dir) + 1 + strlen (base)
-			    + sizeof ("XXXXXX"));
-  strcpy (stpcpy (stpcpy (stpcpy (fname, test_dir), "/"), base), "XXXXXX");
+  fname = xasprintf ("%s/%sXXXXXX", dir, base);
 
   fd = mkstemp (fname);
   if (fd == -1)
@@ -86,6 +84,12 @@ create_temp_file (const char *base, char **filename)
   return fd;
 }
 
+int
+create_temp_file (const char *base, char **filename)
+{
+  return create_temp_file_in_dir (base, test_dir, filename);
+}
+
 char *
 support_create_temp_directory (const char *base)
 {
diff --git a/support/temp_file.h b/support/temp_file.h
index 8b6303a6e4..ac61105428 100644
--- a/support/temp_file.h
+++ b/support/temp_file.h
@@ -32,6 +32,13 @@ void add_temp_file (const char *name);
    *FILENAME.  */
 int create_temp_file (const char *base, char **filename);
 
+/* Create a temporary file in directory DIR.  Return the opened file
+   descriptor on success, or -1 on failure.  Write the file name to
+   *FILENAME if FILENAME is not NULL.  In this case, the caller is
+   expected to free *FILENAME.  */
+int create_temp_file_in_dir (const char *base, const char *dir,
+			     char **filename);
+
 /* Create a temporary directory and schedule it for deletion.  BASE is
    used as a prefix for the unique directory name, which the function
    returns.  The caller should free this string.  */
diff --git a/sysdeps/unix/sysv/linux/closedir.c b/sysdeps/unix/sysv/linux/closedir.c
index ccc19eefcd..c39f58aba5 100644
--- a/sysdeps/unix/sysv/linux/closedir.c
+++ b/sysdeps/unix/sysv/linux/closedir.c
@@ -43,6 +43,10 @@ __closedir (DIR *dirp)
 
   fd = dirp->fd;
 
+#ifndef __LP64__
+  dirstream_loc_clear (&dirp->locs);
+#endif
+
 #if IS_IN (libc)
   __libc_lock_fini (dirp->lock);
 #endif
diff --git a/sysdeps/unix/sysv/linux/dirstream.h b/sysdeps/unix/sysv/linux/dirstream.h
index a3ea2b7197..664d73cd40 100644
--- a/sysdeps/unix/sysv/linux/dirstream.h
+++ b/sysdeps/unix/sysv/linux/dirstream.h
@@ -21,6 +21,7 @@
 #include <sys/types.h>
 
 #include <libc-lock.h>
+#include <telldir.h>
 
 /* Directory stream type.
 
@@ -37,10 +38,14 @@ struct __dirstream
     size_t size;		/* Total valid data in the block.  */
     size_t offset;		/* Current offset into the block.  */
 
-    off_t filepos;		/* Position of next entry to read.  */
+    off64_t filepos;		/* Position of next entry to read.  */
 
     int errcode;		/* Delayed error code.  */
 
+#ifndef __LP64__
+    struct dirstream_loc_t locs;
+#endif
+
     /* Directory block.  We must make sure that this block starts
        at an address that is aligned adequately enough to store
        dirent entries.  Using the alignment of "void *" is not
diff --git a/sysdeps/unix/sysv/linux/opendir.c b/sysdeps/unix/sysv/linux/opendir.c
index 24bd63d2ba..fb32c95ad0 100644
--- a/sysdeps/unix/sysv/linux/opendir.c
+++ b/sysdeps/unix/sysv/linux/opendir.c
@@ -130,6 +130,9 @@ __alloc_dir (int fd, bool close_fd, int flags, const struct stat64 *statp)
   dirp->offset = 0;
   dirp->filepos = 0;
   dirp->errcode = 0;
+#ifndef __LP64__
+  dirstream_loc_init (&dirp->locs);
+#endif
 
   return dirp;
 }
diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
index 8eab0f4c9b..010ccf0a00 100644
--- a/sysdeps/unix/sysv/linux/readdir.c
+++ b/sysdeps/unix/sysv/linux/readdir.c
@@ -17,6 +17,7 @@
    <https://www.gnu.org/licenses/>.  */
 
 #include <dirent.h>
+#include <unistd.h>
 
 #if !_DIRENT_MATCHES_DIRENT64
 #include <readdir.h>
diff --git a/sysdeps/unix/sysv/linux/readdir.h b/sysdeps/unix/sysv/linux/readdir.h
index 4dc219e220..7f6cf3783d 100644
--- a/sysdeps/unix/sysv/linux/readdir.h
+++ b/sysdeps/unix/sysv/linux/readdir.h
@@ -85,15 +85,12 @@ dirstream_ret_entry (struct __dirstream *ds)
   dp->d_ino = dp64->d_ino;
 
   dp->d_off = dp64->d_off;
-  if (dp->d_off != dp64->d_off)
-    /* Overflow.  */
-    return NULL;
 
   const size_t size_diff = (offsetof (struct dirent64, d_name)
 			    - offsetof (struct dirent, d_name));
   const size_t alignment = _Alignof (struct dirent);
-  size_t new_reclen  = (dp64->d_reclen - size_diff + alignment - 1)
-			& ~(alignment - 1);
+  size_t new_reclen = (dp64->d_reclen - size_diff + alignment - 1)
+		       & ~(alignment - 1);
   if (new_reclen > return_buffer_size)
     /* Overflow.  */
     return NULL;
diff --git a/sysdeps/unix/sysv/linux/rewinddir.c b/sysdeps/unix/sysv/linux/rewinddir.c
index 860bfda004..8db0d0be4a 100644
--- a/sysdeps/unix/sysv/linux/rewinddir.c
+++ b/sysdeps/unix/sysv/linux/rewinddir.c
@@ -33,6 +33,11 @@ __rewinddir (DIR *dirp)
   dirp->offset = 0;
   dirp->size = 0;
   dirp->errcode = 0;
+
+#ifndef __LP64__
+  dirstream_loc_clear (&dirp->locs);
+#endif
+
 #if IS_IN (libc)
   __libc_lock_unlock (dirp->lock);
 #endif
diff --git a/sysdeps/unix/sysv/linux/seekdir.c b/sysdeps/unix/sysv/linux/seekdir.c
index 3c30520928..5f39ef9eef 100644
--- a/sysdeps/unix/sysv/linux/seekdir.c
+++ b/sysdeps/unix/sysv/linux/seekdir.c
@@ -22,14 +22,40 @@
 #include <dirstream.h>
 
 /* Seek to position POS in DIRP.  */
-/* XXX should be __seekdir ? */
 void
 seekdir (DIR *dirp, long int pos)
 {
+  off64_t filepos;
+
   __libc_lock_lock (dirp->lock);
-  (void) __lseek (dirp->fd, pos, SEEK_SET);
-  dirp->size = 0;
-  dirp->offset = 0;
-  dirp->filepos = pos;
+
+#ifndef __LP64__
+  union dirstream_packed dsp;
+
+  dsp.l = pos;
+
+  if (dsp.p.is_packed == 1)
+    filepos = dsp.p.info;
+  else
+    {
+      size_t index = dsp.p.info;
+
+      if (index >= dirstream_loc_size (&dirp->locs))
+	return;
+      struct dirstream_loc *loc = dirstream_loc_at (&dirp->locs, index);
+      filepos = loc->filepos;
+    }
+#else
+  filepos = pos;
+#endif
+
+  if (dirp->filepos != filepos)
+    {
+      __lseek64 (dirp->fd, filepos, SEEK_SET);
+      dirp->filepos = filepos;
+      dirp->offset = 0;
+      dirp->size = 0;
+    }
+
   __libc_lock_unlock (dirp->lock);
 }
diff --git a/sysdeps/unix/sysv/linux/telldir.c b/sysdeps/unix/sysv/linux/telldir.c
index 57d435ed21..bb33626fa4 100644
--- a/sysdeps/unix/sysv/linux/telldir.c
+++ b/sysdeps/unix/sysv/linux/telldir.c
@@ -18,16 +18,59 @@
 #include <dirent.h>
 
 #include <dirstream.h>
+#include <telldir.h>
 
 /* Return the current position of DIRP.  */
 long int
 telldir (DIR *dirp)
 {
-  long int ret;
+#ifndef __LP64__
+  /* If the directory position fits in the packet structure returns it.
+     Otherwise, check if the position is already been recorded in the
+     dynamic array.  If not, add the new record.  */
+
+  union dirstream_packed dsp;
+  size_t i;
 
   __libc_lock_lock (dirp->lock);
-  ret = dirp->filepos;
+
+  if (dirp->filepos < (1U << 31))
+    {
+      dsp.p.is_packed = 1;
+      dsp.p.info = dirp->filepos;
+      goto out;
+    }
+
+  dsp.l = -1;
+
+  for (i = 0; i < dirstream_loc_size (&dirp->locs); i++)
+    {
+      struct dirstream_loc *loc = dirstream_loc_at (&dirp->locs, i);
+      if (loc->filepos == dirp->filepos)
+	break;
+    }
+  if (i == dirstream_loc_size (&dirp->locs))
+    {
+      dirstream_loc_add (&dirp->locs,
+	(struct dirstream_loc) { dirp->filepos });
+      if (dirstream_loc_has_failed (&dirp->locs))
+	goto out;
+    }
+
+  dsp.p.is_packed = 0;
+  /* This assignment might overflow, however most likely ENOMEM would happen
+     long before.  */
+  dsp.p.info = i;
+
+out:
   __libc_lock_unlock (dirp->lock);
 
+  return dsp.l;
+#else
+  long int ret;
+  __libc_lock_lock (dirp->lock);
+  ret = dirp->filepos;
+  __libc_lock_unlock (dirp->lock);
   return ret;
+#endif
 }
diff --git a/sysdeps/unix/sysv/linux/telldir.h b/sysdeps/unix/sysv/linux/telldir.h
new file mode 100644
index 0000000000..7c45886341
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/telldir.h
@@ -0,0 +1,64 @@
+/* Linux internal telldir definitions.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _TELLDIR_H
+#define _TELLDIR_H 1
+
+#ifndef __LP64__
+
+/* On platforms where long int is smaller than off64_t this is how the
+   returned value is encoded and returned by 'telldir'.  If the directory
+   offset can be enconded in 31 bits it is returned in the 'info' member
+   with 'is_packed' set to 1.
+
+   Otherwise, the 'info' member describes an index in a dynamic array at
+   'DIR' structure.  */
+
+union dirstream_packed
+{
+  long int l;
+  struct
+  {
+    unsigned long is_packed:1;
+    unsigned long info:31;
+  } p;
+};
+
+_Static_assert (sizeof (long int) == sizeof (union dirstream_packed),
+		"sizeof (long int) != sizeof (union dirstream_packed)");
+
+/* telldir will mantain a list of offsets that describe the obtained diretory
+   position if it can fit this information in the returned 'dirstream_packed'
+   struct.  */
+
+struct dirstream_loc
+{
+  off64_t filepos;
+};
+
+# define DYNARRAY_STRUCT  dirstream_loc_t
+# define DYNARRAY_ELEMENT struct dirstream_loc
+# define DYNARRAY_PREFIX  dirstream_loc_
+# include <malloc/dynarray-skeleton.c>
+#else
+
+_Static_assert (sizeof (long int) == sizeof (off64_t),
+		"sizeof (long int) != sizeof (off64_t)");
+#endif /* __LP64__  */
+
+#endif /* _TELLDIR_H  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 6/9] linux: Add __readdir64_unlocked
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (4 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050] Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 7/9] linux: Add __old_readdir64_unlocked Adhemerval Zanella via Libc-alpha
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

And use it on readdir_r implementation.

Checked on i686-linux-gnu.
---
 include/dirent.h                      |  1 +
 sysdeps/unix/sysv/linux/readdir64.c   | 20 +++++--
 sysdeps/unix/sysv/linux/readdir64_r.c | 80 ++++++---------------------
 3 files changed, 33 insertions(+), 68 deletions(-)

diff --git a/include/dirent.h b/include/dirent.h
index 8325a19e5f..79c8fce969 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -21,6 +21,7 @@ extern DIR *__fdopendir (int __fd) attribute_hidden;
 extern int __closedir (DIR *__dirp) attribute_hidden;
 extern struct dirent *__readdir (DIR *__dirp) attribute_hidden;
 extern struct dirent *__readdir_unlocked (DIR *__dirp) attribute_hidden;
+extern struct dirent64 *__readdir64_unlocked (DIR *__dirp) attribute_hidden;
 extern struct dirent64 *__readdir64 (DIR *__dirp);
 libc_hidden_proto (__readdir64)
 extern int __readdir_r (DIR *__dirp, struct dirent *__entry,
diff --git a/sysdeps/unix/sysv/linux/readdir64.c b/sysdeps/unix/sysv/linux/readdir64.c
index 1aa6e2664f..a774f7c6e3 100644
--- a/sysdeps/unix/sysv/linux/readdir64.c
+++ b/sysdeps/unix/sysv/linux/readdir64.c
@@ -28,15 +28,11 @@
 
 /* Read a directory entry from DIRP.  */
 struct dirent64 *
-__readdir64 (DIR *dirp)
+__readdir64_unlocked (DIR *dirp)
 {
   struct dirent64 *dp;
   int saved_errno = errno;
 
-#if IS_IN (libc)
-  __libc_lock_lock (dirp->lock);
-#endif
-
   do
     {
       size_t reclen;
@@ -80,6 +76,20 @@ __readdir64 (DIR *dirp)
       /* Skip deleted files.  */
     } while (dp->d_ino == 0);
 
+  return dp;
+}
+
+struct dirent64 *
+__readdir64 (DIR *dirp)
+{
+  struct dirent64 *dp;
+
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
+
+  dp = __readdir64_unlocked (dirp);
+
 #if IS_IN (libc)
   __libc_lock_unlock (dirp->lock);
 #endif
diff --git a/sysdeps/unix/sysv/linux/readdir64_r.c b/sysdeps/unix/sysv/linux/readdir64_r.c
index c587787417..e5ef7be4c2 100644
--- a/sysdeps/unix/sysv/linux/readdir64_r.c
+++ b/sysdeps/unix/sysv/linux/readdir64_r.c
@@ -32,89 +32,43 @@ __readdir64_r (DIR *dirp, struct dirent64 *entry, struct dirent64 **result)
 {
   struct dirent64 *dp;
   size_t reclen;
-  const int saved_errno = errno;
-  int ret;
 
   __libc_lock_lock (dirp->lock);
-
-  do
+  while (1)
     {
-      if (dirp->offset >= dirp->size)
-	{
-	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread = dirp->allocation;
-	  ssize_t bytes;
-
-	  maxread = dirp->allocation;
-
-	  bytes = __getdents64 (dirp->fd, dirp->data, maxread);
-	  if (bytes <= 0)
-	    {
-	      /* On some systems getdents fails with ENOENT when the
-		 open directory has been rmdir'd already.  POSIX.1
-		 requires that we treat this condition like normal EOF.  */
-	      if (bytes < 0 && errno == ENOENT)
-		{
-		  bytes = 0;
-		  __set_errno (saved_errno);
-		}
-	      if (bytes < 0)
-		dirp->errcode = errno;
-
-	      dp = NULL;
-	      break;
-	    }
-	  dirp->size = (size_t) bytes;
-
-	  /* Reset the offset into the buffer.  */
-	  dirp->offset = 0;
-	}
-
-      dp = (struct dirent64 *) &dirp->data[dirp->offset];
+      dp = __readdir64_unlocked (dirp);
+      if (dp == NULL)
+	break;
 
       reclen = dp->d_reclen;
+      if (reclen <= offsetof (struct dirent64, d_name) + NAME_MAX + 1)
+	break;
 
-      dirp->offset += reclen;
-
-      dirp->filepos = dp->d_off;
-
-      if (reclen > offsetof (struct dirent64, d_name) + NAME_MAX + 1)
+      /* The record is very long.  It could still fit into the caller-supplied
+	 buffer if we can skip padding at the end.  */
+      size_t namelen = _D_EXACT_NAMLEN (dp);
+      if (namelen <= NAME_MAX)
 	{
-	  /* The record is very long.  It could still fit into the
-	     caller-supplied buffer if we can skip padding at the
-	     end.  */
-	  size_t namelen = _D_EXACT_NAMLEN (dp);
-	  if (namelen <= NAME_MAX)
-	    reclen = offsetof (struct dirent64, d_name) + namelen + 1;
-	  else
-	    {
-	      /* The name is too long.  Ignore this file.  */
-	      dirp->errcode = ENAMETOOLONG;
-	      dp->d_ino = 0;
-	      continue;
-	    }
+	  reclen = offsetof (struct dirent64, d_name) + namelen + 1;
+	  break;
 	}
 
-      /* Skip deleted and ignored files.  */
+      /* The name is too long.  Ignore this file.  */
+      dirp->errcode = ENAMETOOLONG;
+      dp->d_ino = 0;
     }
-  while (dp->d_ino == 0);
 
   if (dp != NULL)
     {
       *result = memcpy (entry, dp, reclen);
       entry->d_reclen = reclen;
-      ret = 0;
     }
   else
-    {
-      *result = NULL;
-      ret = dirp->errcode;
-    }
+    *result = NULL;
 
   __libc_lock_unlock (dirp->lock);
 
-  return ret;
+  return dp != NULL ? 0 : dirp->errcode;
 }
 
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 7/9] linux: Add __old_readdir64_unlocked
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (5 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 6/9] linux: Add __readdir64_unlocked Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 8/9] linux: Use getdents64 on readdir64 compat implementation Adhemerval Zanella via Libc-alpha
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

And use it __old_readdir64_r.

Checked on i686-linux-gnu.
---
 sysdeps/unix/sysv/linux/olddirent.h   |  2 +
 sysdeps/unix/sysv/linux/readdir64.c   | 21 +++++--
 sysdeps/unix/sysv/linux/readdir64_r.c | 79 ++++++---------------------
 3 files changed, 35 insertions(+), 67 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/olddirent.h b/sysdeps/unix/sysv/linux/olddirent.h
index 2d2559c1e9..b118b0ef96 100644
--- a/sysdeps/unix/sysv/linux/olddirent.h
+++ b/sysdeps/unix/sysv/linux/olddirent.h
@@ -32,6 +32,8 @@ struct __old_dirent64
 /* Now define the internal interfaces.  */
 extern struct __old_dirent64 *__old_readdir64 (DIR *__dirp);
 libc_hidden_proto (__old_readdir64);
+extern struct __old_dirent64 *__old_readdir64_unlocked (DIR *__dirp)
+        attribute_hidden;
 extern int __old_readdir64_r (DIR *__dirp, struct __old_dirent64 *__entry,
 			  struct __old_dirent64 **__result);
 extern __ssize_t __old_getdents64 (int __fd, char *__buf, size_t __nbytes)
diff --git a/sysdeps/unix/sysv/linux/readdir64.c b/sysdeps/unix/sysv/linux/readdir64.c
index a774f7c6e3..ce50067637 100644
--- a/sysdeps/unix/sysv/linux/readdir64.c
+++ b/sysdeps/unix/sysv/linux/readdir64.c
@@ -116,15 +116,11 @@ versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
 
 attribute_compat_text_section
 struct __old_dirent64 *
-__old_readdir64 (DIR *dirp)
+__old_readdir64_unlocked (DIR *dirp)
 {
   struct __old_dirent64 *dp;
   int saved_errno = errno;
 
-#if IS_IN (libc)
-  __libc_lock_lock (dirp->lock);
-#endif
-
   do
     {
       size_t reclen;
@@ -168,6 +164,21 @@ __old_readdir64 (DIR *dirp)
       /* Skip deleted files.  */
     } while (dp->d_ino == 0);
 
+  return dp;
+}
+
+attribute_compat_text_section
+struct __old_dirent64 *
+__old_readdir64 (DIR *dirp)
+{
+  struct __old_dirent64 *dp;
+
+#if IS_IN (libc)
+  __libc_lock_lock (dirp->lock);
+#endif
+
+  dp = __old_readdir64_unlocked (dirp);
+
 #if IS_IN (libc)
   __libc_lock_unlock (dirp->lock);
 #endif
diff --git a/sysdeps/unix/sysv/linux/readdir64_r.c b/sysdeps/unix/sysv/linux/readdir64_r.c
index e5ef7be4c2..133c534a69 100644
--- a/sysdeps/unix/sysv/linux/readdir64_r.c
+++ b/sysdeps/unix/sysv/linux/readdir64_r.c
@@ -91,89 +91,44 @@ __old_readdir64_r (DIR *dirp, struct __old_dirent64 *entry,
 {
   struct __old_dirent64 *dp;
   size_t reclen;
-  const int saved_errno = errno;
-  int ret;
 
   __libc_lock_lock (dirp->lock);
 
-  do
+  while (1)
     {
-      if (dirp->offset >= dirp->size)
-	{
-	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread = dirp->allocation;
-	  ssize_t bytes;
-
-	  maxread = dirp->allocation;
-
-	  bytes = __old_getdents64 (dirp->fd, dirp->data, maxread);
-	  if (bytes <= 0)
-	    {
-	      /* On some systems getdents fails with ENOENT when the
-		 open directory has been rmdir'd already.  POSIX.1
-		 requires that we treat this condition like normal EOF.  */
-	      if (bytes < 0 && errno == ENOENT)
-		{
-		  bytes = 0;
-		  __set_errno (saved_errno);
-		}
-	      if (bytes < 0)
-		dirp->errcode = errno;
-
-	      dp = NULL;
-	      break;
-	    }
-	  dirp->size = (size_t) bytes;
-
-	  /* Reset the offset into the buffer.  */
-	  dirp->offset = 0;
-	}
-
-      dp = (struct __old_dirent64 *) &dirp->data[dirp->offset];
+      dp = __old_readdir64_unlocked (dirp);
+      if (dp == NULL)
+	break;
 
       reclen = dp->d_reclen;
+      if (reclen <= offsetof (struct __old_dirent64, d_name) + NAME_MAX + 1)
+	break;
 
-      dirp->offset += reclen;
-
-      dirp->filepos = dp->d_off;
-
-      if (reclen > offsetof (struct __old_dirent64, d_name) + NAME_MAX + 1)
+      /* The record is very long.  It could still fit into the caller-supplied
+	 buffer if we can skip padding at the end.  */
+      size_t namelen = _D_EXACT_NAMLEN (dp);
+      if (namelen <= NAME_MAX)
 	{
-	  /* The record is very long.  It could still fit into the
-	     caller-supplied buffer if we can skip padding at the
-	     end.  */
-	  size_t namelen = _D_EXACT_NAMLEN (dp);
-	  if (namelen <= NAME_MAX)
-	    reclen = offsetof (struct __old_dirent64, d_name) + namelen + 1;
-	  else
-	    {
-	      /* The name is too long.  Ignore this file.  */
-	      dirp->errcode = ENAMETOOLONG;
-	      dp->d_ino = 0;
-	      continue;
-	    }
+	  reclen = offsetof (struct dirent64, d_name) + namelen + 1;
+	  break;
 	}
 
-      /* Skip deleted and ignored files.  */
+      /* The name is too long.  Ignore this file.  */
+      dirp->errcode = ENAMETOOLONG;
+      dp->d_ino = 0;
     }
-  while (dp->d_ino == 0);
 
   if (dp != NULL)
     {
       *result = memcpy (entry, dp, reclen);
       entry->d_reclen = reclen;
-      ret = 0;
     }
   else
-    {
-      *result = NULL;
-      ret = dirp->errcode;
-    }
+    *result = NULL;
 
   __libc_lock_unlock (dirp->lock);
 
-  return ret;
+  return dp != NULL ? 0 : dirp->errcode;
 }
 
 compat_symbol (libc, __old_readdir64_r, readdir64_r, GLIBC_2_1);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 8/9] linux: Use getdents64 on readdir64 compat implementation
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (6 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 7/9] linux: Add __old_readdir64_unlocked Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-02 17:06 ` [PATCH v2 9/9] dirent: Deprecate getdirentries Adhemerval Zanella via Libc-alpha
  2020-10-04 13:08 ` [PATCH v2 0/9] Fix getdents{64} regression on some FS Dave Flogeras via Libc-alpha
  9 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

It uses a similar strategy from the non-LFS readdir that also
uses getdents64 internally and reserves some space on the allocated
internal DIR buffer to be used as a temporary buffer. The kernel
obtained dirent64 data are copied to the temporary buffer for each
readdir compat call.

It allows to remove __old_getdents64.

Checked on i686-linux-gnu.
---
 sysdeps/unix/sysv/linux/getdents64.c | 95 +---------------------------
 sysdeps/unix/sysv/linux/olddirent.h  |  2 -
 sysdeps/unix/sysv/linux/readdir.h    | 47 +++++++++++++-
 sysdeps/unix/sysv/linux/readdir64.c  | 27 ++++----
 4 files changed, 58 insertions(+), 113 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/getdents64.c b/sysdeps/unix/sysv/linux/getdents64.c
index 75892c2823..f40dfacbfe 100644
--- a/sysdeps/unix/sysv/linux/getdents64.c
+++ b/sysdeps/unix/sysv/linux/getdents64.c
@@ -36,97 +36,4 @@ weak_alias (__getdents64, getdents64)
 
 #if _DIRENT_MATCHES_DIRENT64
 strong_alias (__getdents64, __getdents)
-#else
-# include <shlib-compat.h>
-
-# if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
-#  include <olddirent.h>
-#  include <unistd.h>
-
-static ssize_t
-handle_overflow (int fd, __off64_t offset, ssize_t count)
-{
-  /* If this is the first entry in the buffer, we can report the
-     error.  */
-  if (offset == 0)
-    {
-      __set_errno (EOVERFLOW);
-      return -1;
-    }
-
-  /* Otherwise, seek to the overflowing entry, so that the next call
-     will report the error, and return the data read so far.  */
-  if (__lseek64 (fd, offset, SEEK_SET) != 0)
-    return -1;
-  return count;
-}
-
-ssize_t
-__old_getdents64 (int fd, char *buf, size_t nbytes)
-{
-  /* We do not move the individual directory entries.  This is only
-     possible if the target type (struct __old_dirent64) is smaller
-     than the source type.  */
-  _Static_assert (offsetof (struct __old_dirent64, d_name)
-		  <= offsetof (struct dirent64, d_name),
-		  "__old_dirent64 is larger than dirent64");
-  _Static_assert (__alignof__ (struct __old_dirent64)
-		  <= __alignof__ (struct dirent64),
-		  "alignment of __old_dirent64 is larger than dirent64");
-
-  ssize_t retval = INLINE_SYSCALL_CALL (getdents64, fd, buf, nbytes);
-  if (retval > 0)
-    {
-      /* This is the marker for the first entry.  Offset 0 is reserved
-	 for the first entry (see rewinddir).  Here, we use it as a
-	 marker for the first entry in the buffer.  We never actually
-	 seek to offset 0 because handle_overflow reports the error
-	 directly, so it does not matter that the offset is incorrect
-	 if entries have been read from the descriptor before (so that
-	 the descriptor is not actually at offset 0).  */
-      __off64_t previous_offset = 0;
-
-      char *p = buf;
-      char *end = buf + retval;
-      while (p < end)
-	{
-	  struct dirent64 *source = (struct dirent64 *) p;
-
-	  /* Copy out the fixed-size data.  */
-	  __ino_t ino = source->d_ino;
-	  __off64_t offset = source->d_off;
-	  unsigned int reclen = source->d_reclen;
-	  unsigned char type = source->d_type;
-
-	  /* Check for ino_t overflow.  */
-	  if (__glibc_unlikely (ino != source->d_ino))
-	    return handle_overflow (fd, previous_offset, p - buf);
-
-	  /* Convert to the target layout.  Use a separate struct and
-	     memcpy to side-step aliasing issues.  */
-	  struct __old_dirent64 result;
-	  result.d_ino = ino;
-	  result.d_off = offset;
-	  result.d_reclen = reclen;
-	  result.d_type = type;
-
-	  /* Write the fixed-sized part of the result to the
-	     buffer.  */
-	  size_t result_name_offset = offsetof (struct __old_dirent64, d_name);
-	  memcpy (p, &result, result_name_offset);
-
-	  /* Adjust the position of the name if necessary.  Copy
-	     everything until the end of the record, including the
-	     terminating NUL byte.  */
-	  if (result_name_offset != offsetof (struct dirent64, d_name))
-	    memmove (p + result_name_offset, source->d_name,
-		     reclen - offsetof (struct dirent64, d_name));
-
-	  p += reclen;
-	  previous_offset = offset;
-	}
-     }
-  return retval;
-}
-# endif /* SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)  */
-#endif /* _DIRENT_MATCHES_DIRENT64  */
+#endif
diff --git a/sysdeps/unix/sysv/linux/olddirent.h b/sysdeps/unix/sysv/linux/olddirent.h
index b118b0ef96..63d8e97868 100644
--- a/sysdeps/unix/sysv/linux/olddirent.h
+++ b/sysdeps/unix/sysv/linux/olddirent.h
@@ -36,8 +36,6 @@ extern struct __old_dirent64 *__old_readdir64_unlocked (DIR *__dirp)
         attribute_hidden;
 extern int __old_readdir64_r (DIR *__dirp, struct __old_dirent64 *__entry,
 			  struct __old_dirent64 **__result);
-extern __ssize_t __old_getdents64 (int __fd, char *__buf, size_t __nbytes)
-	attribute_hidden;
 int __old_scandir64 (const char * __dir,
 		     struct __old_dirent64 *** __namelist,
 		     int (*__selector) (const struct __old_dirent64 *),
diff --git a/sysdeps/unix/sysv/linux/readdir.h b/sysdeps/unix/sysv/linux/readdir.h
index 7f6cf3783d..234a7f26e6 100644
--- a/sysdeps/unix/sysv/linux/readdir.h
+++ b/sysdeps/unix/sysv/linux/readdir.h
@@ -21,6 +21,7 @@
 
 #if !_DIRENT_MATCHES_DIRENT64
 # include <dirstream.h>
+# include <olddirent.h>
 
 /* getdents64 is used internally for both LFS and non-LFS implementations.
    The non-LFS interface reserves part of the allocated buffer to return the
@@ -106,9 +107,53 @@ dirstream_ret_entry (struct __dirstream *ds)
 
   return dp;
 }
+
+/* Return the allocated buffer used on LFS compat readdir call.  */
+static inline struct __old_dirent64 *
+dirstream_ret64_compat (struct __dirstream *ds)
+{
+  return (struct __old_dirent64 *) ds->data;
+}
+
+static inline struct __old_dirent64 *
+dirstream_ret_entry64_compat (struct __dirstream *ds)
+{
+  struct dirent64 *dp64 = dirstream_entry (ds);
+  struct __old_dirent64 *dp64_compat = dirstream_ret64_compat (ds);
+
+  dp64_compat->d_ino = dp64->d_ino;
+  if (dp64_compat->d_ino != dp64->d_ino)
+    /* Overflow.  */
+    return NULL;
+
+  dp64_compat->d_off = dp64->d_off;
+
+  const size_t size_diff = (offsetof (struct dirent64, d_name)
+			    - offsetof (struct __old_dirent64, d_name));
+  const size_t alignment = _Alignof (struct __old_dirent64);
+  size_t new_reclen  = (dp64->d_reclen - size_diff + alignment - 1)
+			& ~(alignment - 1);
+  if (new_reclen > return_buffer_size)
+    /* Overflow.  */
+    return NULL;
+
+  /* The compat symbol report the kernel obtained d_reclen, even though
+     it has an incompatible dirent layout.  */
+  dp64_compat->d_reclen = dp64->d_reclen;
+
+  dp64_compat->d_type = dp64->d_type;
+
+  memcpy (dp64_compat->d_name, dp64->d_name,
+	  dp64->d_reclen - offsetof (struct dirent64, d_name));
+
+  ds->offset += dp64->d_reclen;
+  ds->filepos = dp64->d_off;
+
+  return dp64_compat;
+}
+
 #else
 /* No need to reserve an buffer space if dirent has already LFS support.  */
 enum { return_buffer_size = 0 };
 #endif /* _DIRENT_MATCHES_DIRENT64  */
-
 #endif
diff --git a/sysdeps/unix/sysv/linux/readdir64.c b/sysdeps/unix/sysv/linux/readdir64.c
index ce50067637..79a2027fd1 100644
--- a/sysdeps/unix/sysv/linux/readdir64.c
+++ b/sysdeps/unix/sysv/linux/readdir64.c
@@ -112,27 +112,23 @@ weak_alias (__readdir64, readdir64)
 versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
 # endif
 # if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
-#  include <olddirent.h>
+#  include <readdir.h>
+#  include <unistd.h>
 
 attribute_compat_text_section
 struct __old_dirent64 *
 __old_readdir64_unlocked (DIR *dirp)
 {
   struct __old_dirent64 *dp;
-  int saved_errno = errno;
+  const int saved_errno = errno;
 
   do
     {
-      size_t reclen;
-
       if (dirp->offset >= dirp->size)
 	{
 	  /* We've emptied out our buffer.  Refill it.  */
-
-	  size_t maxread = dirp->allocation;
-	  ssize_t bytes;
-
-	  bytes = __old_getdents64 (dirp->fd, dirp->data, maxread);
+	  ssize_t bytes = __getdents64 (dirp->fd, dirstream_data (dirp),
+					dirstream_alloc_size (dirp));
 	  if (bytes <= 0)
 	    {
 	      /* On some systems getdents fails with ENOENT when the
@@ -153,13 +149,12 @@ __old_readdir64_unlocked (DIR *dirp)
 	  dirp->offset = 0;
 	}
 
-      dp = (struct __old_dirent64 *) &dirp->data[dirp->offset];
-
-      reclen = dp->d_reclen;
-
-      dirp->offset += reclen;
-
-      dirp->filepos = dp->d_off;
+      dp = dirstream_ret_entry64_compat (dirp);
+      if (dp == NULL)
+	{
+	  __set_errno (EOVERFLOW);
+	  break;
+	}
 
       /* Skip deleted files.  */
     } while (dp->d_ino == 0);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 9/9] dirent: Deprecate getdirentries
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (7 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 8/9] linux: Use getdents64 on readdir64 compat implementation Adhemerval Zanella via Libc-alpha
@ 2020-10-02 17:06 ` Adhemerval Zanella via Libc-alpha
  2020-10-04 13:08 ` [PATCH v2 0/9] Fix getdents{64} regression on some FS Dave Flogeras via Libc-alpha
  9 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-02 17:06 UTC (permalink / raw)
  To: libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

The interface has some issues:

  1. It is build on top getdents on Linux and requires handling
     non-LFS call using LFS getdents.

  2. It is not wildly used and the non-LFS support is as problematic
     as non-LFS readdir.  glibc only exports the LFS getdents.

  3. It is not a direct replacement over BSD since on some plataform
     its signature has changed (FreeBSD 11, for instance, used to
     set the offset as a 'long' and changed to 'off_t' on version 12).

The idea is to eventually move the symbols to compat ones.
---
 NEWS                             |  3 +++
 dirent/dirent.h                  | 11 +++++++----
 sysdeps/unix/sysv/linux/Makefile |  3 +++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index ce05d05b16..43dab4d519 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,9 @@ Deprecated and removed features, and other changes affecting compatibility:
 * The mallinfo function is marked deprecated.  Callers should call
   mallinfo2 instead.
 
+* The function getdirentries is now deprecated, applications should use
+  either getdents64, readdir64 or readdir.
+
 Changes to build and runtime requirements:
 
   [Add changes to build and runtime requirements here]
diff --git a/dirent/dirent.h b/dirent/dirent.h
index 92d0925047..1e93f2fbcf 100644
--- a/dirent/dirent.h
+++ b/dirent/dirent.h
@@ -348,29 +348,32 @@ extern int alphasort64 (const struct dirent64 **__e1,
 /* Read directory entries from FD into BUF, reading at most NBYTES.
    Reading starts at offset *BASEP, and *BASEP is updated with the new
    position after reading.  Returns the number of bytes read; zero when at
-   end of directory; or -1 for errors.  */
+   end of directory; or -1 for errors.
+   This is deprecated and getdents64 or readdir should be used instead.  */
 # ifndef __USE_FILE_OFFSET64
 extern __ssize_t getdirentries (int __fd, char *__restrict __buf,
 				size_t __nbytes,
 				__off_t *__restrict __basep)
-     __THROW __nonnull ((2, 4));
+     __THROW __nonnull ((2, 4)) __attribute_deprecated__;
 # else
 #  ifdef __REDIRECT
 extern __ssize_t __REDIRECT_NTH (getdirentries,
 				 (int __fd, char *__restrict __buf,
 				  size_t __nbytes,
 				  __off64_t *__restrict __basep),
-				 getdirentries64) __nonnull ((2, 4));
+				 getdirentries64)
+     __THROW __nonnull ((2, 4)) __attribute_deprecated__;
 #  else
 #   define getdirentries getdirentries64
 #  endif
 # endif
 
 # ifdef __USE_LARGEFILE64
+/* This is deprecated and getdents64 or readdir64 should be used instead.  */
 extern __ssize_t getdirentries64 (int __fd, char *__restrict __buf,
 				  size_t __nbytes,
 				  __off64_t *__restrict __basep)
-     __THROW __nonnull ((2, 4));
+     __THROW __nonnull ((2, 4)) __attribute_deprecated__;
 # endif
 #endif /* Use misc.  */
 
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 3bd3106ef9..55b8df59c5 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -260,6 +260,9 @@ ifeq ($(subdir),dirent)
 sysdep_routines += getdirentries getdirentries64
 tests += tst-getdents64
 tests-internal += tst-readdir64-compat
+
+# Avoid the warning for the weak_alias for _DIRENT_MATCHES_DIRENT64
+CFLAGS-getdirentries64.c = -Wno-deprecated-declarations
 endif
 
 ifeq ($(subdir),nis)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/9] Fix getdents{64} regression on some FS
  2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
                   ` (8 preceding siblings ...)
  2020-10-02 17:06 ` [PATCH v2 9/9] dirent: Deprecate getdirentries Adhemerval Zanella via Libc-alpha
@ 2020-10-04 13:08 ` Dave Flogeras via Libc-alpha
  9 siblings, 0 replies; 29+ messages in thread
From: Dave Flogeras via Libc-alpha @ 2020-10-04 13:08 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: James Clarke, libc-alpha, John Paul Adrian Glaubitz

On Fri, Oct 2, 2020 at 2:06 PM Adhemerval Zanella <
adhemerval.zanella@linaro.org> wrote:

> I have reports from both Debian and Gentoo developers that this
> patchset fixes the issues they saw with recent glibc on some
> scenarios (schroot plus qemu) and on some filesystem (ext4).  They
> can correct me with their testing status.
>
>
I can confirm that the initial version of this patch (discussed privately
via email) has been working on Gentoo armv6j and armv7a for me for months.
This includes both testing within a qemu chroot (where the problem was
noticed), and also on the target hardware.  On the target the symptoms do
not show up even unpatched, as it is 32 bit userspace calling a 32 bit
kernel, but I thought I'd mention that I observed no ill effects there with
the patchset.

I would definitely appreciate it if this was eventually back-ported to at
least 2.32, since Gentoo policy likely prevents patching system packages
like glibc without upstream endorsement.  It would be great to have this
fixed, since it has been an issue for over a year.

PS. I'm not a full Gentoo developer, just a longtime member of the
community, and power user.  I will however push these issues through the
proper channels.

Thanks!
Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/9] linux: Move posix dir implementations to Linux
  2020-10-02 17:06 ` [PATCH v2 1/9] linux: Move posix dir implementations to Linux Adhemerval Zanella via Libc-alpha
@ 2020-10-13 15:33   ` Florian Weimer via Libc-alpha
  2020-10-15 14:08     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 15:33 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> This generic implementation already expects a getdents API which
> is Linux specific.  It also allows simplify it by assuming
> _DIRENT_HAVE_D_RECLEN and _DIRENT_HAVE_D_OFF support.
>
> The readdir are also expanded on each required implementation,
> futher fixes and improvements will make parametrize the
> implementation more complex.
>
> Checked on x86_64-linux-gnu, i686-linux-gnu, and with a build
> for all affected ABIs.

This looks okay to me.  It carries forward bug 12165, but that's okay
with me.  (That loop should be removed in a separate patch, although the
code duplication now makes this slightly more work.)

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/9] linux: Simplify opendir buffer allocation
  2020-10-02 17:06 ` [PATCH v2 2/9] linux: Simplify opendir buffer allocation Adhemerval Zanella via Libc-alpha
@ 2020-10-13 15:34   ` Florian Weimer via Libc-alpha
  0 siblings, 0 replies; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 15:34 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> The fallback allocation is removed, so the possible size constraint
> should be analyzed just once; __alloc_dir assumes that 'statp'
> argument is non-null, and the max_buffer_size move to close its
> used.
>
> Checked on x86_64-linux-gnu and i686-linux-gnu.

Looks okay to me.  Thanks.

Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/9] linux: Add __readdir_unlocked
  2020-10-02 17:06 ` [PATCH v2 3/9] linux: Add __readdir_unlocked Adhemerval Zanella via Libc-alpha
@ 2020-10-13 15:43   ` Florian Weimer via Libc-alpha
  2020-10-15 14:10     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 15:43 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> diff --git a/sysdeps/unix/sysv/linux/readdir_r.c b/sysdeps/unix/sysv/linux/readdir_r.c
> index 0069041394..a01d2845a6 100644
> --- a/sysdeps/unix/sysv/linux/readdir_r.c
> +++ b/sysdeps/unix/sysv/linux/readdir_r.c
> @@ -25,89 +25,44 @@ __readdir_r (DIR *dirp, struct dirent *entry, struct dirent **result)

> +      /* The name is too long.  Ignore this file.  */
> +      dirp->errcode = ENAMETOOLONG;
> +      dp->d_ino = 0;

I believe the dp->d_ino assignment is no longer needed.  Rest looks okay
to me.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-02 17:06 ` [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir Adhemerval Zanella via Libc-alpha
@ 2020-10-13 15:59   ` Florian Weimer via Libc-alpha
  2020-10-15 14:25     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 15:59 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> It reserves some space on the allocated internal buffer to be
> used as a the returned dirent struct.  The kernel obtained dirent64
> struct are copied to the temporary buffer on each readdir call.
>
> The overflow test is moved once the dirent64 entry is copied
> to the temporary buffer, and a subsequent readdir will obtain the
> next entry.  The idea is an overflow fails to return the entry on
> readdir, but a next readdir might still obtain the next entry.
> (for filesystem that does not have the concept of sequential d_off,
> such as ext4).

Can you make the translation buffer a separate allocation?  Then you can
resize it to accommodate long names.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050]
  2020-10-02 17:06 ` [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050] Adhemerval Zanella via Libc-alpha
@ 2020-10-13 16:00   ` Florian Weimer via Libc-alpha
  2020-10-15 14:26     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 16:00 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> diff --git a/support/temp_file.c b/support/temp_file.c
> index 277c5e0cf1..98bd235526 100644
> --- a/support/temp_file.c
> +++ b/support/temp_file.c
> @@ -60,14 +60,12 @@ add_temp_file (const char *name)
>  }
>  
>  int
> -create_temp_file (const char *base, char **filename)
> +create_temp_file_in_dir (const char *base, const char *dir, char **filename)
>  {
>    char *fname;
>    int fd;
>  
> -  fname = (char *) xmalloc (strlen (test_dir) + 1 + strlen (base)
> -			    + sizeof ("XXXXXX"));
> -  strcpy (stpcpy (stpcpy (stpcpy (fname, test_dir), "/"), base), "XXXXXX");
> +  fname = xasprintf ("%s/%sXXXXXX", dir, base);
>  
>    fd = mkstemp (fname);
>    if (fd == -1)
> @@ -86,6 +84,12 @@ create_temp_file (const char *base, char **filename)
>    return fd;
>  }
>  
> +int
> +create_temp_file (const char *base, char **filename)
> +{
> +  return create_temp_file_in_dir (base, test_dir, filename);
> +}
> +
>  char *
>  support_create_temp_directory (const char *base)
>  {
> diff --git a/support/temp_file.h b/support/temp_file.h
> index 8b6303a6e4..ac61105428 100644
> --- a/support/temp_file.h
> +++ b/support/temp_file.h
> @@ -32,6 +32,13 @@ void add_temp_file (const char *name);
>     *FILENAME.  */
>  int create_temp_file (const char *base, char **filename);
>  
> +/* Create a temporary file in directory DIR.  Return the opened file
> +   descriptor on success, or -1 on failure.  Write the file name to
> +   *FILENAME if FILENAME is not NULL.  In this case, the caller is
> +   expected to free *FILENAME.  */
> +int create_temp_file_in_dir (const char *base, const char *dir,
> +			     char **filename);
> +
>  /* Create a temporary directory and schedule it for deletion.  BASE is
>     used as a prefix for the unique directory name, which the function
>     returns.  The caller should free this string.  */

This part is okay.  Please commit it separately in case other tests grow
a dependency on it (to help with their backporting).

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/9] linux: Move posix dir implementations to Linux
  2020-10-13 15:33   ` Florian Weimer via Libc-alpha
@ 2020-10-15 14:08     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-15 14:08 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: James Clarke, John Paul Adrian Glaubitz



On 13/10/2020 12:33, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> This generic implementation already expects a getdents API which
>> is Linux specific.  It also allows simplify it by assuming
>> _DIRENT_HAVE_D_RECLEN and _DIRENT_HAVE_D_OFF support.
>>
>> The readdir are also expanded on each required implementation,
>> futher fixes and improvements will make parametrize the
>> implementation more complex.
>>
>> Checked on x86_64-linux-gnu, i686-linux-gnu, and with a build
>> for all affected ABIs.
> 
> This looks okay to me.  It carries forward bug 12165, but that's okay
> with me.  (That loop should be removed in a separate patch, although the
> code duplication now makes this slightly more work.)

I will send a fix for BZ#12165.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/9] linux: Add __readdir_unlocked
  2020-10-13 15:43   ` Florian Weimer via Libc-alpha
@ 2020-10-15 14:10     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-15 14:10 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: James Clarke, John Paul Adrian Glaubitz



On 13/10/2020 12:43, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> diff --git a/sysdeps/unix/sysv/linux/readdir_r.c b/sysdeps/unix/sysv/linux/readdir_r.c
>> index 0069041394..a01d2845a6 100644
>> --- a/sysdeps/unix/sysv/linux/readdir_r.c
>> +++ b/sysdeps/unix/sysv/linux/readdir_r.c
>> @@ -25,89 +25,44 @@ __readdir_r (DIR *dirp, struct dirent *entry, struct dirent **result)
> 
>> +      /* The name is too long.  Ignore this file.  */
>> +      dirp->errcode = ENAMETOOLONG;
>> +      dp->d_ino = 0;
> 
> I believe the dp->d_ino assignment is no longer needed.  Rest looks okay
> to me.

Ack, I removed the assignment. 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-13 15:59   ` Florian Weimer via Libc-alpha
@ 2020-10-15 14:25     ` Adhemerval Zanella via Libc-alpha
  2020-10-19  8:18       ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-15 14:25 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: James Clarke, John Paul Adrian Glaubitz



On 13/10/2020 12:59, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> It reserves some space on the allocated internal buffer to be
>> used as a the returned dirent struct.  The kernel obtained dirent64
>> struct are copied to the temporary buffer on each readdir call.
>>
>> The overflow test is moved once the dirent64 entry is copied
>> to the temporary buffer, and a subsequent readdir will obtain the
>> next entry.  The idea is an overflow fails to return the entry on
>> readdir, but a next readdir might still obtain the next entry.
>> (for filesystem that does not have the concept of sequential d_off,
>> such as ext4).
> 
> Can you make the translation buffer a separate allocation?  Then you can
> resize it to accommodate long names.

I followed the idea of mips64 getdents64 (which came from your suggestion)
and it seems that at least for most majority of filesystems [1], the buffer
size should accommodate the maximum supported file size (ReiserFS seems to
be an outlier, but it also seems that Linux VFS limits the maximum size
to 255 characters).

Using a single allocation make the code slight faster (which not sure if
plays much difference since a lot of memory copy would be required for 
non-LFS) and it does not require to handle the possible reallocation
failure on readdir.

[1] https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits (Not
    sure how exact this reflect the latest Linux version though)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050]
  2020-10-13 16:00   ` Florian Weimer via Libc-alpha
@ 2020-10-15 14:26     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-15 14:26 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: James Clarke, John Paul Adrian Glaubitz



On 13/10/2020 13:00, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> diff --git a/support/temp_file.c b/support/temp_file.c
>> index 277c5e0cf1..98bd235526 100644
>> --- a/support/temp_file.c
>> +++ b/support/temp_file.c
>> @@ -60,14 +60,12 @@ add_temp_file (const char *name)
>>  }
>>  
>>  int
>> -create_temp_file (const char *base, char **filename)
>> +create_temp_file_in_dir (const char *base, const char *dir, char **filename)
>>  {
>>    char *fname;
>>    int fd;
>>  
>> -  fname = (char *) xmalloc (strlen (test_dir) + 1 + strlen (base)
>> -			    + sizeof ("XXXXXX"));
>> -  strcpy (stpcpy (stpcpy (stpcpy (fname, test_dir), "/"), base), "XXXXXX");
>> +  fname = xasprintf ("%s/%sXXXXXX", dir, base);
>>  
>>    fd = mkstemp (fname);
>>    if (fd == -1)
>> @@ -86,6 +84,12 @@ create_temp_file (const char *base, char **filename)
>>    return fd;
>>  }
>>  
>> +int
>> +create_temp_file (const char *base, char **filename)
>> +{
>> +  return create_temp_file_in_dir (base, test_dir, filename);
>> +}
>> +
>>  char *
>>  support_create_temp_directory (const char *base)
>>  {
>> diff --git a/support/temp_file.h b/support/temp_file.h
>> index 8b6303a6e4..ac61105428 100644
>> --- a/support/temp_file.h
>> +++ b/support/temp_file.h
>> @@ -32,6 +32,13 @@ void add_temp_file (const char *name);
>>     *FILENAME.  */
>>  int create_temp_file (const char *base, char **filename);
>>  
>> +/* Create a temporary file in directory DIR.  Return the opened file
>> +   descriptor on success, or -1 on failure.  Write the file name to
>> +   *FILENAME if FILENAME is not NULL.  In this case, the caller is
>> +   expected to free *FILENAME.  */
>> +int create_temp_file_in_dir (const char *base, const char *dir,
>> +			     char **filename);
>> +
>>  /* Create a temporary directory and schedule it for deletion.  BASE is
>>     used as a prefix for the unique directory name, which the function
>>     returns.  The caller should free this string.  */
> 
> This part is okay.  Please commit it separately in case other tests grow
> a dependency on it (to help with their backporting).

Ack, I commit the libsupport as separated patch.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-15 14:25     ` Adhemerval Zanella via Libc-alpha
@ 2020-10-19  8:18       ` Florian Weimer via Libc-alpha
  2020-10-19 20:00         ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-19  8:18 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz

* Adhemerval Zanella:

> Using a single allocation make the code slight faster (which not sure if
> plays much difference since a lot of memory copy would be required for 
> non-LFS) and it does not require to handle the possible reallocation
> failure on readdir.

But you'd still have to handle the case where name is longer than the
allocation, to avoid a buffer overflow.  So an error case still exists.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-19  8:18       ` Florian Weimer via Libc-alpha
@ 2020-10-19 20:00         ` Adhemerval Zanella via Libc-alpha
  2020-10-19 20:50           ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-19 20:00 UTC (permalink / raw)
  To: Florian Weimer
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz



On 19/10/2020 05:18, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> Using a single allocation make the code slight faster (which not sure if
>> plays much difference since a lot of memory copy would be required for 
>> non-LFS) and it does not require to handle the possible reallocation
>> failure on readdir.
> 
> But you'd still have to handle the case where name is longer than the
> allocation, to avoid a buffer overflow.  So an error case still exists.
> 
> Thanks,
> Florian

The dirstream_ret_entry already handles it by returning EOVERFLOW for such
case.  Also keep in mind this limitation is only for the old non-LFS
interface, for the LFS one it will use all the available allocated buffer
created by the opendir (it still have the 32kb limit for getdents64 call).
I can add a more complex buffer handling to resize the auxiliary buffer,
but I am not sure it really pays of the complexity.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-19 20:00         ` Adhemerval Zanella via Libc-alpha
@ 2020-10-19 20:50           ` Florian Weimer via Libc-alpha
  2020-10-19 21:09             ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-19 20:50 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz

* Adhemerval Zanella:

> On 19/10/2020 05:18, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> Using a single allocation make the code slight faster (which not sure if
>>> plays much difference since a lot of memory copy would be required for 
>>> non-LFS) and it does not require to handle the possible reallocation
>>> failure on readdir.
>> 
>> But you'd still have to handle the case where name is longer than the
>> allocation, to avoid a buffer overflow.  So an error case still exists.
>> 
>> Thanks,
>> Florian
>
> The dirstream_ret_entry already handles it by returning EOVERFLOW for such
> case.  Also keep in mind this limitation is only for the old non-LFS
> interface, for the LFS one it will use all the available allocated buffer
> created by the opendir (it still have the 32kb limit for getdents64 call).
> I can add a more complex buffer handling to resize the auxiliary buffer,
> but I am not sure it really pays of the complexity.

Hmm.  The error code should be ENAMETOOLONG.  Ideally, it should be
delayed until the end of the stream, so that the rest of the directory
can be listed (similar to what we do for readdir_r).  It's probably okay
to report an ENOMEM failure immediately.

Anyway, I'll look at the overflow check and see if we can move this
forward, and then revisit the translation buffer allocation in a
separate patch.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-19 20:50           ` Florian Weimer via Libc-alpha
@ 2020-10-19 21:09             ` Adhemerval Zanella via Libc-alpha
  2020-10-20  7:38               ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-19 21:09 UTC (permalink / raw)
  To: Florian Weimer
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz



On 19/10/2020 17:50, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 19/10/2020 05:18, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>> Using a single allocation make the code slight faster (which not sure if
>>>> plays much difference since a lot of memory copy would be required for 
>>>> non-LFS) and it does not require to handle the possible reallocation
>>>> failure on readdir.
>>>
>>> But you'd still have to handle the case where name is longer than the
>>> allocation, to avoid a buffer overflow.  So an error case still exists.
>>>
>>> Thanks,
>>> Florian
>>
>> The dirstream_ret_entry already handles it by returning EOVERFLOW for such
>> case.  Also keep in mind this limitation is only for the old non-LFS
>> interface, for the LFS one it will use all the available allocated buffer
>> created by the opendir (it still have the 32kb limit for getdents64 call).
>> I can add a more complex buffer handling to resize the auxiliary buffer,
>> but I am not sure it really pays of the complexity.
> 
> Hmm.  The error code should be ENAMETOOLONG.  Ideally, it should be
> delayed until the end of the stream, so that the rest of the directory
> can be listed (similar to what we do for readdir_r).  It's probably okay
> to report an ENOMEM failure immediately.

ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
sure I understood by 'delayed', it it report on the readdir call, but
the next entry will still be reported in the next readdir call.  The
construction such as won't work:

  while (readdir (dp) != NULL)
    {
      [...]
    }

But this will work as intended:

  while (1)
    {
      errno = 0;
      struct dirent *entry = readdir (dp);
      if (entry == NULL)
	{
	  if (errno == ENAMETOOLONG)
	    continue;
	  break;
	}
    }

Even if go to buffer resize, application will still need to handle
ENOMEM so I am not sure if using separated buffer is really an
improvement here (at least on trying to adequate usual way to call
opendir functions).


> 
> Anyway, I'll look at the overflow check and see if we can move this
> forward, and then revisit the translation buffer allocation in a
> separate patch.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-19 21:09             ` Adhemerval Zanella via Libc-alpha
@ 2020-10-20  7:38               ` Florian Weimer via Libc-alpha
  2020-10-20 12:05                 ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-20  7:38 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha; +Cc: James Clarke, John Paul Adrian Glaubitz

* Adhemerval Zanella via Libc-alpha:

> ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
> sure I understood by 'delayed', it it report on the readdir call, but
> the next entry will still be reported in the next readdir call.  The
> construction such as won't work:
>
>   while (readdir (dp) != NULL)
>     {
>       [...]
>     }
>
> But this will work as intended:
>
>   while (1)
>     {
>       errno = 0;
>       struct dirent *entry = readdir (dp);
>       if (entry == NULL)
> 	{
> 	  if (errno == ENAMETOOLONG)
> 	    continue;
> 	  break;
> 	}
>     }

I think it's unreasonable to expect that this pattern will work.  It
assumes that readdir updates dp despite the error.  In other
implementations, this may produce an endless loop.  This is why for
readdir_r, we record the ENAMETOOLONG error as pending, and report it
only at the end.  (We should do the same for EOVERFLOW errors.)

> Even if go to buffer resize, application will still need to handle
> ENOMEM so I am not sure if using separated buffer is really an
> improvement here (at least on trying to adequate usual way to call
> opendir functions).

ENOMEM is a temporary error, ENAMETOOLONG depends on file system content
and may be much more difficult to resolve.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-20  7:38               ` Florian Weimer via Libc-alpha
@ 2020-10-20 12:05                 ` Adhemerval Zanella via Libc-alpha
  2020-10-20 12:35                   ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-20 12:05 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: James Clarke, John Paul Adrian Glaubitz



On 20/10/2020 04:38, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
>> sure I understood by 'delayed', it it report on the readdir call, but
>> the next entry will still be reported in the next readdir call.  The
>> construction such as won't work:
>>
>>   while (readdir (dp) != NULL)
>>     {
>>       [...]
>>     }
>>
>> But this will work as intended:
>>
>>   while (1)
>>     {
>>       errno = 0;
>>       struct dirent *entry = readdir (dp);
>>       if (entry == NULL)
>> 	{
>> 	  if (errno == ENAMETOOLONG)
>> 	    continue;
>> 	  break;
>> 	}
>>     }
> 
> I think it's unreasonable to expect that this pattern will work.  It
> assumes that readdir updates dp despite the error.  In other
> implementations, this may produce an endless loop.  This is why for
> readdir_r, we record the ENAMETOOLONG error as pending, and report it
> only at the end.  (We should do the same for EOVERFLOW errors.)

This is what POSIX states as the way to check for errors:

  "Applications wishing to check for error situations should set errno 
   to 0 before calling readdir(). If errno is set to non-zero on return, 
   an error occurred."

What the standard is not clear is whether the position as the directory 
stream is updated in a case of failure.  I think it is a fair assumption
since it also states that 'When the end of the directory is encountered, 
a null pointer shall be returned and errno is not changed.'.  So there
is a distinction between end of directory stream and a failure (in the
example above checking for 'errno == 0' should indicate the end of the
stream).

And it also defines EOVERFLOW as possible transient error.  It
is possible to do for readdir_r because the function return is different
than the returned 'direct' itself, for readdir what we have is the
'errno' to signal such failure.

> 
>> Even if go to buffer resize, application will still need to handle
>> ENOMEM so I am not sure if using separated buffer is really an
>> improvement here (at least on trying to adequate usual way to call
>> opendir functions).
> 
> ENOMEM is a temporary error, ENAMETOOLONG depends on file system content
> and may be much more difficult to resolve.

Even with ENOMEM, application will require to actually handle a possible
readdir failure and acts accordingly (either by retrying or skip the
file).  But I consider failing with ENOMEM in fact much more confusing,
it exposes internal implementation that needs to be handled by the
application (the internal memory reallocation), where ENAMETOOLONG is
indeed a system limitation that can be resolved by using LFS support.

That's why this limitation is only for non-LFS interface.  This whole fix
is a best effort to a deprecated interface.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-20 12:05                 ` Adhemerval Zanella via Libc-alpha
@ 2020-10-20 12:35                   ` Florian Weimer via Libc-alpha
  2020-10-20 14:09                     ` Adhemerval Zanella via Libc-alpha
  2020-10-20 17:42                     ` Adhemerval Zanella via Libc-alpha
  0 siblings, 2 replies; 29+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-20 12:35 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz

* Adhemerval Zanella:

> On 20/10/2020 04:38, Florian Weimer wrote:
>> * Adhemerval Zanella via Libc-alpha:
>> 
>>> ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
>>> sure I understood by 'delayed', it it report on the readdir call, but
>>> the next entry will still be reported in the next readdir call.  The
>>> construction such as won't work:
>>>
>>>   while (readdir (dp) != NULL)
>>>     {
>>>       [...]
>>>     }
>>>
>>> But this will work as intended:
>>>
>>>   while (1)
>>>     {
>>>       errno = 0;
>>>       struct dirent *entry = readdir (dp);
>>>       if (entry == NULL)
>>> 	{
>>> 	  if (errno == ENAMETOOLONG)
>>> 	    continue;
>>> 	  break;
>>> 	}
>>>     }
>> 
>> I think it's unreasonable to expect that this pattern will work.  Itg
>> assumes that readdir updates dp despite the error.  In other
>> implementations, this may produce an endless loop.  This is why for
>> readdir_r, we record the ENAMETOOLONG error as pending, and report it
>> only at the end.  (We should do the same for EOVERFLOW errors.)
>
> This is what POSIX states as the way to check for errors:
>
>   "Applications wishing to check for error situations should set errno 
>    to 0 before calling readdir(). If errno is set to non-zero on return, 
>    an error occurred."
>
> What the standard is not clear is whether the position as the directory 
> stream is updated in a case of failure.

Yes, that was my concern.  I found this in POSIX (in 2.3 Error Numbers):

| If an error condition is detected, the action requested may have been
| partially performed, unless otherwise stated.

But in general we would be rather concerned if a regular read or write
on a file descriptor failed *and* still updated the file offset.  Not
sure about directory streams.

> I think it is a fair assumption
> since it also states that 'When the end of the directory is encountered, 
> a null pointer shall be returned and errno is not changed.'.  So there
> is a distinction between end of directory stream and a failure (in the
> example above checking for 'errno == 0' should indicate the end of the
> stream).

The question is whether you are supposed to be able to continue reading
and encounter errno == 0 eventually.  In general, this is simply not
true, consider EIO or EUCLEAN.  This is why I went with delayed error
reporting for readdir_r: it is simply not reasonable to expect that
applications continue calling readddir_r or readdir after the first
error because it's not clear how to can avoid an endless loop (even
assuming that the failing readdir call somehow advanced the read
pointer).

I hope this explains my point of view.

> And it also defines EOVERFLOW as possible transient error.  It
> is possible to do for readdir_r because the function return is different
> than the returned 'direct' itself, for readdir what we have is the
> 'errno' to signal such failure.

I don't understand this point.  The approach to detect EOF vs error is
different for readdir_r, but the question whether continue after the
first error is equally relevant.

>>> Even if go to buffer resize, application will still need to handle
>>> ENOMEM so I am not sure if using separated buffer is really an
>>> improvement here (at least on trying to adequate usual way to call
>>> opendir functions).
>> 
>> ENOMEM is a temporary error, ENAMETOOLONG depends on file system content
>> and may be much more difficult to resolve.
>
> Even with ENOMEM, application will require to actually handle a possible
> readdir failure and acts accordingly (either by retrying or skip the
> file).  But I consider failing with ENOMEM in fact much more confusing,
> it exposes internal implementation that needs to be handled by the
> application (the internal memory reallocation), where ENAMETOOLONG is
> indeed a system limitation that can be resolved by using LFS support.

I think readdir can already fail with ENOMEM if the error comes from the
kernel, for example from ext4_ind_map_blocks and ext4_alloc_branch.  So
adding another corner case where ENOMEM can result does not seem
egregiously bad to me.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-20 12:35                   ` Florian Weimer via Libc-alpha
@ 2020-10-20 14:09                     ` Adhemerval Zanella via Libc-alpha
  2020-10-20 17:42                     ` Adhemerval Zanella via Libc-alpha
  1 sibling, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-20 14:09 UTC (permalink / raw)
  To: Florian Weimer
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz



On 20/10/2020 09:35, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 20/10/2020 04:38, Florian Weimer wrote:
>>> * Adhemerval Zanella via Libc-alpha:
>>>
>>>> ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
>>>> sure I understood by 'delayed', it it report on the readdir call, but
>>>> the next entry will still be reported in the next readdir call.  The
>>>> construction such as won't work:
>>>>
>>>>   while (readdir (dp) != NULL)
>>>>     {
>>>>       [...]
>>>>     }
>>>>
>>>> But this will work as intended:
>>>>
>>>>   while (1)
>>>>     {
>>>>       errno = 0;
>>>>       struct dirent *entry = readdir (dp);
>>>>       if (entry == NULL)
>>>> 	{
>>>> 	  if (errno == ENAMETOOLONG)
>>>> 	    continue;
>>>> 	  break;
>>>> 	}
>>>>     }
>>>
>>> I think it's unreasonable to expect that this pattern will work.  Itg
>>> assumes that readdir updates dp despite the error.  In other
>>> implementations, this may produce an endless loop.  This is why for
>>> readdir_r, we record the ENAMETOOLONG error as pending, and report it
>>> only at the end.  (We should do the same for EOVERFLOW errors.)
>>
>> This is what POSIX states as the way to check for errors:
>>
>>   "Applications wishing to check for error situations should set errno 
>>    to 0 before calling readdir(). If errno is set to non-zero on return, 
>>    an error occurred."
>>
>> What the standard is not clear is whether the position as the directory 
>> stream is updated in a case of failure.
> 
> Yes, that was my concern.  I found this in POSIX (in 2.3 Error Numbers):
> 
> | If an error condition is detected, the action requested may have been
> | partially performed, unless otherwise stated.
> 
> But in general we would be rather concerned if a regular read or write
> on a file descriptor failed *and* still updated the file offset.  Not
> sure about directory streams.

I think the snippet you stated does not really prevent us to return an
empty entry with a transient error (ENAMETOOLONG), since it is clear
different than entries read error for getdents itself (EINVAL, ENOENT).

Former mean that only the referred entry could not be obtained, where
latter means that either there is a EOF (if the directory has been
removed) or the buffer is too small.  For latter we might still try to
resize the internal buffer to accommodate the required size, but this
incur in the memory failure possible error nonetheless.

> 
>> I think it is a fair assumption
>> since it also states that 'When the end of the directory is encountered, 
>> a null pointer shall be returned and errno is not changed.'.  So there
>> is a distinction between end of directory stream and a failure (in the
>> example above checking for 'errno == 0' should indicate the end of the
>> stream).
> 
> The question is whether you are supposed to be able to continue reading
> and encounter errno == 0 eventually.  In general, this is simply not
> true, consider EIO or EUCLEAN.  This is why I went with delayed error
> reporting for readdir_r: it is simply not reasonable to expect that
> applications continue calling readddir_r or readdir after the first
> error because it's not clear how to can avoid an endless loop (even
> assuming that the failing readdir call somehow advanced the read
> pointer).
> 
> I hope this explains my point of view.

I understood your point, however we are constraint by opendir interface.
Different than readdir_r, for readdir we can't signal that some error
has occurred in obtained the entry while returning the partial entry
information (for the case of ENAMETOOLONG for instance).

We can just skip the ENAMETOOLONG entries is set the errno only for EOF, 
but even  with this approach we might encounter different errors (due 
getdents failure for instance) which might shadow this specific interface
error glibc might trigger.

> 
>> And it also defines EOVERFLOW as possible transient error.  It
>> is possible to do for readdir_r because the function return is different
>> than the returned 'direct' itself, for readdir what we have is the
>> 'errno' to signal such failure.
> 
> I don't understand this point.  The approach to detect EOF vs error is
> different for readdir_r, but the question whether continue after the
> first error is equally relevant.

My point is if we decide to ignore entries with ENAMETOOLONG, how to signal 
the readdir caller this has happened?

> 
>>>> Even if go to buffer resize, application will still need to handle
>>>> ENOMEM so I am not sure if using separated buffer is really an
>>>> improvement here (at least on trying to adequate usual way to call
>>>> opendir functions).
>>>
>>> ENOMEM is a temporary error, ENAMETOOLONG depends on file system content
>>> and may be much more difficult to resolve.
>>
>> Even with ENOMEM, application will require to actually handle a possible
>> readdir failure and acts accordingly (either by retrying or skip the
>> file).  But I consider failing with ENOMEM in fact much more confusing,
>> it exposes internal implementation that needs to be handled by the
>> application (the internal memory reallocation), where ENAMETOOLONG is
>> indeed a system limitation that can be resolved by using LFS support.
> 
> I think readdir can already fail with ENOMEM if the error comes from the
> kernel, for example from ext4_ind_map_blocks and ext4_alloc_branch.  So
> adding another corner case where ENOMEM can result does not seem
> egregiously bad to me.

Is it a possible error for getdents? I couldn't really track it, but I
haven't dig into kernel code that deep. In any case it would be good
to update the getdents implementation.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir
  2020-10-20 12:35                   ` Florian Weimer via Libc-alpha
  2020-10-20 14:09                     ` Adhemerval Zanella via Libc-alpha
@ 2020-10-20 17:42                     ` Adhemerval Zanella via Libc-alpha
  1 sibling, 0 replies; 29+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-10-20 17:42 UTC (permalink / raw)
  To: Florian Weimer
  Cc: James Clarke, Adhemerval Zanella via Libc-alpha,
	John Paul Adrian Glaubitz



On 20/10/2020 09:35, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 20/10/2020 04:38, Florian Weimer wrote:
>>> * Adhemerval Zanella via Libc-alpha:
>>>
>>>> ENAMETOOLONG does make more sense, I have fixed it locally.  I am not
>>>> sure I understood by 'delayed', it it report on the readdir call, but
>>>> the next entry will still be reported in the next readdir call.  The
>>>> construction such as won't work:
>>>>
>>>>   while (readdir (dp) != NULL)
>>>>     {
>>>>       [...]
>>>>     }
>>>>
>>>> But this will work as intended:
>>>>
>>>>   while (1)
>>>>     {
>>>>       errno = 0;
>>>>       struct dirent *entry = readdir (dp);
>>>>       if (entry == NULL)
>>>> 	{
>>>> 	  if (errno == ENAMETOOLONG)
>>>> 	    continue;
>>>> 	  break;
>>>> 	}
>>>>     }
>>>
>>> I think it's unreasonable to expect that this pattern will work.  Itg
>>> assumes that readdir updates dp despite the error.  In other
>>> implementations, this may produce an endless loop.  This is why for
>>> readdir_r, we record the ENAMETOOLONG error as pending, and report it
>>> only at the end.  (We should do the same for EOVERFLOW errors.)
>>
>> This is what POSIX states as the way to check for errors:
>>
>>   "Applications wishing to check for error situations should set errno 
>>    to 0 before calling readdir(). If errno is set to non-zero on return, 
>>    an error occurred."
>>
>> What the standard is not clear is whether the position as the directory 
>> stream is updated in a case of failure.
> 
> Yes, that was my concern.  I found this in POSIX (in 2.3 Error Numbers):
> 
> | If an error condition is detected, the action requested may have been
> | partially performed, unless otherwise stated.
> 
> But in general we would be rather concerned if a regular read or write
> on a file descriptor failed *and* still updated the file offset.  Not
> sure about directory streams.
> 
>> I think it is a fair assumption
>> since it also states that 'When the end of the directory is encountered, 
>> a null pointer shall be returned and errno is not changed.'.  So there
>> is a distinction between end of directory stream and a failure (in the
>> example above checking for 'errno == 0' should indicate the end of the
>> stream).
> 
> The question is whether you are supposed to be able to continue reading
> and encounter errno == 0 eventually.  In general, this is simply not
> true, consider EIO or EUCLEAN.  This is why I went with delayed error
> reporting for readdir_r: it is simply not reasonable to expect that
> applications continue calling readddir_r or readdir after the first
> error because it's not clear how to can avoid an endless loop (even
> assuming that the failing readdir call somehow advanced the read
> pointer).
> 
> I hope this explains my point of view.

Thinking twice I think a resizable buffer does make sense in this
situation.  I will send an updated set with this modification. 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2020-10-20 17:42 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-02 17:06 [PATCH v2 0/9] Fix getdents{64} regression on some FS Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 1/9] linux: Move posix dir implementations to Linux Adhemerval Zanella via Libc-alpha
2020-10-13 15:33   ` Florian Weimer via Libc-alpha
2020-10-15 14:08     ` Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 2/9] linux: Simplify opendir buffer allocation Adhemerval Zanella via Libc-alpha
2020-10-13 15:34   ` Florian Weimer via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 3/9] linux: Add __readdir_unlocked Adhemerval Zanella via Libc-alpha
2020-10-13 15:43   ` Florian Weimer via Libc-alpha
2020-10-15 14:10     ` Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 4/9] linux: Use getdents64 on non-LFS readdir Adhemerval Zanella via Libc-alpha
2020-10-13 15:59   ` Florian Weimer via Libc-alpha
2020-10-15 14:25     ` Adhemerval Zanella via Libc-alpha
2020-10-19  8:18       ` Florian Weimer via Libc-alpha
2020-10-19 20:00         ` Adhemerval Zanella via Libc-alpha
2020-10-19 20:50           ` Florian Weimer via Libc-alpha
2020-10-19 21:09             ` Adhemerval Zanella via Libc-alpha
2020-10-20  7:38               ` Florian Weimer via Libc-alpha
2020-10-20 12:05                 ` Adhemerval Zanella via Libc-alpha
2020-10-20 12:35                   ` Florian Weimer via Libc-alpha
2020-10-20 14:09                     ` Adhemerval Zanella via Libc-alpha
2020-10-20 17:42                     ` Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 5/9] linux: Set internal DIR filepos as off64_t [BZ #23960, BZ #24050] Adhemerval Zanella via Libc-alpha
2020-10-13 16:00   ` Florian Weimer via Libc-alpha
2020-10-15 14:26     ` Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 6/9] linux: Add __readdir64_unlocked Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 7/9] linux: Add __old_readdir64_unlocked Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 8/9] linux: Use getdents64 on readdir64 compat implementation Adhemerval Zanella via Libc-alpha
2020-10-02 17:06 ` [PATCH v2 9/9] dirent: Deprecate getdirentries Adhemerval Zanella via Libc-alpha
2020-10-04 13:08 ` [PATCH v2 0/9] Fix getdents{64} regression on some FS Dave Flogeras via Libc-alpha

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).