bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
* [PATCH 1/2] copy-file-range: new module
@ 2019-06-05  6:09 Paul Eggert
  2019-06-05  6:09 ` [PATCH 2/2] copy-file: prefer copy_file_range Paul Eggert
  2019-06-05 13:54 ` [PATCH 1/2] copy-file-range: new module Florian Weimer
  0 siblings, 2 replies; 9+ messages in thread
From: Paul Eggert @ 2019-06-05  6:09 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Paul Eggert

* MODULES.html.sh: Add copy-file-range.
* lib/copy-file-range.c, m4/copy-file-range.m4:
* modules/copy-file-range: New files.
* lib/unistd.in.h (copy_file_range): Declare.
* m4/unistd_h.m4 (gl_UNISTD_H_DEFAULTS):
Set up GNULIB_COPY_FILE_RANGE and HAVE_COPY_FILE_RANGE.
* modules/unistd (unistd.h): Substitute them.
---
 ChangeLog               |  11 +++
 MODULES.html.sh         |   1 +
 lib/copy-file-range.c   | 165 ++++++++++++++++++++++++++++++++++++++++
 lib/unistd.in.h         |  18 +++++
 m4/copy-file-range.m4   |  18 +++++
 m4/unistd_h.m4          |   4 +-
 modules/copy-file-range |  33 ++++++++
 modules/unistd          |   2 +
 8 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 lib/copy-file-range.c
 create mode 100644 m4/copy-file-range.m4
 create mode 100644 modules/copy-file-range

diff --git a/ChangeLog b/ChangeLog
index 5e1e69ce3..9ff35e9c1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2019-06-04  Paul Eggert  <eggert@cs.ucla.edu>
+
+	copy-file-range: new module
+	* MODULES.html.sh: Add copy-file-range.
+	* lib/copy-file-range.c, m4/copy-file-range.m4:
+	* modules/copy-file-range: New files.
+	* lib/unistd.in.h (copy_file_range): Declare.
+	* m4/unistd_h.m4 (gl_UNISTD_H_DEFAULTS):
+	Set up GNULIB_COPY_FILE_RANGE and HAVE_COPY_FILE_RANGE.
+	* modules/unistd (unistd.h): Substitute them.
+
 2019-05-28  Bruno Haible  <bruno@clisp.org>
 
 	binary-io: Attempted use of O_BINARY on consoles no longer fails.
diff --git a/MODULES.html.sh b/MODULES.html.sh
index 78b6d4918..80f0c2973 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2735,6 +2735,7 @@ func_all_modules ()
 
   func_begin_table
   func_module binary-io
+  func_module copy-file-range
   func_module dup3
   func_module fcntl-safer
   func_module fd-safer-flag
diff --git a/lib/copy-file-range.c b/lib/copy-file-range.c
new file mode 100644
index 000000000..a1448d0d9
--- /dev/null
+++ b/lib/copy-file-range.c
@@ -0,0 +1,165 @@
+/* Emulation of copy_file_range.
+   Copyright 2017-2019 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+/* This file is adapted from glibc io/copy_file_range-compat.c, with a
+   small number of changes to port to non-glibc platforms.  */
+
+#include <libc-config.h>
+
+/* The following macros should be defined:
+
+   COPY_FILE_RANGE_DECL   Declaration specifiers for the function below.
+   COPY_FILE_RANGE        Name of the function to define.  */
+
+#define COPY_FILE_RANGE_DECL
+#define COPY_FILE_RANGE copy_file_range
+
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+COPY_FILE_RANGE_DECL
+ssize_t
+COPY_FILE_RANGE (int infd, off_t *pinoff,
+                 int outfd, off_t *poutoff,
+                 size_t length, unsigned int flags)
+{
+  if (flags != 0)
+    {
+      __set_errno (EINVAL);
+      return -1;
+    }
+
+  {
+    struct stat instat;
+    struct stat outstat;
+    if (fstat (infd, &instat) != 0 || fstat (outfd, &outstat) != 0)
+      return -1;
+    if (S_ISDIR (instat.st_mode) || S_ISDIR (outstat.st_mode))
+      {
+        __set_errno (EISDIR);
+        return -1;
+      }
+    if (!S_ISREG (instat.st_mode) || !S_ISREG (outstat.st_mode))
+      {
+        /* We need a regular input file so that the we can seek
+           backwards in case of a write failure.  */
+        __set_errno (EINVAL);
+        return -1;
+      }
+    if (instat.st_dev != outstat.st_dev)
+      {
+        /* Cross-device copies are not supported.  */
+        __set_errno (EXDEV);
+        return -1;
+      }
+  }
+
+  /* The output descriptor must not have O_APPEND set.  */
+  {
+    int flags = fcntl (outfd, F_GETFL);
+    if (flags & O_APPEND)
+      {
+        __set_errno (EBADF);
+        return -1;
+      }
+  }
+
+  /* Avoid an overflow in the result.  */
+  if (length > SSIZE_MAX)
+    length = SSIZE_MAX;
+
+  /* Main copying loop.  The buffer size is arbitrary and is a
+     trade-off between stack size consumption, cache usage, and
+     amortization of system call overhead.  */
+  size_t copied = 0;
+  char buf[8192];
+  while (length > 0)
+    {
+      size_t to_read = length;
+      if (to_read > sizeof (buf))
+        to_read = sizeof (buf);
+
+      /* Fill the buffer.  */
+      ssize_t read_count;
+      if (pinoff == NULL)
+        read_count = read (infd, buf, to_read);
+      else
+        read_count = pread (infd, buf, to_read, *pinoff);
+      if (read_count == 0)
+        /* End of file reached prematurely.  */
+        return copied;
+      if (read_count < 0)
+        {
+          if (copied > 0)
+            /* Report the number of bytes copied so far.  */
+            return copied;
+          return -1;
+        }
+      if (pinoff != NULL)
+        *pinoff += read_count;
+
+      /* Write the buffer part which was read to the destination.  */
+      char *end = buf + read_count;
+      for (char *p = buf; p < end; )
+        {
+          ssize_t write_count;
+          if (poutoff == NULL)
+            write_count = write (outfd, p, end - p);
+          else
+            write_count = pwrite (outfd, p, end - p, *poutoff);
+          if (write_count < 0)
+            {
+              /* Adjust the input read position to match what we have
+                 written, so that the caller can pick up after the
+                 error.  */
+              size_t written = p - buf;
+              /* NB: This needs to be signed so that we can form the
+                 negative value below.  */
+              ssize_t overread = read_count - written;
+              if (pinoff == NULL)
+                {
+                  if (overread > 0)
+                    {
+                      /* We are on an error recovery path, so we
+                         cannot deal with failure here.  */
+                      int save_errno = errno;
+                      (void) lseek (infd, -overread, SEEK_CUR);
+                      __set_errno (save_errno);
+                    }
+                }
+              else /* pinoff != NULL */
+                *pinoff -= overread;
+
+              if (copied + written > 0)
+                /* Report the number of bytes copied so far.  */
+                return copied + written;
+              return -1;
+            }
+          p += write_count;
+          if (poutoff != NULL)
+            *poutoff += write_count;
+        } /* Write loop.  */
+
+      copied += read_count;
+      length -= read_count;
+    }
+  return copied;
+}
diff --git a/lib/unistd.in.h b/lib/unistd.in.h
index 0ff7396b3..9ffb2e990 100644
--- a/lib/unistd.in.h
+++ b/lib/unistd.in.h
@@ -331,6 +331,24 @@ _GL_WARN_ON_USE (close, "close does not portably work on sockets - "
 #endif
 
 
+#if @GNULIB_COPY_FILE_RANGE@
+# if !@HAVE_COPY_FILE_RANGE@
+_GL_FUNCDECL_SYS (copy_file_range, ssize_t, (int ifd, off_t *ipos,
+                                             int ofd, off_t *opos,
+                                             size_t len, unsigned flags));
+_GL_CXXALIAS_SYS (copy_file_range, ssize_t, (int ifd, off_t *ipos,
+                                             int ofd, off_t *opos,
+                                             size_t len, unsigned flags));
+# endif
+_GL_CXXALIASWARN (copy_file_range);
+#elif defined GNULIB_POSIXCHECK
+/* Assume copy_file_range is always declared.  */
+_GL_WARN_ON_USE (copy_file_range,
+                 "copy_file_range is unportable - "
+                 "use gnulib module copy_file_range for portability");
+#endif
+
+
 #if @GNULIB_DUP@
 # if @REPLACE_DUP@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
diff --git a/m4/copy-file-range.m4 b/m4/copy-file-range.m4
new file mode 100644
index 000000000..edd2c4aed
--- /dev/null
+++ b/m4/copy-file-range.m4
@@ -0,0 +1,18 @@
+# copy-file-range.m4
+dnl Copyright 2019 Free Software Foundation, Inc.
+dnl This file is free software; the Free Software Foundation
+dnl gives unlimited permission to copy and/or distribute it,
+dnl with or without modifications, as long as this notice is preserved.
+
+AC_DEFUN([gl_FUNC_COPY_FILE_RANGE],
+[
+  AC_REQUIRE([gl_UNISTD_H_DEFAULTS])
+
+  dnl Persuade glibc <unistd.h> to declare copy_file_range.
+  AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
+
+  AC_CHECK_FUNCS_ONCE([copy_file_range])
+  if test $ac_cv_func_copy_file_range != yes; then
+    HAVE_COPY_FILE_RANGE=0
+  fi
+])
diff --git a/m4/unistd_h.m4 b/m4/unistd_h.m4
index a04055d2a..a3b3905f8 100644
--- a/m4/unistd_h.m4
+++ b/m4/unistd_h.m4
@@ -1,4 +1,4 @@
-# unistd_h.m4 serial 74
+# unistd_h.m4 serial 75
 dnl Copyright (C) 2006-2019 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -64,6 +64,7 @@ AC_DEFUN([gl_UNISTD_H_DEFAULTS],
   GNULIB_CHDIR=0;                AC_SUBST([GNULIB_CHDIR])
   GNULIB_CHOWN=0;                AC_SUBST([GNULIB_CHOWN])
   GNULIB_CLOSE=0;                AC_SUBST([GNULIB_CLOSE])
+  GNULIB_COPY_FILE_RANGE=0;      AC_SUBST([GNULIB_COPY_FILE_RANGE])
   GNULIB_DUP=0;                  AC_SUBST([GNULIB_DUP])
   GNULIB_DUP2=0;                 AC_SUBST([GNULIB_DUP2])
   GNULIB_DUP3=0;                 AC_SUBST([GNULIB_DUP3])
@@ -113,6 +114,7 @@ AC_DEFUN([gl_UNISTD_H_DEFAULTS],
   GNULIB_WRITE=0;                AC_SUBST([GNULIB_WRITE])
   dnl Assume proper GNU behavior unless another module says otherwise.
   HAVE_CHOWN=1;           AC_SUBST([HAVE_CHOWN])
+  HAVE_COPY_FILE_RANGE=1; AC_SUBST([HAVE_COPY_FILE_RANGE])
   HAVE_DUP2=1;            AC_SUBST([HAVE_DUP2])
   HAVE_DUP3=1;            AC_SUBST([HAVE_DUP3])
   HAVE_EUIDACCESS=1;      AC_SUBST([HAVE_EUIDACCESS])
diff --git a/modules/copy-file-range b/modules/copy-file-range
new file mode 100644
index 000000000..18742ef3b
--- /dev/null
+++ b/modules/copy-file-range
@@ -0,0 +1,33 @@
+Description:
+Copy parts of files
+
+Files:
+lib/copy-file-range.c
+m4/copy-file-range.m4
+
+Depends-on:
+c99
+inttypes-incomplete [test $HAVE_COPY_FILE_RANGE = 0]
+largefile
+libc-config         [test $HAVE_COPY_FILE_RANGE = 0]
+pread               [test $HAVE_COPY_FILE_RANGE = 0]
+pwrite              [test $HAVE_COPY_FILE_RANGE = 0]
+unistd
+
+configure.ac:
+gl_FUNC_COPY_FILE_RANGE
+if test $HAVE_COPY_FILE_RANGE = 0; then
+  AC_LIBOBJ([copy-file-range])
+fi
+gl_UNISTD_MODULE_INDICATOR([copy-file-range])
+
+Makefile.am:
+
+Include:
+<unistd.h>
+
+License:
+LGPLv2+
+
+Maintainer:
+all
diff --git a/modules/unistd b/modules/unistd
index fc1a27fe1..e29c1ad94 100644
--- a/modules/unistd
+++ b/modules/unistd
@@ -39,6 +39,7 @@ unistd.h: unistd.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H
 	      -e 's/@''GNULIB_CHDIR''@/$(GNULIB_CHDIR)/g' \
 	      -e 's/@''GNULIB_CHOWN''@/$(GNULIB_CHOWN)/g' \
 	      -e 's/@''GNULIB_CLOSE''@/$(GNULIB_CLOSE)/g' \
+	      -e 's/@''GNULIB_COPY_FILE_RANGE''@/$(GNULIB_COPY_FILE_RANGE)/g' \
 	      -e 's/@''GNULIB_DUP''@/$(GNULIB_DUP)/g' \
 	      -e 's/@''GNULIB_DUP2''@/$(GNULIB_DUP2)/g' \
 	      -e 's/@''GNULIB_DUP3''@/$(GNULIB_DUP3)/g' \
@@ -89,6 +90,7 @@ unistd.h: unistd.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H
 	      -e 's/@''GNULIB_WRITE''@/$(GNULIB_WRITE)/g' \
 	      < $(srcdir)/unistd.in.h | \
 	  sed -e 's|@''HAVE_CHOWN''@|$(HAVE_CHOWN)|g' \
+	      -e 's|@''HAVE_COPY_FILE_RANGE''@|$(HAVE_COPY_FILE_RANGE)|g' \
 	      -e 's|@''HAVE_DUP2''@|$(HAVE_DUP2)|g' \
 	      -e 's|@''HAVE_DUP3''@|$(HAVE_DUP3)|g' \
 	      -e 's|@''HAVE_EUIDACCESS''@|$(HAVE_EUIDACCESS)|g' \
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] copy-file: prefer copy_file_range
  2019-06-05  6:09 [PATCH 1/2] copy-file-range: new module Paul Eggert
@ 2019-06-05  6:09 ` Paul Eggert
  2019-06-05 13:54 ` [PATCH 1/2] copy-file-range: new module Florian Weimer
  1 sibling, 0 replies; 9+ messages in thread
From: Paul Eggert @ 2019-06-05  6:09 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Paul Eggert

* lib/copy-file.c: Do not include xalloc.h.
(qcopy_file_preserving): Allocate a buffer only if
copy_file_range does not suffice.  If the allocation fails
don't give up; just use a small stack-based buffer.
Prefer copy_file_range if it works.
* modules/copy-file (Depends-on): Add copy-file-range.
Remove xalloc.
---
 ChangeLog         |  9 ++++++
 lib/copy-file.c   | 76 ++++++++++++++++++++++++++++-------------------
 modules/copy-file |  2 +-
 3 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 9ff35e9c1..ac9a44652 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,14 @@
 2019-06-04  Paul Eggert  <eggert@cs.ucla.edu>
 
+	copy-file: prefer copy_file_range
+	* lib/copy-file.c: Do not include xalloc.h.
+	(qcopy_file_preserving): Allocate a buffer only if
+	copy_file_range does not suffice.  If the allocation fails
+	don't give up; just use a small stack-based buffer.
+	Prefer copy_file_range if it works.
+	* modules/copy-file (Depends-on): Add copy-file-range.
+	Remove xalloc.
+
 	copy-file-range: new module
 	* MODULES.html.sh: Add copy-file-range.
 	* lib/copy-file-range.c, m4/copy-file-range.m4:
diff --git a/lib/copy-file.c b/lib/copy-file.c
index 2ba344b97..092a040d7 100644
--- a/lib/copy-file.c
+++ b/lib/copy-file.c
@@ -38,7 +38,6 @@
 #include "binary-io.h"
 #include "quote.h"
 #include "gettext.h"
-#include "xalloc.h"
 
 #define _(str) gettext (str)
 
@@ -52,14 +51,10 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
   struct stat statbuf;
   int mode;
   int dest_fd;
-  char *buf = xmalloc (IO_SIZE);
 
   src_fd = open (src_filename, O_RDONLY | O_BINARY);
   if (src_fd < 0)
-    {
-      err = GL_COPY_ERR_OPEN_READ;
-      goto error;
-    }
+    return GL_COPY_ERR_OPEN_READ;
   if (fstat (src_fd, &statbuf) < 0)
     {
       err = GL_COPY_ERR_OPEN_READ;
@@ -67,6 +62,8 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
     }
 
   mode = statbuf.st_mode & 07777;
+  off_t inbytes = S_ISREG (statbuf.st_mode) ? statbuf.st_size : -1;
+  bool empty_regular_file = inbytes == 0;
 
   dest_fd = open (dest_filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0600);
   if (dest_fd < 0)
@@ -75,27 +72,53 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
       goto error_src;
     }
 
-  /* Copy the file contents.  */
-  for (;;)
+  /* Copy the file contents.  FIXME: Do not copy holes.  */
+  while (0 < inbytes)
     {
-      size_t n_read = safe_read (src_fd, buf, IO_SIZE);
-      if (n_read == SAFE_READ_ERROR)
-        {
-          err = GL_COPY_ERR_READ;
-          goto error_src_dest;
-        }
-      if (n_read == 0)
+      size_t copy_max = -1;
+      copy_max -= copy_max % IO_SIZE;
+      size_t len = inbytes < copy_max ? inbytes : copy_max;
+      ssize_t copied = copy_file_range (src_fd, NULL, dest_fd, NULL, len, 0);
+      if (copied <= 0)
         break;
+      inbytes -= copied;
+    }
+
+  /* Finish up with read/write, in case the file was not a regular
+     file, or the file shrank or had I/O errors (in which case find
+     whether it was a read or write error).  Read empty regular files
+     since they might be in /proc with their true sizes unknown until
+     they are read.  */
+  if (inbytes != 0 || empty_regular_file)
+    {
+      char smallbuf[1024];
+      int bufsize = IO_SIZE;
+      char *buf = malloc (bufsize);
+      if (!buf)
+        buf = smallbuf, bufsize = sizeof smallbuf;
 
-      if (full_write (dest_fd, buf, n_read) < n_read)
+      while (true)
         {
-          err = GL_COPY_ERR_WRITE;
-          goto error_src_dest;
+          size_t n_read = safe_read (src_fd, buf, bufsize);
+          if (n_read == 0)
+            break;
+          if (n_read == SAFE_READ_ERROR)
+            {
+              err = GL_COPY_ERR_READ;
+              break;
+            }
+          if (full_write (dest_fd, buf, n_read) < n_read)
+            {
+              err = GL_COPY_ERR_WRITE;
+              break;
+            }
         }
-    }
 
-  free (buf);
-  buf = NULL; /* To avoid double free in error case.  */
+      if (buf != smallbuf)
+        free (buf);
+      if (err)
+        goto error_src_dest;
+    }
 
 #if !USE_ACL
   if (close (dest_fd) < 0)
@@ -104,10 +127,7 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
       goto error_src;
     }
   if (close (src_fd) < 0)
-    {
-      err = GL_COPY_ERR_AFTER_READ;
-      goto error;
-    }
+    return GL_COPY_ERR_AFTER_READ;
 #endif
 
   /* Preserve the access and modification times.  */
@@ -146,10 +166,7 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
       goto error_src;
     }
   if (close (src_fd) < 0)
-    {
-      err = GL_COPY_ERR_AFTER_READ;
-      goto error;
-    }
+    return GL_COPY_ERR_AFTER_READ;
 #endif
 
   return 0;
@@ -159,7 +176,6 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
  error_src:
   close (src_fd);
  error:
-  free (buf);
   return err;
 }
 
diff --git a/modules/copy-file b/modules/copy-file
index e1bda4973..bc1527757 100644
--- a/modules/copy-file
+++ b/modules/copy-file
@@ -10,6 +10,7 @@ Depends-on:
 acl
 binary-io
 close
+copy-file-range
 error
 fstat
 full-write
@@ -22,7 +23,6 @@ stat-time
 stdlib
 unistd
 utimens
-xalloc
 
 configure.ac:
 gl_COPY_FILE
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-05  6:09 [PATCH 1/2] copy-file-range: new module Paul Eggert
  2019-06-05  6:09 ` [PATCH 2/2] copy-file: prefer copy_file_range Paul Eggert
@ 2019-06-05 13:54 ` Florian Weimer
  2019-06-06  0:45   ` Bruno Haible
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2019-06-05 13:54 UTC (permalink / raw)
  To: Paul Eggert; +Cc: bug-gnulib

* Paul Eggert:

> +/* Emulation of copy_file_range.
> +   Copyright 2017-2019 Free Software Foundation, Inc.
> +
> +   This program is free software: you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
> +
> +/* This file is adapted from glibc io/copy_file_range-compat.c, with a
> +   small number of changes to port to non-glibc platforms.  */

This is not a valid implementation of copy_file_range anymore.  Please
see the discussion here:

  <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>

If you ship this in gnulib, you should at least call this function by a
different name.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-05 13:54 ` [PATCH 1/2] copy-file-range: new module Florian Weimer
@ 2019-06-06  0:45   ` Bruno Haible
  2019-06-07  8:04     ` Paul Eggert
  0 siblings, 1 reply; 9+ messages in thread
From: Bruno Haible @ 2019-06-06  0:45 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Florian Weimer, Paul Eggert

Florian Weimer wrote:
> This is not a valid implementation of copy_file_range anymore.  Please
> see the discussion here:
> 
>   <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>
> 
> If you ship this in gnulib, you should at least call this function by a
> different name.

I disagree, for two reasons:

1) Gnulib takes the freedom to adjust the behaviour of functions found in
   various system libcs. The Gnulib documentation [1] has plenty of examples
   of this practice.

   Gnulib typically documents the differences in the corresponding doc section.
   The doc does not yet have a section regarding copy_file_range, but I'll add
   it in the next couple of days.

   Alternatively, Gnulib can also adjust for the "both descriptors refer to
   the same open file" case, by testing SAME_INODE (instat, outstat) and either
   using special code with lseek for this case, or just return -1/EINVAL.

2) Shared libraries which use Gnulib functions have various techniques for
   hiding the Gnulib overrides from the main program, so that the main program
   can still use the original function from the system libc.

If you think that no other source code than libc should define a function that
has the same name as copy_file_range, you should better call that function
__copy_file_range or kernel_copy_file_range, or similar.

Bruno

[1] https://www.gnu.org/software/gnulib/manual/gnulib.html



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-06  0:45   ` Bruno Haible
@ 2019-06-07  8:04     ` Paul Eggert
  2019-06-28 18:56       ` Pádraig Brady
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Eggert @ 2019-06-07  8:04 UTC (permalink / raw)
  To: Bruno Haible; +Cc: Florian Weimer, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

Although I agree with both of Bruno's reasons, on further thought it makes sense 
for Gnulib to provide a stub instead of an emulation for this particular function.

The primary use for copy_file_range is to copy files more efficiently than a 
read+write loop. However, the read+write loop will typically still be needed 
anyway, to handle non-regular files, copying across file system boundaries, or 
even copying regular input files whose size shrinks while copying as 
copy_file_range (mistakenly, in my view) reports an error if asked to copy past 
end of file.

Because of all these special cases, a reliable use of copy_file_ranges will need 
to be followed by a read+write loop that handles the special cases where 
copy_file_ranges fails. And this read+write loop will work even if 
copy_file_range is a stub that always fails. So there's no practical need to 
emulate copy_file_range.

For an example of the code I'm thinking of, see the call to copy_file_range that 
I recently introduced into GNU Emacs:

https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=486a81f387bb59b2fbbc6aff7b41adbe1621394e

Because of the above, I simplified Gnulib's implementation of copy_file_range by 
installing the attached. If we find a good use for copy_file_range that could 
take advantage of an emulator rather than a stub, we can revert this.

[-- Attachment #2: 0001-copy-file-range-simplify-into-a-stub.txt --]
[-- Type: text/plain, Size: 9275 bytes --]

From b0d998d2e04050dd3c65816b277cf0f91c9ca74c Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 6 Jun 2019 15:44:33 -0700
Subject: [PATCH 1/2] copy-file-range: simplify into a stub
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Based on a comment by Florian Weimer in:
https://lists.gnu.org/r/bug-gnulib/2019-06/msg00007.html
It turns out that Emacs (which will use this module) won’t need an
emulation and I suspect other programs won’t either, because these
programs will need to fall back on read+write anyway.  Perhaps I
am wrong and other programs will be able to use an emulation; if
so, this patch can be reverted.
* lib/copy-file-range.c (COPY_FILE_RANGE): Replace with a stub.
Just call it copy_file_range.
* m4/copy-file-range.m4 (gl_FUNC_COPY_FILE_RANGE):
Check via AC_LINK_IFELSE.
* modules/copy-file-range (Depends-on): Remove modules no longer used.
---
 ChangeLog               |  16 +++++
 lib/copy-file-range.c   | 152 +++-------------------------------------
 m4/copy-file-range.m4   |  22 +++++-
 modules/copy-file-range |   5 --
 4 files changed, 46 insertions(+), 149 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index ac9a44652..5e51f2dbd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+2019-06-06  Paul Eggert  <eggert@cs.ucla.edu>
+
+	copy-file-range: simplify into a stub
+	Based on a comment by Florian Weimer in:
+	https://lists.gnu.org/r/bug-gnulib/2019-06/msg00007.html
+	It turns out that Emacs (which will use this module) won’t need an
+	emulation and I suspect other programs won’t either, because these
+	programs will need to fall back on read+write anyway.  Perhaps I
+	am wrong and other programs will be able to use an emulation; if
+	so, this patch can be reverted.
+	* lib/copy-file-range.c (COPY_FILE_RANGE): Replace with a stub.
+	Just call it copy_file_range.
+	* m4/copy-file-range.m4 (gl_FUNC_COPY_FILE_RANGE):
+	Check via AC_LINK_IFELSE.
+	* modules/copy-file-range (Depends-on): Remove modules no longer used.
+
 2019-06-04  Paul Eggert  <eggert@cs.ucla.edu>
 
 	copy-file: prefer copy_file_range
diff --git a/lib/copy-file-range.c b/lib/copy-file-range.c
index a1448d0d9..39b5efbc1 100644
--- a/lib/copy-file-range.c
+++ b/lib/copy-file-range.c
@@ -1,5 +1,5 @@
-/* Emulation of copy_file_range.
-   Copyright 2017-2019 Free Software Foundation, Inc.
+/* Stub for copy_file_range
+   Copyright 2019 Free Software Foundation, Inc.
 
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -14,152 +14,20 @@
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
 
-/* This file is adapted from glibc io/copy_file_range-compat.c, with a
-   small number of changes to port to non-glibc platforms.  */
+#include <config.h>
 
-#include <libc-config.h>
-
-/* The following macros should be defined:
-
-   COPY_FILE_RANGE_DECL   Declaration specifiers for the function below.
-   COPY_FILE_RANGE        Name of the function to define.  */
-
-#define COPY_FILE_RANGE_DECL
-#define COPY_FILE_RANGE copy_file_range
+#include <unistd.h>
 
 #include <errno.h>
-#include <fcntl.h>
-#include <inttypes.h>
-#include <limits.h>
-#include <sys/stat.h>
-#include <sys/types.h>
-#include <unistd.h>
 
-COPY_FILE_RANGE_DECL
 ssize_t
-COPY_FILE_RANGE (int infd, off_t *pinoff,
+copy_file_range (int infd, off_t *pinoff,
                  int outfd, off_t *poutoff,
                  size_t length, unsigned int flags)
 {
-  if (flags != 0)
-    {
-      __set_errno (EINVAL);
-      return -1;
-    }
-
-  {
-    struct stat instat;
-    struct stat outstat;
-    if (fstat (infd, &instat) != 0 || fstat (outfd, &outstat) != 0)
-      return -1;
-    if (S_ISDIR (instat.st_mode) || S_ISDIR (outstat.st_mode))
-      {
-        __set_errno (EISDIR);
-        return -1;
-      }
-    if (!S_ISREG (instat.st_mode) || !S_ISREG (outstat.st_mode))
-      {
-        /* We need a regular input file so that the we can seek
-           backwards in case of a write failure.  */
-        __set_errno (EINVAL);
-        return -1;
-      }
-    if (instat.st_dev != outstat.st_dev)
-      {
-        /* Cross-device copies are not supported.  */
-        __set_errno (EXDEV);
-        return -1;
-      }
-  }
-
-  /* The output descriptor must not have O_APPEND set.  */
-  {
-    int flags = fcntl (outfd, F_GETFL);
-    if (flags & O_APPEND)
-      {
-        __set_errno (EBADF);
-        return -1;
-      }
-  }
-
-  /* Avoid an overflow in the result.  */
-  if (length > SSIZE_MAX)
-    length = SSIZE_MAX;
-
-  /* Main copying loop.  The buffer size is arbitrary and is a
-     trade-off between stack size consumption, cache usage, and
-     amortization of system call overhead.  */
-  size_t copied = 0;
-  char buf[8192];
-  while (length > 0)
-    {
-      size_t to_read = length;
-      if (to_read > sizeof (buf))
-        to_read = sizeof (buf);
-
-      /* Fill the buffer.  */
-      ssize_t read_count;
-      if (pinoff == NULL)
-        read_count = read (infd, buf, to_read);
-      else
-        read_count = pread (infd, buf, to_read, *pinoff);
-      if (read_count == 0)
-        /* End of file reached prematurely.  */
-        return copied;
-      if (read_count < 0)
-        {
-          if (copied > 0)
-            /* Report the number of bytes copied so far.  */
-            return copied;
-          return -1;
-        }
-      if (pinoff != NULL)
-        *pinoff += read_count;
-
-      /* Write the buffer part which was read to the destination.  */
-      char *end = buf + read_count;
-      for (char *p = buf; p < end; )
-        {
-          ssize_t write_count;
-          if (poutoff == NULL)
-            write_count = write (outfd, p, end - p);
-          else
-            write_count = pwrite (outfd, p, end - p, *poutoff);
-          if (write_count < 0)
-            {
-              /* Adjust the input read position to match what we have
-                 written, so that the caller can pick up after the
-                 error.  */
-              size_t written = p - buf;
-              /* NB: This needs to be signed so that we can form the
-                 negative value below.  */
-              ssize_t overread = read_count - written;
-              if (pinoff == NULL)
-                {
-                  if (overread > 0)
-                    {
-                      /* We are on an error recovery path, so we
-                         cannot deal with failure here.  */
-                      int save_errno = errno;
-                      (void) lseek (infd, -overread, SEEK_CUR);
-                      __set_errno (save_errno);
-                    }
-                }
-              else /* pinoff != NULL */
-                *pinoff -= overread;
-
-              if (copied + written > 0)
-                /* Report the number of bytes copied so far.  */
-                return copied + written;
-              return -1;
-            }
-          p += write_count;
-          if (poutoff != NULL)
-            *poutoff += write_count;
-        } /* Write loop.  */
-
-      copied += read_count;
-      length -= read_count;
-    }
-  return copied;
+  /* There is little need to emulate copy_file_range with read+write,
+     since programs that use copy_file_range must fall back on
+     read+write anyway.  */
+  errno = ENOSYS;
+  return -1;
 }
diff --git a/m4/copy-file-range.m4 b/m4/copy-file-range.m4
index edd2c4aed..20fd05697 100644
--- a/m4/copy-file-range.m4
+++ b/m4/copy-file-range.m4
@@ -11,8 +11,26 @@ AC_DEFUN([gl_FUNC_COPY_FILE_RANGE],
   dnl Persuade glibc <unistd.h> to declare copy_file_range.
   AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
 
-  AC_CHECK_FUNCS_ONCE([copy_file_range])
-  if test $ac_cv_func_copy_file_range != yes; then
+  dnl Use AC_LINK_IFELSE, rather than AC_CHECK_FUNCS or a variant,
+  dnl since we don't want AC_CHECK_FUNCS's checks for glibc stubs.
+  dnl Programs that use copy_file_range must fall back on read+write
+  dnl anyway, and there's little point to substituting the Gnulib stub
+  dnl for a glibc stub.
+  AC_CACHE_CHECK([for copy_file_range], [gl_cv_func_copy_file_range],
+    [AC_LINK_IFELSE(
+       [AC_LANG_PROGRAM(
+          [[#include <unistd.h>
+          ]],
+          [[ssize_t (*func) (int, off_t *, int, off_t, size_t, unsigned)
+              = copy_file_range;
+            return func (0, 0, 0, 0, 0, 0) & 127;
+          ]])
+       ],
+       [gl_cv_func_copy_file_range=yes],
+       [gl_cv_func_copy_file_range=no])
+    ])
+
+  if test "$gl_cv_func_copy_file_range" != yes; then
     HAVE_COPY_FILE_RANGE=0
   fi
 ])
diff --git a/modules/copy-file-range b/modules/copy-file-range
index 18742ef3b..42d6e6f31 100644
--- a/modules/copy-file-range
+++ b/modules/copy-file-range
@@ -6,12 +6,7 @@ lib/copy-file-range.c
 m4/copy-file-range.m4
 
 Depends-on:
-c99
-inttypes-incomplete [test $HAVE_COPY_FILE_RANGE = 0]
 largefile
-libc-config         [test $HAVE_COPY_FILE_RANGE = 0]
-pread               [test $HAVE_COPY_FILE_RANGE = 0]
-pwrite              [test $HAVE_COPY_FILE_RANGE = 0]
 unistd
 
 configure.ac:
-- 
2.17.1


[-- Attachment #3: 0002-copy-file-fix-typo.txt --]
[-- Type: text/plain, Size: 1067 bytes --]

From 30b6215942ffadd705f7641c0725e425057b73a5 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 6 Jun 2019 15:45:14 -0700
Subject: [PATCH 2/2] copy-file: fix typo

* lib/copy-file.c (qcopy_file_preserving): Remove unused label.
---
 ChangeLog       | 3 +++
 lib/copy-file.c | 1 -
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 5e51f2dbd..cbb95d2f5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,8 @@
 2019-06-06  Paul Eggert  <eggert@cs.ucla.edu>
 
+	copy-file: fix typo
+	* lib/copy-file.c (qcopy_file_preserving): Remove unused label.
+
 	copy-file-range: simplify into a stub
 	Based on a comment by Florian Weimer in:
 	https://lists.gnu.org/r/bug-gnulib/2019-06/msg00007.html
diff --git a/lib/copy-file.c b/lib/copy-file.c
index 092a040d7..91c920c33 100644
--- a/lib/copy-file.c
+++ b/lib/copy-file.c
@@ -175,7 +175,6 @@ qcopy_file_preserving (const char *src_filename, const char *dest_filename)
   close (dest_fd);
  error_src:
   close (src_fd);
- error:
   return err;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-07  8:04     ` Paul Eggert
@ 2019-06-28 18:56       ` Pádraig Brady
  2019-06-28 19:30         ` Bruno Haible
  0 siblings, 1 reply; 9+ messages in thread
From: Pádraig Brady @ 2019-06-28 18:56 UTC (permalink / raw)
  To: Paul Eggert, Bruno Haible; +Cc: Florian Weimer, bug-gnulib

On 07/06/19 09:04, Paul Eggert wrote:
> Although I agree with both of Bruno's reasons, on further thought it makes sense 
> for Gnulib to provide a stub instead of an emulation for this particular function.
> 
> The primary use for copy_file_range is to copy files more efficiently than a 
> read+write loop. However, the read+write loop will typically still be needed 
> anyway, to handle non-regular files, copying across file system boundaries, or 
> even copying regular input files whose size shrinks while copying as 
> copy_file_range (mistakenly, in my view) reports an error if asked to copy past 
> end of file.
> 
> Because of all these special cases, a reliable use of copy_file_ranges will need 
> to be followed by a read+write loop that handles the special cases where 
> copy_file_ranges fails. And this read+write loop will work even if 
> copy_file_range is a stub that always fails. So there's no practical need to 
> emulate copy_file_range.
> 
> For an example of the code I'm thinking of, see the call to copy_file_range that 
> I recently introduced into GNU Emacs:
> 
> https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=486a81f387bb59b2fbbc6aff7b41adbe1621394e
> 
> Because of the above, I simplified Gnulib's implementation of copy_file_range by 
> installing the attached. If we find a good use for copy_file_range that could 
> take advantage of an emulator rather than a stub, we can revert this.

I definitely agree with this decision.
I also see that yesterday glibc also reverted to a stub rather than emulation:
https://sourceware.org/ml/libc-alpha/2019-06/msg00873.html

cheers,
Pádraig


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-28 18:56       ` Pádraig Brady
@ 2019-06-28 19:30         ` Bruno Haible
  2019-06-28 20:18           ` Florian Weimer
  0 siblings, 1 reply; 9+ messages in thread
From: Bruno Haible @ 2019-06-28 19:30 UTC (permalink / raw)
  To: Florian Weimer; +Cc: bug-gnulib, Paul Eggert

Pádraig Brady wrote:
> https://sourceware.org/ml/libc-alpha/2019-06/msg00873.html

In the NEWS entry, Florian writes:

   Applications which use the
   copy_file_range function will have to be run on kernels which implement
   the copy_file_range system call.

I find this misleading. Coreutils will want to continue to use copy_file_range
whenever the libc provides this function, and at the same time be able to run
on all Linux kernel versions.

Paul's argument convinced me:
> > ... a reliable use of copy_file_ranges will need 
> > to be followed by a read+write loop that handles the special cases where 
> > copy_file_ranges fails. And this read+write loop will work even if 
> > copy_file_range is a stub that always fails. So there's no practical need to 
> > emulate copy_file_range [in glibc nor gnulib].

Maybe the NEWS entry could be changed to state something similar?

Bruno



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-28 19:30         ` Bruno Haible
@ 2019-06-28 20:18           ` Florian Weimer
  2019-06-28 22:46             ` Bruno Haible
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2019-06-28 20:18 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib, Paul Eggert

* Bruno Haible:

> Pádraig Brady wrote:
>> https://sourceware.org/ml/libc-alpha/2019-06/msg00873.html
>
> In the NEWS entry, Florian writes:
>
>    Applications which use the
>    copy_file_range function will have to be run on kernels which implement
>    the copy_file_range system call.
>
> I find this misleading. Coreutils will want to continue to use
> copy_file_range whenever the libc provides this function, and at the
> same time be able to run on all Linux kernel versions.

What's misleading about it?  If you want copy_file_range functionality,
you will need an implementation of it, and since glibc no longer
contains one after this change, you need one from the kernel.

You can still call the function, of course, but it's use is really
limited because it sets errno ENOSYS and returns -1.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] copy-file-range: new module
  2019-06-28 20:18           ` Florian Weimer
@ 2019-06-28 22:46             ` Bruno Haible
  0 siblings, 0 replies; 9+ messages in thread
From: Bruno Haible @ 2019-06-28 22:46 UTC (permalink / raw)
  To: Florian Weimer; +Cc: bug-gnulib, Paul Eggert

Florian Weimer wrote:
> > In the NEWS entry, Florian writes:
> >
> >    Applications which use the
> >    copy_file_range function will have to be run on kernels which implement
> >    the copy_file_range system call.
> >
> > I find this misleading. Coreutils will want to continue to use
> > copy_file_range whenever the libc provides this function, and at the
> > same time be able to run on all Linux kernel versions.
> 
> What's misleading about it?  If you want copy_file_range functionality,
> you will need an implementation of it, and since glibc no longer
> contains one after this change, you need one from the kernel.

Like Paul said: Any reliable use of copy_file_range will need to be
followed by fallback code that does the copy through user-space.

> You can still call the function, of course, but it's use is really
> limited because it sets errno ENOSYS and returns -1.

Yes. This information is missing from the NEWS entry that you added.

Bruno



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-06-28 22:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-05  6:09 [PATCH 1/2] copy-file-range: new module Paul Eggert
2019-06-05  6:09 ` [PATCH 2/2] copy-file: prefer copy_file_range Paul Eggert
2019-06-05 13:54 ` [PATCH 1/2] copy-file-range: new module Florian Weimer
2019-06-06  0:45   ` Bruno Haible
2019-06-07  8:04     ` Paul Eggert
2019-06-28 18:56       ` Pádraig Brady
2019-06-28 19:30         ` Bruno Haible
2019-06-28 20:18           ` Florian Weimer
2019-06-28 22:46             ` Bruno Haible

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).