git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Eric Wong <e@80x24.org>
Subject: Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Date: Mon, 31 Oct 2016 10:37:41 -0700	[thread overview]
Message-ID: <xmqq1sywv2qi.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <CA+55aFxsjiuR8cp9SiPS88OnzmCiNN3B-gybz1CS71avsU8OOw@mail.gmail.com> (Linus Torvalds's message of "Sat, 29 Oct 2016 10:06:59 -0700")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sat, Oct 29, 2016 at 1:25 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>>
>> Correct. We cannot change an open file handle's state ("FD_CLOEXEC") on
>> Windows, but we can ask open() to open said file handle with the correct
>> flag ("O_CLOEXEC", which is mapped to O_NOINHERIT on Windows.
>
> Ok. So then I have no issues with it, and let's use O_CLOEXEC if it
> exists and fcntl(FD_CLOEXEC) if O_CLOEXEC doesn't exist.

I am not sure if the fallback to FD_CLOEXEC is worth doing, because
it does not exist on Windows, where leaking an open file descriptor
to a child process matters more than other platforms [*1*].  

Of course, the world is not all Windows, and there may be other
platforms this fallback may help.  But there is a chance that some
other thread may fork between the time we open(2) and the time we
call fcntl(2), leaving FD_CLOEXEC ineffective and unreliable, so
that's another thing that makes me doubt if FD_CLOEXEC fallback is
worth doing.

Having said that, here is a rewrite of [*2*] to use the fallback
would look like.  After this topic settles, we may want to also
update the tempfile.c::create_tempfile() implementation to use the
helper function we use here.


[Footnotes]

*1* The call in ce_compare_data() is to see if the contents in the
    working tree file match what is in the index.  Perhaps the
    program is "git checkout another-branch" that knows this path is
    absent in the branch being switched to and trying to make sure
    we lose no local modification.  When the path needs to be passed
    through the clean/smudge filter before hashing to see if the
    working tree contents hashes to the object in the index, we need
    to fork and then after the filter finishes its processing, we
    close the file descriptor and then unlink(2) the path if it is
    clean.  We are about to add a variant of clean/smudge mechanism
    where a lazily spawned process can clean contents of multiple
    paths and that is where cloexec starts to matter.  This function
    starts to inspect one path, opens a fd to it, finds that it
    needs filtering, spawns a long-lingering process, while leaking
    the original fd to it.  After feeding the contents via pipe and
    then receiving the filtered contents (the protocol is not full
    duplex and there won't be a stuck pipe), the parent would decide
    that the path is clean and attempts to unlink(2).  Windows does
    not let you unlink a path with open filedescriptor held, causing
    "checkout" to fail.  As it is only one fd that leaks to the
    child, no matter how many subsequent paths are filtered by the
    same long-lingering child process, it will be a far less
    important issue on other platforms that lets you unlink(2) such
    a file.

*2* http://public-inbox.org/git/xmqqh97w38gj.fsf@gitster.mtv.corp.google.com

---
 cache.h           |  1 +
 git-compat-util.h |  4 ++++
 read-cache.c      |  9 +--------
 sha1_file.c       | 47 ++++++++++++++++++++++++++++-------------------
 4 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/cache.h b/cache.h
index a902ca1f8e..6f1c21a352 100644
--- a/cache.h
+++ b/cache.h
@@ -1122,6 +1122,7 @@ extern int write_sha1_file(const void *buf, unsigned long len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+extern int git_open_cloexec(const char *name, int flags);
 extern int git_open(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
diff --git a/git-compat-util.h b/git-compat-util.h
index 43718dabae..64407c701c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -683,6 +683,10 @@ char *gitstrdup(const char *s);
 #define O_CLOEXEC 0
 #endif
 
+#ifndef FD_CLOEXEC
+#define FD_CLOEXEC 0
+#endif
+
 #ifdef FREAD_READS_DIRECTORIES
 #ifdef fopen
 #undef fopen
diff --git a/read-cache.c b/read-cache.c
index db5d910642..c27d3e240b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,14 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
 	int match = -1;
-	static int cloexec = O_CLOEXEC;
-	int fd = open(ce->name, O_RDONLY | cloexec);
-
-	if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
-		/* Try again w/o O_CLOEXEC: the kernel might not support it */
-		cloexec &= ~O_CLOEXEC;
-		fd = open(ce->name, O_RDONLY | cloexec);
-	}
+	int fd = git_open_cloexec(ce->name, O_RDONLY);
 
 	if (fd >= 0) {
 		unsigned char sha1[20];
diff --git a/sha1_file.c b/sha1_file.c
index 09045df1dc..fc8613a847 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1559,31 +1559,40 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
 	return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open(const char *name)
+int git_open_cloexec(const char *name, int flags)
 {
-	static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
+	int fd;
+	static int o_cloexec = O_CLOEXEC;
+	static int fd_cloexec = FD_CLOEXEC;
 
-	for (;;) {
-		int fd;
+	fd = open(name, flags | o_cloexec);
+	if ((o_cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
+		/* Try again w/o O_CLOEXEC: the kernel might not support it */
+		o_cloexec &= ~O_CLOEXEC;
+		fd = open(name, flags | o_cloexec);
+	}
 
-		errno = 0;
-		fd = open(name, O_RDONLY | sha1_file_open_flag);
-		if (fd >= 0)
-			return fd;
+	if (!o_cloexec && 0 <= fd && fd_cloexec) {
+		/* Opened w/o O_CLOEXEC?  try with fcntl(2) to add it */
+		int flags = fcntl(fd, F_GETFL);
+		if (fcntl(fd, F_SETFL, flags | fd_cloexec))
+			fd_cloexec = 0;
+	}
 
-		/* Try again w/o O_CLOEXEC: the kernel might not support it */
-		if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) {
-			sha1_file_open_flag &= ~O_CLOEXEC;
-			continue;
-		}
+	return fd;
+}
 
-		/* Might the failure be due to O_NOATIME? */
-		if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
-			sha1_file_open_flag &= ~O_NOATIME;
-			continue;
-		}
-		return -1;
+int git_open(const char *name)
+{
+	static int noatime = O_NOATIME;
+	int fd = git_open_cloexec(name, O_RDONLY);
+
+	if (0 <= fd && (noatime & O_NOATIME)) {
+		int flags = fcntl(fd, F_GETFL);
+		if (fcntl(fd, F_SETFL, flags | noatime))
+			noatime = 0;
 	}
+	return fd;
 }
 
 static int stat_sha1_file(const unsigned char *sha1, struct stat *st)



  reply	other threads:[~2016-10-31 17:37 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24 18:02 [PATCH v2 0/2] Use CLOEXEC to avoid fd leaks larsxschneider
2016-10-24 18:02 ` [PATCH v2 1/2] sha1_file: open window into packfiles with CLOEXEC larsxschneider
2016-10-25 10:27   ` Johannes Schindelin
2016-10-25 16:58     ` Junio C Hamano
2016-10-24 18:03 ` [PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes larsxschneider
2016-10-24 18:39   ` Eric Wong
2016-10-24 19:53     ` Junio C Hamano
2016-10-25 10:33       ` Johannes Schindelin
2016-10-25 17:02         ` Junio C Hamano
2016-10-24 19:22   ` Johannes Sixt
2016-10-24 19:53     ` Lars Schneider
2016-10-25 21:39       ` Johannes Sixt
2016-10-24 18:23 ` [PATCH v2 0/2] Use CLOEXEC to avoid fd leaks Junio C Hamano
2016-10-25 11:27 ` Johannes Schindelin
2016-10-25 18:16   ` [PATCH v3 0/3] quick reroll of Lars's git_open() w/ O_CLOEXEC Junio C Hamano
2016-10-25 18:16     ` [PATCH v3 1/3] sha1_file: rename git_open_noatime() to git_open() Junio C Hamano
2016-10-25 18:16     ` [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC Junio C Hamano
2016-10-26  4:25       ` Jeff King
2016-10-26 16:23         ` Junio C Hamano
2016-10-26 16:47           ` Jeff King
2016-10-26 17:52             ` Junio C Hamano
2016-10-26 20:17               ` Jeff King
2016-10-26 21:15                 ` Junio C Hamano
2016-10-27 10:24                   ` Jeff King
2016-10-27 21:49                     ` Junio C Hamano
2016-10-27 22:38                     ` Linus Torvalds
2016-10-27 22:56                       ` Junio C Hamano
2016-10-27 23:09                         ` Linus Torvalds
2016-10-27 23:19                           ` Linus Torvalds
2016-10-27 23:36                             ` Junio C Hamano
2016-10-27 23:44                               ` Linus Torvalds
2016-10-28  1:08                                 ` Junio C Hamano
2016-10-28  2:37                                   ` Junio C Hamano
2016-10-28  5:51                                     ` Eric Wong
2016-10-28 11:11                                     ` Johannes Schindelin
2016-10-28 16:13                                       ` Linus Torvalds
2016-10-28 16:48                                         ` Junio C Hamano
2016-10-28 17:38                                           ` Linus Torvalds
2016-10-28 17:47                                             ` Junio C Hamano
2016-10-29  1:26                                             ` Junio C Hamano
2016-10-29  8:25                                               ` Johannes Schindelin
2016-10-29 17:06                                                 ` Linus Torvalds
2016-10-31 17:37                                                   ` Junio C Hamano [this message]
2016-10-31 13:56                                         ` Jeff King
2016-10-31 17:55                                           ` Junio C Hamano
2016-10-31 18:05                                             ` Jeff King
2016-10-28 13:32                                     ` Junio C Hamano
2016-10-28 13:33                                       ` Junio C Hamano
2016-10-28  7:51                       ` Jeff King
2016-10-25 18:16     ` [PATCH v3 3/3] read-cache: make sure file handles are not inherited by child processes Junio C Hamano
2016-10-25 21:33       ` Eric Wong
2016-10-25 22:54         ` Junio C Hamano
2016-10-25 21:48     ` [PATCH v3 0/3] quick reroll of Lars's git_open() w/ O_CLOEXEC Lars Schneider
2016-10-25 22:56       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq1sywv2qi.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=larsxschneider@gmail.com \
    --cc=peff@peff.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).