git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Johan Herland <johan@herland.net>,
	git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH 4/4] resolve_ref_unsafe(): close race condition reading loose refs
Date: Tue, 11 Jun 2013 16:26:20 +0200	[thread overview]
Message-ID: <1370960780-1055-5-git-send-email-mhagger@alum.mit.edu> (raw)
In-Reply-To: <1370960780-1055-1-git-send-email-mhagger@alum.mit.edu>

We read loose references in two steps.  The code is roughly:

    lstat()
    if error ENOENT:
        loose ref is missing; look for corresponding packed ref
    else if S_ISLNK:
        readlink()
        if error:
            report failure
    else if S_ISDIR:
        report failure
    else
        open()
        if error:
            report failure
        read()

The problem is that the first filesystem call, to lstat(), is not
atomic with the second filesystem call, to readlink() or open().
Therefore it is possible for another process to change the file
between our two calls, for example:

* If the other process deletes the file, our second call will fail
  with ENOENT, which we *should* interpret as "loose ref is missing;
  look for corresponding packed ref".  This can arise if the other
  process is pack-refs; it might have just written a new packed-refs
  file containing the old contents of the reference then deleted the
  loose ref.

* If the other process changes a symlink into a plain file, our call
  to readlink() will fail with EINVAL, which we *should* respond to by
  trying to open() and read() the file.

The old code treats the reference as missing in both of these cases,
which is incorrect.

So instead, wrap the above code in a loop.  If the result of
readline()/open() is a failure that is inconsistent with the result of
the previous lstat(), then just loop again starting with a fresh call
to lstat().

One race is still possible and undetected: another process could
change the file from a regular file into a symlink between the call to
lstat and the call to open().  The open() call would silently follow
the symlink and not know that something is wrong.  I don't see a way
to detect this situation without the use of the O_NOFOLLOW option,
which is not portable and is not used elsewhere in our code base.

However, we don't use symlinks anymore, so this situation is unlikely.
And it doesn't appear that treating a symlink as a regular file would
have grave consequences; after all, this is exactly how the code
handles non-relative symlinks.

Note that this solves only the part of the race within
resolve_ref_unsafe. In the situation described above, we may still be
depending on a cached view of the packed-refs file; that race will be
dealt with in a future patch.

This problem was reported and diagnosed by Jeff King <peff@peff.net>,
and this solution is derived from his patch.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
Please note that if there is some bizarre filesystem somewhere for
which, for a single, static file

    lstat() reports S_ISLNK and readlink() fails with ENOENT or EINVAL

or

    lstat() reports neither S_ISLNK nor S_ISDIR and open() fails with ENOENT

then the inner loop would never terminate.  I can't imagine this
happening, but if somebody is worried about it the solution is simple:
limit the inner loop to 3 iterations or so.

Another obvious way to solve the problem would be to skip the lstat()
call altogether, and to rely on errno to distinguish the cases:

    readlink()
    if error ENOENT:
        loose ref is missing; look for corresponding packed ref
    else if error EINVAL:
        must not be a symlink; fall through to open()
    else if other error:
        report failure
    else:
        handle as symlink

    open()
    if error ENOENT:
        file must have been deleted since readlink(); look for packed ref
    else if other error:
        report failure
    else:
        read() and handle file contents

There is still a gap between readlink() and open(), during which the
file could have been changed from a regular file into a symlink, so
the remaining race mentioned above would still be undetected.  I don't
see a strong reason to prefer the looping code in this patch vs. the
no-lstat() alternative so I took the variant that was closer to the
old code.

 refs.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 7a77d76..6e7c1fd 100644
--- a/refs.c
+++ b/refs.c
@@ -1248,6 +1248,15 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea
 
 		git_snpath(path, sizeof(path), "%s", refname);
 
+		/*
+		 * This loop is to avoid race conditions.  First we
+		 * lstat() the file, then we try to read it as a link
+		 * or as a file.  But if somebody changes the type of
+		 * the file (file <-> directory <-> symlink) between
+		 * the lstat() and reading, then we don't want to
+		 * report that as an error but rather try again
+		 * starting with the lstat().
+		 */
 		for (;;) {
 			if (lstat(path, &st) < 0) {
 				if (errno == ENOENT)
@@ -1261,8 +1270,13 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea
 			/* Follow "normalized" - ie "refs/.." symlinks by hand */
 			if (S_ISLNK(st.st_mode)) {
 				len = readlink(path, buffer, sizeof(buffer)-1);
-				if (len < 0)
-					return NULL;
+				if (len < 0) {
+					if (errno == ENOENT || errno == EINVAL)
+						/* inconsistent with lstat; retry */
+						continue;
+					else
+						return NULL;
+				}
 				buffer[len] = 0;
 				if (!prefixcmp(buffer, "refs/") &&
 				    !check_refname_format(buffer, 0)) {
@@ -1285,8 +1299,13 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea
 			 * a ref
 			 */
 			fd = open(path, O_RDONLY);
-			if (fd < 0)
-				return NULL;
+			if (fd < 0) {
+				if (errno == ENOENT)
+					/* inconsistent with lstat; retry */
+					continue;
+				else
+					return NULL;
+			}
 			len = read_in_full(fd, buffer, sizeof(buffer)-1);
 			close(fd);
 			if (len < 0)
-- 
1.8.3

  parent reply	other threads:[~2013-06-11 14:27 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-03  8:38 another packed-refs race Jeff King
2013-05-03  9:26 ` Johan Herland
2013-05-03 17:28   ` Jeff King
2013-05-03 18:26     ` Jeff King
2013-05-03 21:02       ` Johan Herland
2013-05-06 12:12     ` Michael Haggerty
2013-05-06 18:44       ` Jeff King
2013-05-03 21:21 ` Jeff King
2013-05-06 12:03 ` Michael Haggerty
2013-05-06 18:41   ` Jeff King
2013-05-06 22:18     ` Jeff King
2013-05-07  4:32     ` Michael Haggerty
2013-05-07  4:44       ` Jeff King
2013-05-07  8:03         ` Michael Haggerty
2013-05-07  2:36 ` [PATCH 0/4] fix packed-refs races Jeff King
2013-05-07  2:38   ` [PATCH 1/4] resolve_ref: close race condition for packed refs Jeff King
2013-05-12 22:56     ` Michael Haggerty
2013-05-16  3:47       ` Jeff King
2013-05-16  5:50         ` Michael Haggerty
2013-05-12 23:26     ` Michael Haggerty
2013-06-11 14:26     ` [PATCH 0/4] Fix a race condition when reading loose refs Michael Haggerty
2013-06-11 14:26       ` [PATCH 1/4] resolve_ref_unsafe(): extract function handle_missing_loose_ref() Michael Haggerty
2013-06-11 14:26       ` [PATCH 2/4] resolve_ref_unsafe(): handle the case of an SHA-1 within loop Michael Haggerty
2013-06-11 14:26       ` [PATCH 3/4] resolve_ref_unsafe(): nest reference-reading code in an infinite loop Michael Haggerty
2013-06-11 14:26       ` Michael Haggerty [this message]
2013-06-12  8:04         ` [PATCH 4/4] resolve_ref_unsafe(): close race condition reading loose refs Jeff King
2013-06-13  8:22         ` Thomas Rast
2013-06-14  7:17           ` Michael Haggerty
2013-06-11 20:57       ` [PATCH 0/4] Fix a race condition when " Junio C Hamano
2013-05-07  2:39   ` [PATCH 2/4] add a stat_validity struct Jeff King
2013-05-13  2:29     ` Michael Haggerty
2013-05-13  3:00       ` [RFC 0/2] Separate stat_data from cache_entry Michael Haggerty
2013-05-13  3:00         ` [RFC 1/2] Extract a struct " Michael Haggerty
2013-05-13  3:00         ` [RFC 2/2] add a stat_validity struct Michael Haggerty
2013-05-13  5:10         ` [RFC 0/2] Separate stat_data from cache_entry Junio C Hamano
2013-05-16  3:51       ` [PATCH 2/4] add a stat_validity struct Jeff King
2013-05-07  2:43   ` [PATCH 3/4] get_packed_refs: reload packed-refs file when it changes Jeff King
2013-05-07  2:54     ` [PATCH 0/2] peel_ref cleanups changes Jeff King
2013-05-07  2:56       ` [PATCH 1/2] peel_ref: rename "sha1" argument to "peeled" Jeff King
2013-05-07  3:06       ` [PATCH 2/2] peel_ref: refactor for safety with simultaneous update Jeff King
2013-05-09 19:18     ` [PATCH 3/4] get_packed_refs: reload packed-refs file when it changes Eric Sunshine
2013-05-13  2:43     ` Michael Haggerty
2013-05-07  2:51   ` [PATCH 4/4] for_each_ref: load all loose refs before packed refs Jeff King
2013-05-07  6:40   ` [PATCH 0/4] fix packed-refs races Junio C Hamano
2013-05-07 14:19     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1370960780-1055-5-git-send-email-mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).