git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>,
	Johannes Schindelin <johannes.schindelin@gmx.de>,
	Pierre Garnier <pgarnier@mega.com>
Subject: Re: [PATCH] refs: work around network caching on Windows
Date: Fri, 15 Jul 2022 10:47:50 -0700	[thread overview]
Message-ID: <xmqqy1wuz2rd.fsf@gitster.g> (raw)
In-Reply-To: <CAPig+cT17vZcsf2pGQh-S6UjZib3=4Am7RVf=gQq_sDZzKD08w@mail.gmail.com> (Eric Sunshine's message of "Fri, 15 Jul 2022 04:29:11 -0400")

Eric Sunshine <sunshine@sunshineco.com> writes:

>> The underlying culprit is that `GetFileAttributesExW()` that is called from
>> `mingw_lstat()` can raise an error `ERROR_PATH_NOT_FOUND`, which is
>> translated to `ENOTDIR`, as opposed to `ENOENT` as expected on Linux.
>> ...
>> @@ -480,7 +480,7 @@ static int load_contents(struct snapshot *snapshot)
>> -               if (errno == ENOENT) {
>> +               if (errno == ENOENT || errno == ENOTDIR) {
>
> The first question which popped into my mind upon reading the patch
> was why these changes need to be made to files-backend.c and
> packed-backend.c rather than "fixing" mingw_lstat() to return ENOENT
> instead of ENOTDIR. Patching mingw_lstat() seems more tractable and
> less likely to lead to discovery of other code in the future which
> needs to be patched in a similar way to how files-backend.c and
> packed-backend.c are being patched here.
>
> Perhaps it's a silly question and the answer is perfectly obvious to
> folks directly involved in Git development on Windows, but the commit
> message doesn't seem to answer it for people who don't have such
> inside knowledge.

FWIW, I had the same reaction.  ENOTDIR does not mean an attempt to
access "test_dir/test_tag" failed because "test_dir" did not
exist---it means "test_dir" exists as something that is not a
directory (hence there is no way "test_dir/test_file" to exist).

In the example scenario in the proposed log message, a new tag
"test_dir/test_tag" is created, and (even though the proposed log
message does not explicitly say so) presumably, "test_dir" needs to
be created while doing so.  A failure to access "test_dir/test_file"
due to lack of "test_dir" shouldn't report ENOTDIR but should report
ENOENT.  So something is fishy.

Unless the internal implementation of the filesystem creates a
placeholder that is not a directory at "test_dir", returns the
control to the application, and does further work to turn that
placeholder object into a directory in the background, and the
application attempts to create "test_dir/test_tag" in the meantime,
racing with the filesystem, or something silly like that?

It sounds like a platform specific bug that is not specific to the
ref-files subsystem.  If it can be fixed at lstat() emulation, that
would benefit other checks for ENOENT.

Having said all that.

Stepping back a bit, one situation where we would want to special
case ENOENT is when we have an optional file at known location and
it is OK for the file to be missing.  We may attempt to read from
"$XDG_CONFIG_HOME/git/config" and it may fail due to ENOENT or
ENOTDIR because the user may not be using XDG config location at all
(hence $XDG_CONFIG_HOME or XDG_CONFIG_HOME/git may be missing) or
the leading directories may be there but not the file at the leaf
level.  In such a use case, we should ignore ENOTDIR just like we
ignore ENOENT as an error that does not matter.

In any case, the posted patch may hide a repository corruption from
the codepath affected and cause it to silently return a bogus
answer.  The first hunk touches read_ref_internal() where "path"
variable contains the path we expect to find the on-disk loose ref
file

	if (lstat(path, &st) < 0) {
		int ignore_errno;
		myerr = errno;
		if (myerr != ENOENT || skip_packed_refs)
			goto out;
		if (refs_read_raw_ref(refs->packed_ref_store, refname, oid,
				      referent, type, &ignore_errno)) {
			myerr = ENOENT;
			goto out;
		}
		ret = 0;
		goto out;
	}

The idea is that a ref does not have to exist as a loose ref file,
so an error from lstat() is not immediately an error.  If the ref
were previously packed, then we should fall back to read it from the
packed-ref file.

So we say that if the error is *NOT* ENOENT, we jump to 'out' label
to report the failed lstat() as an error".  Otherwise we proceed to
attempt to read from the packed-ref file.  Is there any case where
an attempt to read from a loose ref fails with ENOTDIR and it is OK
that we can proceed without reading from the packed-refs file?

If we have a branch "js/foo" that is packed, and then if we removed
it, and then created a branch "js", the original "js/foo" should not
be in the packed-refs file, but supposed a repository is corrupt and
"js/foo" remains in the packed-refs file.  Now imagine that we ask
for "js/foo".  What happens?

We fail to lstat ".git/refs/heads/js/foo" due to ENOTDIR, because
the "js" branch exists loose at ".git/refs/heads/js".  In the
original, because ENOTDIR is not ENOENT, we jumped to "out" to
report the error.  The patched code to allow ENOTDIR will instead
happily read the stale version of "js/foo" out of the packed-refs
file.


  parent reply	other threads:[~2022-07-15 17:49 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-15  8:06 [PATCH] refs: work around network caching on Windows Johannes Schindelin via GitGitGadget
2022-07-15  8:29 ` Eric Sunshine
2022-07-15 12:11   ` Johannes Schindelin
2022-07-15 17:47   ` Junio C Hamano [this message]
2022-07-15  8:30 ` Ævar Arnfjörð Bjarmason
2022-07-28 14:29 ` [PATCH v2] lstat(mingw): correctly detect ENOTDIR scenarios Johannes Schindelin via GitGitGadget
2022-07-28 17:37   ` Junio C Hamano
2022-07-28 23:27   ` Eric Sunshine
2022-07-29 10:05   ` [PATCH v3] " Johannes Schindelin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy1wuz2rd.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pgarnier@mega.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).