git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "Jonathan Tan" <jonathantanmy@google.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: Re: merge-recursive thinks symlink's child dirs are "real"
Date: Tue, 17 Sep 2019 09:02:07 -0700
Message-ID: <CABPp-BFqZS5WaR1b-MtQ0eMrsDJNSNjSrC0NjPpKk6+5WtNNWw@mail.gmail.com>
In-Reply-To: <20190917000902.GC67467@google.com>

On Mon, Sep 16, 2019 at 5:09 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Jonathan Tan wrote:
>
> > This was raised by a coworker at $DAYJOB. I run the following script:
> [reproduction recipe from (*) snipped]
> > The cherry-pick must be manually resolved, when I would expect it to
> > happen without needing user intervention.
> >
> > You can see that at the point of the cherry-pick, in the working
> > directory, ./foo is a symlink and ./foo/bar is a directory. I traced the
> > code that ran during the cherry-pick to process_entry() in
> > merge-recursive.c. When processing "foo/bar", control flow correctly
> > reaches "Case B: Added in one", but the dir_in_way() invocation returns
> > true, since lstat() indeed reveals that "foo/bar" is a directory.
>
> Gábor covered the "what happened", so let me say a little more about
> the motivation.
>
> The "foo" symlink is being replaced by a "foo" directory containing a
> "bar" file.  We're pretty far along now: we want to write actual files
> to disk.  Using the index we know where we were going from and to, but
> not everything in the world is tracked in the index: there could be
> build outputs under "foo/bar" blocking the operation from moving
> forward.
>
> So we check whether there's a directory there.  Once we are writing
> things out, there won't be, but the symlink confuses us.  Nice find.

Yep.

>
> [...]
> > Is this use case something that Git should be able to handle, and if
> > yes, is the correct solution to teach dir_in_way() that dirs reachable
> > from symlinks are not really in the way (presumably the implementation
> > would climb directories until we reach the root or we reach a filesystem
> > boundary, similar to what we do when we search for the .git directory)?
>
> The crucial detail here is that "foo" is going to be removed before we
> write "foo/bar".  We should be able to notice that and skip the
> dir_in_way check.

I know what you're getting at from a high-level view, but that view is
incompatible with the machinery's internals.  In particular,
merge-recursive's design provides no way to "notice" that something is
"*going* to be removed" (each path is updated on the fly as it is
processed).  The dir_in_way() check is there precisely to determine
whether it's safe to write to the given path, which basically means
checking that no directory is in the way (a "foo/bar/" directory, in
this case).  So we definitely do not want to skip the dir_in_way()
check; we want to modify it to be aware that a leading path being a
symlink doesn't count as in the way (much as the existence of a file
on disk corresponding to one of our leading paths doesn't count as in
the way for our purposes).
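
To make the proposed change concrete, here is a rough standalone sketch
of the idea, not the actual merge-recursive code.  The function names
leading_path_has_symlink() and dir_in_way_sketch() are hypothetical,
and the sketch only looks at the working tree with lstat(); it ignores
the index and the other conditions the real dir_in_way() takes into
account.

```c
#define _XOPEN_SOURCE 700	/* expose lstat() and S_ISLNK under strict modes */
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if any leading directory component of
 * `path` (every prefix that ends just before a '/') is a symlink on disk.
 */
static int leading_path_has_symlink(const char *path)
{
	char prefix[PATH_MAX];
	struct stat st;
	const char *slash;

	assert(path != NULL);
	for (slash = strchr(path, '/'); slash; slash = strchr(slash + 1, '/')) {
		size_t len = (size_t)(slash - path);

		if (len == 0)
			continue;	/* the leading '/' of an absolute path */
		if (len >= sizeof(prefix))
			return 0;	/* too long to inspect; assume no symlink */
		memcpy(prefix, path, len);
		prefix[len] = '\0';
		if (lstat(prefix, &st) == 0 && S_ISLNK(st.st_mode))
			return 1;
	}
	return 0;
}

/*
 * Sketch of the adjusted check: a directory at `path` only counts as
 * "in the way" if we did not have to go through a symlink to see it.
 */
static int dir_in_way_sketch(const char *path)
{
	struct stat st;

	if (leading_path_has_symlink(path))
		return 0;
	return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

With "foo" a symlink to a directory that contains "bar/", a plain
lstat("foo/bar") reports a directory, but dir_in_way_sketch("foo/bar")
returns 0 because the leading component "foo" is a symlink.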

Thread overview: 13+ messages
2019-09-16 21:47 merge-recursive thinks symlink's child dirs are "real" Jonathan Tan
2019-09-16 22:15 ` SZEDER Gábor
2019-09-17 15:54   ` Elijah Newren
2019-09-17  0:09 ` Jonathan Nieder
2019-09-17 16:02   ` Elijah Newren [this message]
2019-09-17 15:48 ` Elijah Newren
2019-09-17 21:50   ` [RFC PATCH] merge-recursive: symlink's descendants not in way Jonathan Tan
2019-09-17 22:23     ` Junio C Hamano
2019-09-17 22:32       ` Jonathan Tan
2019-09-17 22:37         ` Junio C Hamano
2019-09-17 22:49           ` Jonathan Tan
2019-09-17 23:02     ` SZEDER Gábor
2019-09-18  0:35     ` Elijah Newren
