git@vger.kernel.org mailing list mirror (one of many)
From: Jan Harkes <jaharkes@cs.cmu.edu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Careful object writing..
Date: Tue, 3 May 2005 16:59:57 -0400
Message-ID: <20050503205957.GA25253@delft.aura.cs.cmu.edu>
In-Reply-To: <Pine.LNX.4.58.0505031306310.26698@ppc970.osdl.org>

On Tue, May 03, 2005 at 01:11:47PM -0700, Linus Torvalds wrote:
> On Tue, 3 May 2005, Jan Harkes wrote:
> > I tried to pull in the latest version of your tree, but it doesn't look
> > like this commit has propagated to rsync.kernel.org yet. Hopefully you
> > will accept a small patch (should be < 5 lines) that makes git work
> > nicely when Coda complains about the cross-directory hardlink without
> > affecting the reliability of using link/unlink on normal filesystems.
> 
> What is it that coda wants to do, and is there some portable way to get 
> there? 

Short summary:

    rc = link(old, new);
    if (rc == -1 && errno == EXDEV)
        rc = rename(old, new);

On Coda, the cross-directory link fails, but the following
cross-directory rename will work fine.  On a normal filesystem, if the
link fails with EXDEV, the rename will fail with the same error.
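
To make the fallback concrete, here is a minimal sketch of how it could
be wrapped in a helper. The name link_or_rename() and the unlink() of
the temporary name after a successful link are assumptions for
illustration, not code taken from git.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and exact policy are assumptions):
 * move a finished temporary file "old" into place as "new".
 * Returns 0 on success, -1 with errno set on failure.
 */
int link_or_rename(const char *old, const char *new)
{
    if (link(old, new) == 0) {
        /* Normal filesystem: the link worked, drop the temporary name. */
        unlink(old);
        return 0;
    }
    if (errno == EXDEV) {
        /*
         * The cross-directory link was refused (as on Coda), so try a
         * rename.  Across real devices this fails with EXDEV as well.
         */
        return rename(old, new);
    }
    /* Any other error, including EEXIST, is left to the caller. */
    return -1;
}
```

Because only EXDEV triggers the rename, an existing target is still
reported as EEXIST by link() and is never silently replaced. What the
caller does with EEXIST is a separate decision.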

Because our cache consistency model is fairly optimistic, we already
have to deal with potential problems with a rename removing an unwanted
target. So if we are logging write operations, and the link operation
did not return EEXIST, then the rename will be marked as not having
removed any target file. If the target did happen to exist on the server
by the time we reintegrate the operations we end up with a reintegration
conflict.
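
As a toy model of that rule (the types and names below are made up for
illustration and are not Coda's actual data structures), the logged
rename carries a flag saying whether it removed a target, and
reintegration compares that flag with what the server sees:

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not Coda's real structures. */
struct logged_rename {
    bool removed_target;    /* did the client-side rename replace a file? */
};

enum reint_result { REINT_OK, REINT_CONFLICT };

/*
 * Replay a logged rename against the server.  If the log says no target
 * was removed but the server now has one, the operation conflicts.
 */
enum reint_result reintegrate(const struct logged_rename *op,
                              bool target_exists_on_server)
{
    if (!op->removed_target && target_exists_on_server)
        return REINT_CONFLICT;
    return REINT_OK;
}
```

So a rename logged after a link that failed with EXDEV has
removed_target == false, and replaying it against an existing target
yields a conflict rather than an overwrite.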


Longer version:

When a server performs conflict resolution it happens on a per-directory
basis. So any cross-directory operation is already a special case.

We cannot guarantee which directory will be resolved first, so if it is
the destination of a link or rename the object itself might not exist
yet. The advantage of a rename operation is that it contains a reference
to both the source and the destination directories. If we don't yet know
the renamed object we resolve the source first. That creates the object
and allows us to complete the rename operation.

However with a link we only have a reference to an object and the
directory where the link should be added. But again, the object might
not yet exist on all servers. At this point things get a bit more
complicated because we don't enforce access based on per-object UNIX
mode bits, but rely on directory ACLs. So we can't just add a reference
to an unknown object in the destination directory because until we know
where else this object is located, we can't tell if the user actually is
allowed to access the object.

> Is it just that you want to stay within the directory? Or is it any link 
> action that is nasty?

We do allow links within the same directory, mostly because that often
happens in places like /usr/bin and we know that whenever we encounter
the link operation in the resolution log, the object creation has
already been processed. We also know that the new link can't give a user
any rights he didn't already have.

> What makes resolving renames hard when the file contents are the same? 

Renames mostly work; there are only a few corner cases left. One is
where something is moved up in the directory tree and the source
directory is then removed. We end up screwing ourselves because we
append the child's logs on removal and the create operation ends up
behind the rename operation. But that is a dumb implementation problem.
Another issue is of course when someone (validly) hardlinks a file in
the same directory and then moves one of the links to another directory.

We definitely are not a typical filesystem with UNIX semantics, which is
why it is unusual to find an application that seems so well suited for
disconnected and weakly connected operation.

Jan

