git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Worktree creation race
@ 2019-01-15 14:03 Marketa Calabkova
  2019-01-28 12:58 ` Marketa Calabkova
  0 siblings, 1 reply; 7+ messages in thread
From: Marketa Calabkova @ 2019-01-15 14:03 UTC (permalink / raw)
  To: git


[-- Attachment #1.1: Type: text/plain, Size: 1322 bytes --]

Hello,

I am writing to report a bug. The original report is from my colleague, I am also providing his suggestions.

There is insufficient locking for worktree addition. Adding worktree may fail.

The problem is that git reads the directory entries in $GIT_DIR/worktrees, 
finds a worktree name that does not exist, tries to create it, and if an 
error is returned adding the worktree fails. When multiple git processes 
do this in parallel only one adds a worktree and the others fail. Git should 
reread the directory and find a new name that does not exist when creating 
the worktree directory fails because another git process already created it.

I suppose adding PID in the tree name would mitigate the issue to the point it will be very unlikely to encounter.

I need more than the tree in the temporary directory so using the temporary directory directly as a tree is out of question.

to test:

cd /dev/shm
mkdir gittest
cd gittest
git init gitrepo
cd gitrepo
git commit --allow-empty -m Empty
for n in $(seq 10000) ; do ( tmp=$(mktemp -d /dev/shm/gittest/test.XXXXXXXXXXX) ; mkdir $tmp/test ; git worktree add --detach $tmp/test ; ) & done

(you should see many messages like: 
fatal: could not create directory of '.git/worktrees/test284': File exists)

Greetings,
Marketa



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-01-15 14:03 Worktree creation race Marketa Calabkova
@ 2019-01-28 12:58 ` Marketa Calabkova
  2019-02-01  6:27   ` Eric Sunshine
  0 siblings, 1 reply; 7+ messages in thread
From: Marketa Calabkova @ 2019-01-28 12:58 UTC (permalink / raw)
  To: git


[-- Attachment #1.1: Type: text/plain, Size: 1704 bytes --]

Hi,

you have probably overseen my email :) . Please, I would like to get it
fixed.

Does anyone has a suggestion what to do with this bug? It looks like a
one-line fix probably in builtin/worktree.c, but I have no idea how to
do it. Sorry.

Thank you,
Marketa

On 15/01/2019 15:03, Marketa Calabkova wrote:
> Hello,
>
> I am writing to report a bug. The original report is from my colleague, I am also providing his suggestions.
>
> There is insufficient locking for worktree addition. Adding worktree may fail.
>
> The problem is that git reads the directory entries in $GIT_DIR/worktrees, 
> finds a worktree name that does not exist, tries to create it, and if an 
> error is returned adding the worktree fails. When multiple git processes 
> do this in parallel only one adds a worktree and the others fail. Git should 
> reread the directory and find a new name that does not exist when creating 
> the worktree directory fails because another git process already created it.
>
> I suppose adding PID in the tree name would mitigate the issue to the point it will be very unlikely to encounter.
>
> I need more than the tree in the temporary directory so using the temporary directory directly as a tree is out of question.
>
> to test:
>
> cd /dev/shm
> mkdir gittest
> cd gittest
> git init gitrepo
> cd gitrepo
> git commit --allow-empty -m Empty
> for n in $(seq 10000) ; do ( tmp=$(mktemp -d /dev/shm/gittest/test.XXXXXXXXXXX) ; mkdir $tmp/test ; git worktree add --detach $tmp/test ; ) & done
>
> (you should see many messages like: 
> fatal: could not create directory of '.git/worktrees/test284': File exists)
>
> Greetings,
> Marketa
>
>



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-01-28 12:58 ` Marketa Calabkova
@ 2019-02-01  6:27   ` Eric Sunshine
  2019-02-01  7:06     ` Duy Nguyen
  2019-02-01 18:06     ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Sunshine @ 2019-02-01  6:27 UTC (permalink / raw)
  To: Marketa Calabkova; +Cc: Git List, Nguyễn Thái Ngọc Duy

On Mon, Jan 28, 2019 at 7:58 AM Marketa Calabkova <mcalabkova@suse.cz> wrote:
> On 15/01/2019 15:03, Marketa Calabkova wrote:
> > I am writing to report a bug. The original report is from my colleague, I am also providing his suggestions.
> >
> > There is insufficient locking for worktree addition. Adding worktree may fail.
> >
> > The problem is that git reads the directory entries in $GIT_DIR/worktrees,
> > finds a worktree name that does not exist, tries to create it, and if an
> > error is returned adding the worktree fails. When multiple git processes
> > do this in parallel only one adds a worktree and the others fail. Git should
> > reread the directory and find a new name that does not exist when creating
> > the worktree directory fails because another git process already created it.
> >
> > I suppose adding PID in the tree name would mitigate the issue to the point it will be very unlikely to encounter.
> >
> > I need more than the tree in the temporary directory so using the temporary directory directly as a tree is out of question.
> >
> > cd gitrepo
> > git commit --allow-empty -m Empty
> > for n in $(seq 10000) ; do ( tmp=$(mktemp -d /dev/shm/gittest/test.XXXXXXXXXXX) ; mkdir $tmp/test ; git worktree add --detach $tmp/test ; ) & done
> >
> > (you should see many messages like:
> > fatal: could not create directory of '.git/worktrees/test284': File exists)
> >
> Does anyone has a suggestion what to do with this bug? It looks like a
> one-line fix probably in builtin/worktree.c, but I have no idea how to
> do it. Sorry.

I doubt this is a one-line fix, and I don't think it has anything to
do with reading entries in $GIT_DIR/worktrees.

add_worktree() already attempts to give a unique identifier to each
worktree by adding a numeric suffix and incrementing that suffix if
the name already exists (such as the 284 in your example error
message) but there is definitely a race-condition between the time it
stat()s the name and the time it mkdir()s it.

One possible fix would be to unconditionally use the PID, as you
suggest, though, this is not necessarily foolproof against races
either (though it makes collisions very unlikely).

Another possibility would be to skip the stat() and instead do the
mkdir() in a loop, incrementing the sequence number each time through
the loop. That should eliminate the race entirely (I think).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-02-01  6:27   ` Eric Sunshine
@ 2019-02-01  7:06     ` Duy Nguyen
  2019-02-01 13:16       ` Duy Nguyen
  2019-02-01 18:06     ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Duy Nguyen @ 2019-02-01  7:06 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Marketa Calabkova, Git List

Thanks for including me. Apparently I did miss some emails :)

On Fri, Feb 1, 2019 at 1:27 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Mon, Jan 28, 2019 at 7:58 AM Marketa Calabkova <mcalabkova@suse.cz> wrote:
> > On 15/01/2019 15:03, Marketa Calabkova wrote:
> > > I am writing to report a bug. The original report is from my colleague, I am also providing his suggestions.
> > >
> > > There is insufficient locking for worktree addition. Adding worktree may fail.
> > >
> > > The problem is that git reads the directory entries in $GIT_DIR/worktrees,
> > > finds a worktree name that does not exist, tries to create it, and if an
> > > error is returned adding the worktree fails. When multiple git processes
> > > do this in parallel only one adds a worktree and the others fail. Git should
> > > reread the directory and find a new name that does not exist when creating
> > > the worktree directory fails because another git process already created it.
> > >
> > > I suppose adding PID in the tree name would mitigate the issue to the point it will be very unlikely to encounter.
> > >
> > > I need more than the tree in the temporary directory so using the temporary directory directly as a tree is out of question.
> > >
> > > cd gitrepo
> > > git commit --allow-empty -m Empty
> > > for n in $(seq 10000) ; do ( tmp=$(mktemp -d /dev/shm/gittest/test.XXXXXXXXXXX) ; mkdir $tmp/test ; git worktree add --detach $tmp/test ; ) & done
> > >
> > > (you should see many messages like:
> > > fatal: could not create directory of '.git/worktrees/test284': File exists)
> > >
> > Does anyone has a suggestion what to do with this bug? It looks like a
> > one-line fix probably in builtin/worktree.c, but I have no idea how to
> > do it. Sorry.
>
> I doubt this is a one-line fix, and I don't think it has anything to
> do with reading entries in $GIT_DIR/worktrees.

I never thought people would create worktrees "like crazy" to end up
worrying about races like this. The mkdir loop would be one way to go.
But I'm going to add a new option to let the user control this
directory name. This is necessary since this name is now exposed via
"worktrees/<name>" reference space and should also be reported in "git
worktree list". Avoiding the race is a nice bonus.

> add_worktree() already attempts to give a unique identifier to each
> worktree by adding a numeric suffix and incrementing that suffix if
> the name already exists (such as the 284 in your example error
> message) but there is definitely a race-condition between the time it
> stat()s the name and the time it mkdir()s it.
>
> One possible fix would be to unconditionally use the PID, as you
> suggest, though, this is not necessarily foolproof against races
> either (though it makes collisions very unlikely).
>
> Another possibility would be to skip the stat() and instead do the
> mkdir() in a loop, incrementing the sequence number each time through
> the loop. That should eliminate the race entirely (I think).
-- 
Duy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-02-01  7:06     ` Duy Nguyen
@ 2019-02-01 13:16       ` Duy Nguyen
  2019-02-01 19:54         ` Eric Sunshine
  0 siblings, 1 reply; 7+ messages in thread
From: Duy Nguyen @ 2019-02-01 13:16 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Marketa Calabkova, Git List

On Fri, Feb 01, 2019 at 02:06:43PM +0700, Duy Nguyen wrote:
> worrying about races like this. The mkdir loop would be one way to go.
> But I'm going to add a new option to let the user control this
> directory name. This is necessary since this name is now exposed via
> "worktrees/<name>" reference space and should also be reported in "git
> worktree list". Avoiding the race is a nice bonus.

I'm not going to bother you with code yet (although if you want, you
can check out branch worktree-name on my gitlab repo), but this is
what the user facing changes look like. Looking good?

PS. I think this also calls for a command to rename working trees.
Sigh.. this worktree thingy never ends.

-- 8< --
diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 18469e202e..d5db69dec7 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -9,7 +9,7 @@ git-worktree - Manage multiple working trees
 SYNOPSIS
 --------
 [verse]
-'git worktree add' [-f] [--detach] [--checkout] [--lock] [-b <new-branch>] <path> [<commit-ish>]
+'git worktree add' [-f] [--detach] [--checkout] [--lock] [-b <new-branch>] [--name <name>] <path> [<commit-ish>]
 'git worktree list' [--porcelain]
 'git worktree lock' [--reason <string>] <worktree>
 'git worktree move' <worktree> <new-path>
@@ -195,6 +195,13 @@ This can also be set up as the default behaviour by using the
 --reason <string>::
 	With `lock`, an explanation why the working tree is locked.
 
+--name <name>::
+	Name of the working tree. Each working tree must have a unique
+	name. This is also the directory name containing all working
+	tree's specific information under `$GIT_COMMON_DIR/worktrees`.
+	If `--name` is not given, the name is based on basename(3)
+	optionally with a number suffix to make it unique.
+
 <worktree>::
 	Working trees can be identified by path, either relative or
 	absolute.
@@ -323,11 +330,15 @@ details on a single line with columns.  For example:
 
 ------------
 $ git worktree list
-/path/to/bare-source            (bare)
-/path/to/linked-worktree        abcd1234 [master]
-/path/to/other-linked-worktree  1234abc  (detached HEAD)
+(main) /path/to/bare-source            (bare)
+linked /path/to/linked-worktree        abcd1234 [master]
+other  /path/to/other-linked-worktree  1234abc  (detached HEAD)
 ------------
 
+The first column is the name of the working tree. The second column is
+the location of the working tree. The third column is the short commit
+name of the current branch. The branch name is put in square brackets.
+
 Porcelain Format
 ~~~~~~~~~~~~~~~~
 The porcelain format has a line per attribute.  Attributes are listed with a
@@ -341,10 +352,12 @@ $ git worktree list --porcelain
 worktree /path/to/bare-source
 bare
 
+name linked
 worktree /path/to/linked-worktree
 HEAD abcd1234abcd1234abcd1234abcd1234abcd1234
 branch refs/heads/master
 
+name other
 worktree /path/to/other-linked-worktree
 HEAD 1234abc1234abc1234abc1234abc1234abc1234a
 detached
-- 8< --
--
Duy

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-02-01  6:27   ` Eric Sunshine
  2019-02-01  7:06     ` Duy Nguyen
@ 2019-02-01 18:06     ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2019-02-01 18:06 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Marketa Calabkova, Git List,
	Nguyễn Thái Ngọc Duy

Eric Sunshine <sunshine@sunshineco.com> writes:

> Another possibility would be to skip the stat() and instead do the
> mkdir() in a loop, incrementing the sequence number each time through
> the loop. That should eliminate the race entirely (I think).

Yeah, I think that is the only sensible option to rely on the
atomicity of mkdir() call.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Worktree creation race
  2019-02-01 13:16       ` Duy Nguyen
@ 2019-02-01 19:54         ` Eric Sunshine
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2019-02-01 19:54 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Marketa Calabkova, Git List

On Fri, Feb 1, 2019 at 8:17 AM Duy Nguyen <pclouds@gmail.com> wrote:
> I'm not going to bother you with code yet (although if you want, you
> can check out branch worktree-name on my gitlab repo), but this is
> what the user facing changes look like. Looking good?
>
> -- 8< --
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> @@ -9,7 +9,7 @@ git-worktree - Manage multiple working trees
>  SYNOPSIS
> +'git worktree add' [-f] [--detach] [--checkout] [--lock] [-b <new-branch>] [--name <name>] <path> [<commit-ish>]
> @@ -195,6 +195,13 @@ This can also be set up as the default behaviour by using the
> +--name <name>::
> +       Name of the working tree. Each working tree must have a unique
> +       name. This is also the directory name containing all working
> +       tree's specific information under `$GIT_COMMON_DIR/worktrees`.
> +       If `--name` is not given, the name is based on basename(3)
> +       optionally with a number suffix to make it unique.

I think the "DETAILS" section also needs to be updated. It presently says:

    The private sub-directory’s name is usually the base name of the
    linked working tree’s path, possibly appended with a number to
    make it unique.

Perhaps amend it to:

    ...to make it unique, or the name provided by `--name`.

>  Porcelain Format
>  ~~~~~~~~~~~~~~~~
>  The porcelain format has a line per attribute.  Attributes are listed with a
> @@ -341,10 +352,12 @@ $ git worktree list --porcelain
>  worktree /path/to/bare-source
>  bare
>
> +name linked
>  worktree /path/to/linked-worktree
>  HEAD abcd1234abcd1234abcd1234abcd1234abcd1234
>  branch refs/heads/master
>
> +name other
>  worktree /path/to/other-linked-worktree
>  HEAD 1234abc1234abc1234abc1234abc1234abc1234a
>  detached

Unfortunately, this will likely break existing tools. When I had
outlined the proposed porcelain format, my suggestion was that a
"worktree" line would indicate the start of a new stanza (with no
blank line), however, as it eventually got implemented (not by me),
that recommendation wasn't quite followed. Instead, what has been
documented "officially" in the git-worktree man-page is that each
stanza will start with a "worktree" line and that there will be a
blank line between stanzas. So, the new "name" line will need to be
placed somewhere after the "worktree" line in order to avoid tool
breakage.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-02-01 19:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-15 14:03 Worktree creation race Marketa Calabkova
2019-01-28 12:58 ` Marketa Calabkova
2019-02-01  6:27   ` Eric Sunshine
2019-02-01  7:06     ` Duy Nguyen
2019-02-01 13:16       ` Duy Nguyen
2019-02-01 19:54         ` Eric Sunshine
2019-02-01 18:06     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).