git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* File locking issues with fetching submodules in parallel
@ 2018-02-20 16:20 Johannes Schindelin
  2018-02-20 20:40 ` Stefan Beller
  0 siblings, 1 reply; 3+ messages in thread
From: Johannes Schindelin @ 2018-02-20 16:20 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Hi Stefan (and other submodule wizards),

Igor Melnichenko reported a submodule problem in a Git for Windows ticket:

	https://github.com/git-for-windows/git/issues/1469

Part of it is due to Windows' behavior where you cannot read the same file
in one process while you write it in another process.

The other part is how our submodule code works in parallel. In particular,
we seem to write the `.git` file maybe even every time a submodule is
fetched, unconditionally, not even testing whether the .git file is
already there and contains the correct contents?

For some reason, the bug reporter saw a "Permission denied" on the `.git`
file when we try to write it (and I am pretty certain that I tracked it
down correctly to the `connect_work_tree_and_git_dir()` function). The
intermittent "Permission denied" error seems to suggest that *another*
process is accessing this file while we are writing it.

It also seems that this problem becomes worse if the firewall is turned
on, in which case a couple of network operations become a little slower
(which I could imagine to be the factor making the problems more likely).

A plausible workaround would be to write the `.git` file only when needed
(which also would fix a bug on macOS/Linux, although the window is
substantially smaller: the reader could potentially read a
partially-written file).

But maybe we are simply holding onto an open handle to the `.git` file too
long?

I tried to put together a bit more information here:

https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746

Do you think there is an easy solution for this? You're much deeper in the
submodule code than I ever want to be...

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: File locking issues with fetching submodules in parallel
  2018-02-20 16:20 File locking issues with fetching submodules in parallel Johannes Schindelin
@ 2018-02-20 20:40 ` Stefan Beller
  2018-02-20 22:27   ` Johannes Schindelin
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Beller @ 2018-02-20 20:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Tue, Feb 20, 2018 at 8:20 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Stefan (and other submodule wizards),
>
> Igor Melnichenko reported a submodule problem in a Git for Windows ticket:
>
>         https://github.com/git-for-windows/git/issues/1469
>

Thanks for bringing this up. Is it better to discuss it here on at that issue?

> Part of it is due to Windows' behavior where you cannot read the same file
> in one process while you write it in another process.
>
> The other part is how our submodule code works in parallel. In particular,
> we seem to write the `.git` file maybe even every time a submodule is
> fetched, unconditionally, not even testing whether the .git file is
> already there and contains the correct contents?

only when they need to be newly cloned (as that is when the
submodule--helper clone is invoked). When fetching, we'd not touch
the '.git' file IIRC.

> For some reason, the bug reporter saw a "Permission denied" on the `.git`
> file when we try to write it (and I am pretty certain that I tracked it
> down correctly to the `connect_work_tree_and_git_dir()` function). The
> intermittent "Permission denied" error seems to suggest that *another*
> process is accessing this file while we are writing it.

The reporter also reports this coming out of "git submodule add",
which is not parallelized, but rather in the messy state of migrating
as much code into C from shell, trying to not introduce to many bugs. ;)

So for add there is no parallelism I am aware of, which makes me wonder
how the race is coming in there.

I recall making the decision to unconditionally write out the '.git' file
for other operations like "git reset --hard --recurse-submodules" or
"git checkout --force --recurse-submodules", as there you are really asking
for a clean slate. I missed the Windows FS semantics for that case,
so I guess re-reading the file to make sure it is already correct would
avoid issues on Windows there. (though I do not yet recall how the
parallel stuff would bite us there, as the file is written as the very
first thing)

> It also seems that this problem becomes worse if the firewall is turned
> on, in which case a couple of network operations become a little slower
> (which I could imagine to be the factor making the problems more likely).
>
> A plausible workaround would be to write the `.git` file only when needed
> (which also would fix a bug on macOS/Linux, although the window is
> substantially smaller: the reader could potentially read a
> partially-written file).
>
> But maybe we are simply holding onto an open handle to the `.git` file too
> long?

That could very well be the case.

> I tried to put together a bit more information here:
>
> https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746

Thanks!

> Do you think there is an easy solution for this? You're much deeper in the
> submodule code than I ever want to be...

I'll take a look for open handles, though we're already using our
wrapper functions
like 'write_file', which is supposed to close the handle gracefully (and has
undergone testing on Windows a lot as it is the standard write function)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: File locking issues with fetching submodules in parallel
  2018-02-20 20:40 ` Stefan Beller
@ 2018-02-20 22:27   ` Johannes Schindelin
  0 siblings, 0 replies; 3+ messages in thread
From: Johannes Schindelin @ 2018-02-20 22:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Hi Stefan,

On Tue, 20 Feb 2018, Stefan Beller wrote:

> On Tue, Feb 20, 2018 at 8:20 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > Hi Stefan (and other submodule wizards),
> >
> > Igor Melnichenko reported a submodule problem in a Git for Windows ticket:
> >
> >         https://github.com/git-for-windows/git/issues/1469
> >
> 
> Thanks for bringing this up. Is it better to discuss it here on at that
> issue?

It's up to you. I have no preference.

> > Part of it is due to Windows' behavior where you cannot read the same
> > file in one process while you write it in another process.
> >
> > The other part is how our submodule code works in parallel. In
> > particular, we seem to write the `.git` file maybe even every time a
> > submodule is fetched, unconditionally, not even testing whether the
> > .git file is already there and contains the correct contents?
> 
> only when they need to be newly cloned (as that is when the
> submodule--helper clone is invoked). When fetching, we'd not touch the
> '.git' file IIRC.

Okay. I saw multiple calls to the same connect_work_tree_and_git_dir()
function in submodule.c, and was not so sure that it was only clone.

But the bug report talks about `git submodule add` with a new URL and a
new submodule location, so I guess it is going down the clone path.

> > For some reason, the bug reporter saw a "Permission denied" on the
> > `.git` file when we try to write it (and I am pretty certain that I
> > tracked it down correctly to the `connect_work_tree_and_git_dir()`
> > function). The intermittent "Permission denied" error seems to suggest
> > that *another* process is accessing this file while we are writing it.
> 
> The reporter also reports this coming out of "git submodule add",
> which is not parallelized, but rather in the messy state of migrating
> as much code into C from shell, trying to not introduce to many bugs. ;)
> 
> So for add there is no parallelism I am aware of, which makes me wonder
> how the race is coming in there.

Hrm. You're right, there is no indication in that bug report that `git
submodule` spawns multiple processes in parallel.

> I recall making the decision to unconditionally write out the '.git' file
> for other operations like "git reset --hard --recurse-submodules" or
> "git checkout --force --recurse-submodules", as there you are really asking
> for a clean slate. I missed the Windows FS semantics for that case,
> so I guess re-reading the file to make sure it is already correct would
> avoid issues on Windows there. (though I do not yet recall how the
> parallel stuff would bite us there, as the file is written as the very
> first thing)

Maybe it is not parallel, but maybe it is simply an unclosed handle to the
.git file? I do not see anything in read_gitfile_gently() (which is the
only location reading the .git file I know of)... Did I miss anything?

> > It also seems that this problem becomes worse if the firewall is turned
> > on, in which case a couple of network operations become a little slower
> > (which I could imagine to be the factor making the problems more likely).
> >
> > A plausible workaround would be to write the `.git` file only when needed
> > (which also would fix a bug on macOS/Linux, although the window is
> > substantially smaller: the reader could potentially read a
> > partially-written file).
> >
> > But maybe we are simply holding onto an open handle to the `.git` file
> > too long?
> 
> That could very well be the case.

But as I said above, I fail to find any smoking gun in the source code.
The only code path I see that reads the .git file releases the resources
correctly AFAICT.

> > I tried to put together a bit more information here:
> >
> > https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746
> 
> Thanks!
> 
> > Do you think there is an easy solution for this? You're much deeper in the
> > submodule code than I ever want to be...
> 
> I'll take a look for open handles, though we're already using our
> wrapper functions
> like 'write_file', which is supposed to close the handle gracefully (and has
> undergone testing on Windows a lot as it is the standard write function)

Right, even some of my code uses this successfully...

Curious problem.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-02-20 22:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-20 16:20 File locking issues with fetching submodules in parallel Johannes Schindelin
2018-02-20 20:40 ` Stefan Beller
2018-02-20 22:27   ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).