git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?
@ 2017-09-15  6:18 Wesley Smith
  2017-09-15  6:50 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Wesley Smith @ 2017-09-15  6:18 UTC (permalink / raw)
  To: git@vger.kernel.org

Using git 2.14.1 for Windows

I'm seeing an issue with the follow sequence of commands:


git init D:\XXX\workspace
git fetch --no-tags --progress https://XXX/_git/PAPI +refs/heads/*:refs/remotes/origin/* --depth=20
git fetch --no-tags --progress https://XXX/_git/PAPI +refs/heads/*:refs/remotes/origin/* --depth=20
git fetch --no-tags --progress https://XXX/_git/PAPI +refs/heads/*:refs/remotes/origin/* --depth=20

The third "git fetch" command hangs forever and takes 100% of the CPU.

I've debugged this a bit, and what I've found is that after the first fetch, the .git/objects/pack directory contains 2 files:

pack-b64910484b4254836a6413ce6a94019278fc54c5.pack
pack-b64910484b4254836a6413ce6a94019278fc54c5.idx


After the second fetch, the directory contains 4 files:

pack-b64910484b4254836a6413ce6a94019278fc54c5.pack
pack-b64910484b4254836a6413ce6a94019278fc54c5.idx
pack-ae983dc9c8057f4d5d2c8cdc3485cb6badde864b.pack
pack-ae983dc9c8057f4d5d2c8cdc3485cb6badde864b.idx

When the third "git fetch" is run, it spawns this chain of commands:

git fetch --no-tags --progress https://XXX /_git/PAPI +refs/heads/*:refs/remotes/origin/* --depth=20
  git remote-https https://XXX/_git/PAPI https://XXX/_git/PAPI
    git-remote-https https://XXX/_git/PAPI https://XXX/_git/PAPI
      git fetch-pack --stateless-rpc --stdin --lock-pack --thin --depth=20 https://XXX/_git/PAPI/
         git --shallow-file D:/XXX/workspace/.git/shallow.lock index-pack --stdin -v --fix-thin "--keep=fetch-pack 15728 on DT0004" --pack_header=2,3425

It's the final of these git instances  (the --shallow-file one) that's hanging.

Upon debugging this "git --shallow-file" process, the problem is as follows:  (line numbers relative to https://github.com/git/git/blob/master/sha1_file.c)

In sha1_file.c,  finalize_object_file is called with a tmpfile value of "tmp_pack_AmXsya" and a filename of "pack-ae983dc9c8057f4d5d2c8cdc3485cb6badde864b.pack". Note that this filename already exists (it was created by the second fetch).  On line 3236, the condition (object_creation_mode == OBJECT_CREATION_USES_RENAMES) is true on Windows, so the code runs the goto try_rename.

On line 1378,  rename is called, which on Windows is defined as a specialized function called mingw_rename.  I've identified a bug in this Windows-specific mingw_rename function that causes an infinite loop if the new filename (pack-ae983dc9c8057f4d5d2c8cdc3485cb6badde864b.pack) already exists, _and_ is locked by another process.  In this case, it appears that the first "git fetch" call in the process chain has opened the pack file, which is why this process can't rename the temp file to that name.

I can fix the infinite loop in the mingw_rename function, but the question is what errno should be returned by mingw_rename, and that brings me to my question regarding the finalize_object_file function.

On UNIX-style OSes, the code would first try to perform a "link" call in line 1380.  According to my reading of the link(2) man page, I think (but haven't tested) that link call would return EEXIST in this case (the newpath already exists).  If link returns EEXIST, then the code will skip most of the rest of the code in finalize_object_file, and will return 0 (success) on line 1411.  However, on systems where object_creation_mode is OBJECT_CREATION_USES_RENAMES, then the code will call "rename" instead on line 1396.   According to my reading of the rename(2) man page,  EACCES would  be returned in this case (because the pack file is locked by another process).  Notably, EEXIST would _not_ be returned from rename, as rename only returns EEXIST if "newpath is a nonempty directory".   Since finalize_object_file doesn't have any special logic for EACCES, if I fixed the Windows version of the rename function to return the correct errno (EACCES), then the finalize_object_file will return the error "unable to write sha1 filename" on 1403 and that will cause the program to die.

My questions:

1) This bug is triggered because "git fetch" is causing a pack file to be written when that same pack file already exists.  It seems like this is harmless and shouldn't cause a problem.  Is that correct?
2) It seems that finalize_object_file is not accounting for the fact that "link" will return EEXIST if the destination file already exists but is not writeable, whereas "rename" will return EACCESS in this case.  Is that correct?  If so, should finalize_object_file be fixed to account for this? Perhaps it should check if the newfile exists before calling rename.  Or, should the Windows mingw_rename function be modified to return EEXIST in this case, even though that's not the standard errno for that situation?

Thanks for your help,

Wesley Smith

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?
  2017-09-15  6:18 Is finalize_object_file in sha1_file.c handling errno from "rename" correctly? Wesley Smith
@ 2017-09-15  6:50 ` Junio C Hamano
  2017-09-15 18:18   ` Wesley Smith
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2017-09-15  6:50 UTC (permalink / raw)
  To: Wesley Smith; +Cc: git@vger.kernel.org

Wesley Smith <wsmith@greatergiving.com> writes:

> 1) This bug is triggered because "git fetch" is causing a pack
> file to be written when that same pack file already exists.  It
> seems like this is harmless and shouldn't cause a problem.  Is
> that correct?

The final name of the packfile is derived from the entire contents
of the packfile; it should be harmless when we attempt to rename a
new file, which has exactly the same contents as an existing file,
to the existing file and see a failure out of that attempt.

> 2) It seems that finalize_object_file is not accounting for the
> fact that "link" will return EEXIST if the destination file
> already exists but is not writeable, whereas "rename" will return
> EACCESS in this case.  Is that correct?  If so, should
> finalize_object_file be fixed to account for this? Perhaps it
> should check if the newfile exists before calling rename.  Or,
> should the Windows mingw_rename function be modified to return
> EEXIST in this case, even though that's not the standard errno for
> that situation?

The codepath that is triggered by OBJECT_CREATION_USES_RENAMES ought
to behave correctly even on non-Windows platforms, so bending the
error code of rename() only on Windows to fit the existing error
handling would not be a smart thing to do.  Rather, the rename()
emulation should leave a correct errno and the caller should be
updated to be aware of that error that is not EEXIST, which it
currently knows about.

Thanks for spotting a problem and digging to its cause.

This is a #leftoverbits tangent, and should be done separately from
your "OBJECT_CREATION_USES_RENAMES is broken" topic, but I think it
is a bug to use finalize_object_file() directly to "finalize"
anything but an individual loose object file in the first place.

We should create a new shared helper that does what the function
currently does, make finalize_object_file() call that new shared
helper, and make sure finalize_object_file() is called only on a
newly created loose object file.  The codepath that creates a new
packfile and other things and moves them to the final name should
not call finalize_object_file() but the new shared helper instead.

That way, we could later implement the "collision? check" alluded by
the in-code comment in finailize_object_file(), and we won't have to
worry about affecting callers other than the one that creates a
loose object file with such an enhancement.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?
  2017-09-15  6:50 ` Junio C Hamano
@ 2017-09-15 18:18   ` Wesley Smith
  2017-09-29 23:21     ` Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: Wesley Smith @ 2017-09-15 18:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

Junio,

Thanks for your response.  I'm glad to see that you've been able to understand the problem.  I'm working with the Windows git team to properly return EACCESS when "rename" fails due to access permissions, but it also sounds like there will need to be a fix to finalize_object_file to better handle the case of an existing file when renaming.

Wesley Smith
T: 503.597.0556 | WSMITH@GREATERGIVING.COM

-----Original Message-----
From: Junio C Hamano [mailto:gitster@pobox.com] 
Sent: Thursday, September 14, 2017 11:51 PM
To: Wesley Smith
Cc: git@vger.kernel.org
Subject: Re: Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?

Wesley Smith <wsmith@greatergiving.com> writes:

> 1) This bug is triggered because "git fetch" is causing a pack file to 
> be written when that same pack file already exists.  It seems like 
> this is harmless and shouldn't cause a problem.  Is that correct?

The final name of the packfile is derived from the entire contents of the packfile; it should be harmless when we attempt to rename a new file, which has exactly the same contents as an existing file, to the existing file and see a failure out of that attempt.

> 2) It seems that finalize_object_file is not accounting for the fact 
> that "link" will return EEXIST if the destination file already exists 
> but is not writeable, whereas "rename" will return EACCESS in this 
> case.  Is that correct?  If so, should finalize_object_file be fixed 
> to account for this? Perhaps it should check if the newfile exists 
> before calling rename.  Or, should the Windows mingw_rename function 
> be modified to return EEXIST in this case, even though that's not the 
> standard errno for that situation?

The codepath that is triggered by OBJECT_CREATION_USES_RENAMES ought to behave correctly even on non-Windows platforms, so bending the error code of rename() only on Windows to fit the existing error handling would not be a smart thing to do.  Rather, the rename() emulation should leave a correct errno and the caller should be updated to be aware of that error that is not EEXIST, which it currently knows about.

Thanks for spotting a problem and digging to its cause.

This is a #leftoverbits tangent, and should be done separately from your "OBJECT_CREATION_USES_RENAMES is broken" topic, but I think it is a bug to use finalize_object_file() directly to "finalize"
anything but an individual loose object file in the first place.

We should create a new shared helper that does what the function currently does, make finalize_object_file() call that new shared helper, and make sure finalize_object_file() is called only on a newly created loose object file.  The codepath that creates a new packfile and other things and moves them to the final name should not call finalize_object_file() but the new shared helper instead.

That way, we could later implement the "collision? check" alluded by the in-code comment in finailize_object_file(), and we won't have to worry about affecting callers other than the one that creates a loose object file with such an enhancement.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?
  2017-09-15 18:18   ` Wesley Smith
@ 2017-09-29 23:21     ` Johannes Schindelin
  0 siblings, 0 replies; 4+ messages in thread
From: Johannes Schindelin @ 2017-09-29 23:21 UTC (permalink / raw)
  To: Wesley Smith; +Cc: Junio C Hamano, git@vger.kernel.org

Hi,

On Fri, 15 Sep 2017, Wesley Smith wrote:

> Thanks for your response.  I'm glad to see that you've been able to understand the problem.  I'm working with the Windows git team to properly return EACCESS when "rename" fails due to access permissions, but it also sounds like there will need to be a fix to finalize_object_file to better handle the case of an existing file when renaming.

The Windows part of the problem was fixed in v2.14.2.

Ciao,
Johannes

[ leaving the rest of the quoted mail intact intentionally, to give
readers a chance to read up on the context ]

> Wesley Smith
> T: 503.597.0556 | WSMITH@GREATERGIVING.COM
> 
> -----Original Message-----
> From: Junio C Hamano [mailto:gitster@pobox.com] 
> Sent: Thursday, September 14, 2017 11:51 PM
> To: Wesley Smith
> Cc: git@vger.kernel.org
> Subject: Re: Is finalize_object_file in sha1_file.c handling errno from "rename" correctly?
> 
> Wesley Smith <wsmith@greatergiving.com> writes:
> 
> > 1) This bug is triggered because "git fetch" is causing a pack file to 
> > be written when that same pack file already exists.  It seems like 
> > this is harmless and shouldn't cause a problem.  Is that correct?
> 
> The final name of the packfile is derived from the entire contents of the packfile; it should be harmless when we attempt to rename a new file, which has exactly the same contents as an existing file, to the existing file and see a failure out of that attempt.
> 
> > 2) It seems that finalize_object_file is not accounting for the fact 
> > that "link" will return EEXIST if the destination file already exists 
> > but is not writeable, whereas "rename" will return EACCESS in this 
> > case.  Is that correct?  If so, should finalize_object_file be fixed 
> > to account for this? Perhaps it should check if the newfile exists 
> > before calling rename.  Or, should the Windows mingw_rename function 
> > be modified to return EEXIST in this case, even though that's not the 
> > standard errno for that situation?
> 
> The codepath that is triggered by OBJECT_CREATION_USES_RENAMES ought to behave correctly even on non-Windows platforms, so bending the error code of rename() only on Windows to fit the existing error handling would not be a smart thing to do.  Rather, the rename() emulation should leave a correct errno and the caller should be updated to be aware of that error that is not EEXIST, which it currently knows about.
> 
> Thanks for spotting a problem and digging to its cause.
> 
> This is a #leftoverbits tangent, and should be done separately from your "OBJECT_CREATION_USES_RENAMES is broken" topic, but I think it is a bug to use finalize_object_file() directly to "finalize"
> anything but an individual loose object file in the first place.
> 
> We should create a new shared helper that does what the function currently does, make finalize_object_file() call that new shared helper, and make sure finalize_object_file() is called only on a newly created loose object file.  The codepath that creates a new packfile and other things and moves them to the final name should not call finalize_object_file() but the new shared helper instead.
> 
> That way, we could later implement the "collision? check" alluded by the in-code comment in finailize_object_file(), and we won't have to worry about affecting callers other than the one that creates a loose object file with such an enhancement.
> 
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-09-29 23:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-15  6:18 Is finalize_object_file in sha1_file.c handling errno from "rename" correctly? Wesley Smith
2017-09-15  6:50 ` Junio C Hamano
2017-09-15 18:18   ` Wesley Smith
2017-09-29 23:21     ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).