git@vger.kernel.org mailing list mirror (one of many)
* on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
@ 2016-08-24 20:52 Alex Nauda
  2016-08-24 21:39 ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Nauda @ 2016-08-24 20:52 UTC (permalink / raw)
  To: git

Elastic File System (EFS) is Amazon's scalable filesystem product that
is exposed to the OS as an NFS mount. We're using EFS to host the
filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
to git fetch, we get this error:
$ git -c core.askpass=true fetch --tags --progress \
    git@github.com:mediasilo/dodo.git \
    +refs/pull/*:refs/remotes/origin/pr/*
fatal: Reference directory conflict: refs/heads/
$ echo $?
128

Has anyone seen anything like this before? Any tips on how to troubleshoot it?

Related Jenkins issue: https://issues.jenkins-ci.org/browse/JENKINS-37653


* Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
  2016-08-24 20:52 on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128 Alex Nauda
@ 2016-08-24 21:39 ` Jeff King
  2016-08-25  6:28   ` Michael Haggerty
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2016-08-24 21:39 UTC (permalink / raw)
  To: Alex Nauda; +Cc: Michael Haggerty, git

On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:

> Elastic File System (EFS) is Amazon's scalable filesystem product that
> is exposed to the OS as an NFS mount. We're using EFS to host the
> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
> to git fetch, we get this error:
> $ git -c core.askpass=true fetch --tags --progress
> git@github.com:mediasilo/dodo.git
> +refs/pull/*:refs/remotes/origin/pr/*
> fatal: Reference directory conflict: refs/heads/
> $ echo $? 128
> 
> Has anyone seen anything like this before? Any tips on how to troubleshoot it?

No, I haven't seen it before. That's an internal assertion in the refs
code that shouldn't ever happen. It looks like it happens when the loose
refs end up with duplicate directory entries. While a bug in git is an
obvious culprit, I wonder if it's possible that your filesystem might
expose the same name twice in one set of readdir() results.
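
One way to probe the duplicate-readdir() idea without touching Git is to list the ref directories directly and look for repeated names. The following is a minimal sketch, not anything from Git itself; the function names are made up for illustration, and it assumes a POSIX system, where Python's os.scandir() is built on readdir(). Since the failure is intermittent, a single clean pass proves little, so it would need to be run repeatedly.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return the names that occur more than once, in sorted order."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def scan_for_duplicates(root):
    """Walk `root` and map each directory to the names its listing repeats."""
    problems = {}
    pending = [root]
    while pending:
        directory = pending.pop()
        # Read the whole listing in one pass, as a single readdir() loop would.
        with os.scandir(directory) as it:
            entries = list(it)
        dups = duplicate_names(entry.name for entry in entries)
        if dups:
            problems[directory] = dups
        # Visit each subdirectory once, even if the listing named it twice.
        subdirs = {e.path for e in entries if e.is_dir(follow_symlinks=False)}
        pending.extend(sorted(subdirs))
    return problems
```

Calling scan_for_duplicates(".git/refs") in a loop on the affected mount and printing any non-empty result would show whether the filesystem ever reports the same name twice.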

+cc Michael, who added this assertion long ago (and since this is the
first report in all these years, it does make me suspect that the
filesystem is a critical part of reproducing).

-Peff


* Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
  2016-08-24 21:39 ` Jeff King
@ 2016-08-25  6:28   ` Michael Haggerty
  2016-08-25 16:01     ` Alex Nauda
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Haggerty @ 2016-08-25  6:28 UTC (permalink / raw)
  To: Jeff King, Alex Nauda; +Cc: git

On 08/24/2016 11:39 PM, Jeff King wrote:
> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
> 
>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>> is exposed to the OS as an NFS mount. We're using EFS to host the
>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>> to git fetch, we get this error:
>> $ git -c core.askpass=true fetch --tags --progress
>> git@github.com:mediasilo/dodo.git
>> +refs/pull/*:refs/remotes/origin/pr/*
>> fatal: Reference directory conflict: refs/heads/
>> $ echo $? 128
>>
>> Has anyone seen anything like this before? Any tips on how to troubleshoot it?
> 
> No, I haven't seen it before. That's an internal assertion in the refs
> code that shouldn't ever happen. It looks like it happens when the loose
> refs end up with duplicate directory entries. While a bug in git is an
> obvious culprit, I wonder if it's possible that your filesystem might
> expose the same name twice in one set of readdir() results.
> 
> +cc Michael, who added this assertion long ago (and since this is the
> first report in all these years, it does make me suspect that the
> filesystem is a critical part of reproducing).

Thanks for the CC.

I've never heard of this problem before.

What Git version are you using?

I tried to provoke the problem by hand-corrupting the packed-refs file,
but wasn't successful.

So Peff's suggestion that the problem originates in your filesystem
seems to me to be the most likely cause. A quick Google search found,
for example,

    https://bugzilla.redhat.com/show_bug.cgi?id=739222

    http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs

though these reports seem connected with having lots of files in the
directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
more careful search, and it is easily possible that there are other bugs
in NFS (or EFS) that could be affecting you.

If this were repeatable, you could run Git under strace to test Peff's
hypothesis. But I suppose it only happens rarely, right?
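
If a trace can be captured, checking it by eye for repeated names is tedious, so here is a rough sketch of a script that does it. It assumes the trace was taken with strace's -v option, so that getdents entries are printed in full with d_name fields rather than abbreviated to a comment like "/* 3 entries */". The exact entry layout differs between strace versions, so the script relies only on the d_name="..." fields. The function name and regular expressions are illustrative, and interrupted calls in multi-process (-f) traces are not handled.

```python
import re
from collections import Counter

GETDENTS_RE = re.compile(r'\bgetdents(?:64)?\((\d+),')
NAME_RE = re.compile(r'd_name="([^"]*)"')
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:AT_FDCWD, )?"([^"]*)".*\)\s+=\s+(\d+)')
CLOSE_RE = re.compile(r'\bclose\((\d+)\)\s+=\s+0')


def duplicated_dirents(lines):
    """Map each opened path to the d_name values listed more than once."""
    paths = {}   # fd -> path it was opened with
    seen = {}    # fd -> Counter of d_name values read since the open
    result = {}
    for line in lines:
        m = GETDENTS_RE.search(line)
        if m:
            fd = int(m.group(1))
            seen.setdefault(fd, Counter()).update(NAME_RE.findall(line))
            continue
        m = OPEN_RE.search(line)
        if m:
            # Only successful opens match, since the result must be a number.
            fd = int(m.group(2))
            paths[fd] = m.group(1)
            seen[fd] = Counter()
            continue
        m = CLOSE_RE.search(line)
        if m:
            fd = int(m.group(1))
            counts = seen.pop(fd, Counter())
            dups = sorted(name for name, n in counts.items() if n > 1)
            if dups:
                result[paths.get(fd, f"fd {fd}")] = dups
            paths.pop(fd, None)
    return result
```

With a trace written by something like "strace -v -o trace.log git fetch ...", passing open("trace.log") to duplicated_dirents() would return an empty dict if no directory listing repeated a name.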

Is it possible that multiple clients have the same NFS filesystem
mounted while Git is running? That would seem like an especially bad
idea and I could imagine it leading to problems like this.

It's surprising that you are seeing this problem in directory `refs`,
because (1) that directory is unlikely to have very many entries, and
(2) as far as I remember, Git will never delete the directories
`refs/heads` and `refs/tags`.

Michael



* Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
  2016-08-25  6:28   ` Michael Haggerty
@ 2016-08-25 16:01     ` Alex Nauda
  2016-08-26  0:06       ` Michael Haggerty
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Nauda @ 2016-08-25 16:01 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, git

On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 08/24/2016 11:39 PM, Jeff King wrote:
>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>
>>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>>> is exposed to the OS as an NFS mount. We're using EFS to host the
>>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>>> to git fetch, we get this error:
>>> $ git -c core.askpass=true fetch --tags --progress
>>> git@github.com:mediasilo/dodo.git
>>> +refs/pull/*:refs/remotes/origin/pr/*
>>> fatal: Reference directory conflict: refs/heads/
>>> $ echo $? 128
>>>
>>> Has anyone seen anything like this before? Any tips on how to troubleshoot it?
>>
>> No, I haven't seen it before. That's an internal assertion in the refs
>> code that shouldn't ever happen. It looks like it happens when the loose
>> refs end up with duplicate directory entries. While a bug in git is an
>> obvious culprit, I wonder if it's possible that your filesystem might
>> expose the same name twice in one set of readdir() results.
>>
>> +cc Michael, who added this assertion long ago (and since this is the
>> first report in all these years, it does make me suspect that the
>> filesystem is a critical part of reproducing).
>
> Thanks for the CC.
>
> I've never heard of this problem before.
>
> What Git version are you using?
Git client 2.7.4 against GitHub (Git 2.6.5)

>
> I tried to provoke the problem by hand-corrupting the packed-refs file,
> but wasn't successful.
>
> So Peff's suggestion that the problem originates in your filesystem
> seems to me to be the most likely cause. A quick Google search found,
> for example,
>
>     https://bugzilla.redhat.com/show_bug.cgi?id=739222
>
> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>
> though these reports seem connected with having lots of files in the
> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
> more careful search, and it is easily possible that there are other bugs
> in NFS (or EFS) that could be affecting you.
>
> If this were repeatable, you could run Git under strace to test Peff's
> hypothesis. But I suppose it only happens rarely, right?
Actually it seems to be reproducible. Here's the last portion of an strace:

[...]
stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
read(4, "", 215)                        = 0
close(4)                                = 0
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
open(".git/refs/remotes/origin/pr/16/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 3 entries */, 32768)     = 72
stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
read(4, "", 215)                        = 0
close(4)                                = 0
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 2 entries */, 32768)     = 48
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open(".git/packed-refs", O_RDONLY)      = -1 ENOENT (No such file or directory)
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
write(2, "fatal: Reference directory confl"..., 58fatal: Reference directory conflict: refs/remotes/origin/
) = 58
exit_group(128)                         = ?
+++ exited with 128 +++

>
> Is it possible that multiple clients have the same NFS filesystem
> mounted while Git is running? That would seem like an especially bad
> idea and I could imagine it leading to problems like this.
>
> It's surprising that you are seeing this problem in directory `refs`,
> because (1) that directory is unlikely to have very many entries, and
> (2) as far as I remember, Git will never delete the directories
> `refs/heads` and `refs/tags`.
Seems like sometimes it happens on other directories:
refs/remotes/origin/ or refs/remotes/origin/pr/1
Then as I was stracing it again, suddenly it succeeded. Some kind of
race condition?

>
> Michael
>


* Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
  2016-08-25 16:01     ` Alex Nauda
@ 2016-08-26  0:06       ` Michael Haggerty
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Haggerty @ 2016-08-26  0:06 UTC (permalink / raw)
  To: Alex Nauda; +Cc: Jeff King, git

On 08/25/2016 06:01 PM, Alex Nauda wrote:
> On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> On 08/24/2016 11:39 PM, Jeff King wrote:
>>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>>
>>>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>>>> is exposed to the OS as an NFS mount. We're using EFS to host the
>>>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>>>> to git fetch, we get this error:
>>>> $ git -c core.askpass=true fetch --tags --progress
>>>> git@github.com:mediasilo/dodo.git
>>>> +refs/pull/*:refs/remotes/origin/pr/*
>>>> fatal: Reference directory conflict: refs/heads/
>>>> $ echo $? 128
>>>>
>>>> Has anyone seen anything like this before? Any tips on how to troubleshoot it?
>>>
>>> No, I haven't seen it before. That's an internal assertion in the refs
>>> code that shouldn't ever happen. It looks like it happens when the loose
>>> refs end up with duplicate directory entries. While a bug in git is an
>>> obvious culprit, I wonder if it's possible that your filesystem might
>>> expose the same name twice in one set of readdir() results.
>>>
>>> +cc Michael, who added this assertion long ago (and since this is the
>>> first report in all these years, it does make me suspect that the
>>> filesystem is a critical part of reproducing).
>>
>> Thanks for the CC.
>>
>> I've never heard of this problem before.
>>
>> What Git version are you using?
> Git client 2.7.4 against GitHub (Git 2.6.5)
> 
>>
>> I tried to provoke the problem by hand-corrupting the packed-refs file,
>> but wasn't successful.
>>
>> So Peff's suggestion that the problem originates in your filesystem
>> seems to me to be the most likely cause. A quick Google search found,
>> for example,
>>
>>     https://bugzilla.redhat.com/show_bug.cgi?id=739222
>>
>> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>>
>> though these reports seem connected with having lots of files in the
>> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
>> more careful search, and it is easily possible that there are other bugs
>> in NFS (or EFS) that could be affecting you.
>>
>> If this were repeatable, you could run Git under strace to test Peff's
>> hypothesis. But I suppose it only happens rarely, right?
> Actually it seems to be reproducible. Here's the last portion of an strace:
> 
> [...]
> stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
> read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
> read(4, "", 215)                        = 0
> close(4)                                = 0
> getdents(3, /* 0 entries */, 32768)     = 0
> close(3)                                = 0
> open(".git/refs/remotes/origin/pr/16/",
> O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 3 entries */, 32768)     = 72
> stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
> read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
> read(4, "", 215)                        = 0
> close(4)                                = 0
> getdents(3, /* 0 entries */, 32768)     = 0
> close(3)                                = 0
> open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 2 entries */, 32768)     = 48
> getdents(3, /* 0 entries */, 32768)     = 0
> close(3)                                = 0
> open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) =
> -1 ENOENT (No such file or directory)
> open(".git/packed-refs", O_RDONLY)      = -1 ENOENT (No such file or directory)
> fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> write(2, "fatal: Reference directory confl"..., 58fatal: Reference
> directory conflict: refs/remotes/origin/
> ) = 58
> exit_group(128)                         = ?
> +++ exited with 128 +++

Thanks for the additional information.

From the strace output it is clear that there is no packed-refs file at
the time of the problem, so the problem must be among the loose refs.

The error is a "Reference directory conflict", which suggests that
"refs/remotes/origin/" appears in two entries; once as a reference
directory and once as a reference. But in fact it could also mean that
"refs/remotes/origin/" appears twice, both as directories. Neither one
should happen in normal operation.
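
To make the two cases concrete, here is a small illustrative model of the kind of check described above. It is a sketch only and not Git's actual code: the Entry type, the exception, and the function name are invented for the example, and it simply assumes that entries are sorted by name and that two entries sharing a name are rejected whenever at least one of them is a directory.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str     # e.g. "refs/remotes/origin/" or "refs/heads/master"
    is_dir: bool  # True for a reference directory, False for a reference


class RefDirConflict(Exception):
    """Raised when two same-named entries include a directory."""


def check_entries(entries):
    """Sort entries by name and reject same-name pairs involving a directory."""
    ordered = sorted(entries, key=lambda e: e.name)
    # After sorting, entries with equal names are adjacent.
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.name == cur.name and (prev.is_dir or cur.is_dir):
            raise RefDirConflict(f"Reference directory conflict: {cur.name}")
    return ordered
```

Under this model, a directory listing that yields "origin" twice would produce two directory entries named "refs/remotes/origin/" and trip the check, which matches the second case above.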

Unfortunately there is not enough strace output to see whether (in this
case) path `refs/remotes/origin` was reported twice by `getdents()`. I
think that is still the most likely hypothesis, especially since the
reports of this problem all seem to be on NFS + EFS.

One other long shot:

I see that you are fetching into refs/remotes/origin/pr/*. What is the
full refspec configuration for origin?

I know it used to be recommended to set up *two* refspecs when fetching
from GitHub:

    +refs/heads/*:refs/remotes/origin/*
    +refs/pull/*:refs/remotes/origin/pr/*

This is not such a good idea (and I think it is no longer recommended)
because, for example, a remote reference named `refs/heads/pr/5/head`
would want to end up at the same path as a remote PR branch
`refs/pull/5/head`, namely `refs/remotes/origin/pr/5/head`. That
shouldn't be allowed by Git, but since this sounds a little bit similar
to your problem I thought I'd ask anyway.
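
The overlap between those two refspecs can be shown with a short sketch. This is a simplified model of refspec mapping that handles only a single `*` on each side; it is not how Git implements it, and the function names are made up for the example.

```python
def map_ref(refspec, remote_ref):
    """Return the local ref that `refspec` assigns to `remote_ref`, or None."""
    src, dst = refspec.lstrip("+").split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The part of the remote ref matched by "*" is substituted into dst.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


def find_collisions(refspecs, remote_refs):
    """Return destinations that more than one remote ref maps to."""
    by_dest = {}
    for ref in remote_refs:
        for spec in refspecs:
            dest = map_ref(spec, ref)
            if dest is not None:
                by_dest.setdefault(dest, []).append(ref)
    return {dest: srcs for dest, srcs in by_dest.items() if len(srcs) > 1}
```

With the two refspecs above, both `refs/heads/pr/5/head` and `refs/pull/5/head` map to `refs/remotes/origin/pr/5/head`, so find_collisions() would report that destination with both sources.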

Are you using such a configuration? If so, can you reproduce the problem
with the `refs/pull/*` refspec disabled?

Michael

