git@vger.kernel.org mailing list mirror (one of many)
* git hangs on pthread_join
From: Ian Kumlien @ 2013-05-23 13:01 UTC
  To: git

Hi,

I'm running a rather special configuration: basically, I have a Gerrit
server pushing git data over OpenVPN connections (company regulations
n' stuff)...

git 1.8.2.1 is started by xinetd
...
        port            = 9418
        socket_type     = stream
        wait            = no
        user            = gerrit2
        server          = /usr/bin/git
        server_args     = daemon --inetd --syslog --export-all --enable=receive-pack --init-timeout=3 --timeout=180 --base-path=<path>
...
        nice            = 10
        per_source      = UNLIMITED
        instances       = UNLIMITED
        flags           = KEEPALIVE NODELAY
---

KEEPALIVE and NODELAY were added after the fact, as were the timeouts.

I have found "git receive-pack" processes that have been running for
days or weeks without terminating.

Attaching gdb and doing a trace results in:
#0  0x0000003261207b35 in pthread_join () from /lib64/libpthread.so.0
#1  0x00000000004ce58b in finish_async ()
#2  0x000000000045744b in cmd_receive_pack ()
#3  0x0000000000404851 in handle_internal_command ()
#4  0x0000000000404c9d in main ()
(Sorry, I apparently don't have any debug data for the binary packages;
the rpms were built from the official source.)

(RHEL 5 machine with glibc 2.5-65.el5_7.1)

Does anyone have any clue about what could be going wrong?


* Re: git hangs on pthread_join
From: Martin Fick @ 2013-05-23 19:45 UTC
  To: Ian Kumlien; +Cc: git

On Thursday, May 23, 2013 07:01:43 am you wrote:
> 
> I'm running a rather special configuration, basically i
> have a gerrit server pushing
... 
> I have found "git receive-pack"s that has been running
> for days/weeks without terminating....
> 
... 
> Anyone that has any clues about what could be going
> wrong? --


Have you narrowed down whether this is a git client problem
or a server problem (gerrit in your case)?  Is this a
repeatable issue?  Try the same operation against a clone of
the repo using just git.  Check on the server side for .noz
files in your repo (a jgit thing).

-Martin


* Re: git hangs on pthread_join
From: Ian Kumlien @ 2013-05-27 13:58 UTC
  To: git

I forgot to reply to the mailing list, and now something went wrong with
the messages in mutt =P

Recap:

On Thursday, May 23, 2013 07:01:43 am you wrote:
> I'm running a rather special configuration, basically i
> have a gerrit server pushing
...
> I have found "git receive-pack"s that has been running
> for days/weeks without terminating....
>
...
> Anyone that has any clues about what could be going
> wrong? --


Have you narrowed down whether this is a git client problem,
or a server problem (gerrit in your case).  Is this a
repeatable issue.  Try the same operation against a clone of
the repo using just git.  Check on the server side for .noz
files in you repo (a jgit thing),

---

This happens both using gerrit and using git directly...

My thought is more that git doesn't handle dodgy connections over
openvpn (udp) that go over dodgy international vpn links.

My conclusion has always been that it ends up in an unpredictable state,
like a blocking read that just doesn't time out. If it were a pipe and
not a socket, it would always return. Even though a socket should time
out too, I have seen processes left like this for weeks.

There were no .noz files on the master or the slave server.
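
To illustrate the blocking-read concern above: a read on a socket with no
timeout waits for as long as the peer stays silent, while a per-read
timeout bounds the wait. The following is a minimal Python sketch on a
local socket pair, not a real git connection. The helper name
read_with_timeout is made up for this example, and this is not a claim
about how git daemon implements --timeout.

```python
import socket


def read_with_timeout(sock, seconds):
    """Return received bytes, or None if nothing arrives within `seconds`.

    Hypothetical helper for illustration only.
    """
    sock.settimeout(seconds)  # without this, recv() can block indefinitely
    try:
        return sock.recv(4096)
    except socket.timeout:
        # The peer stayed silent for the whole interval.
        return None


if __name__ == "__main__":
    ours, peer = socket.socketpair()  # stands in for a stalled remote client
    print(read_with_timeout(ours, 0.2))  # peer is silent, so the read gives up
    peer.sendall(b"hello")
    print(read_with_timeout(ours, 0.2))  # data is available, so it is returned
    ours.close()
    peer.close()
```

A timeout like this only bounds a read on the socket itself. It does not
help if the process is waiting on something else, such as a thread join.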


* Re: git hangs on pthread_join
From: Jeff King @ 2013-05-28 17:51 UTC
  To: Ian Kumlien; +Cc: Martin Fick, git

On Thu, May 23, 2013 at 03:01:43PM +0200, Ian Kumlien wrote:

> git 1.8.2.1 is started by xinetd
> [...]
> I have found "git receive-pack"s that has been running for days/weeks
> without terminating....
> 
> Attaching gdb and doing a trace results in:
> #0  0x0000003261207b35 in pthread_join () from /lib64/libpthread.so.0
> #1  0x00000000004ce58b in finish_async ()
> #2  0x000000000045744b in cmd_receive_pack ()
> #3  0x0000000000404851 in handle_internal_command ()
> #4  0x0000000000404c9d in main ()

I recently fixed a deadlock that could happen in receive-pack when
clients hung up before sending a valid pack header. The fix is commit
49ecfa1, and it's in git v1.8.2.2.

The stack trace for the deadlock fixed by 49ecfa1 would have
unpack_with_sideband between #1 and #2 above, but it is entirely
possible that it is simply inlined in your build of git, depending on
the -O level of your build (it is a static function that is only called
from one place). So it seems likely that it is the culprit.

-Peff
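
The general shape of a hang in a thread join can be sketched as follows:
a worker thread reads from a pipe until EOF, and the caller joins it
while the write end of the pipe is still open, so EOF never arrives. This
is a minimal Python illustration of that class of deadlock, not git's
actual code. The names reader_finishes and drain are hypothetical, and
whether this matches the details of 49ecfa1 is an assumption.

```python
import os
import threading


def reader_finishes(close_write_end, wait_seconds=0.5):
    """Return True if a pipe-draining thread exits within wait_seconds."""
    read_fd, write_fd = os.pipe()

    def drain():
        # Stand-in for an async helper thread: read until EOF on the pipe.
        while os.read(read_fd, 4096):
            pass

    worker = threading.Thread(target=drain, daemon=True)
    worker.start()

    if close_write_end:
        os.close(write_fd)  # the reader now sees EOF and returns

    # pthread_join() has no timeout; one is used here only so that the
    # demonstration itself cannot hang.
    worker.join(wait_seconds)
    finished = not worker.is_alive()

    if not close_write_end:
        os.close(write_fd)  # release the stuck reader before cleaning up
        worker.join()
    os.close(read_fd)
    return finished


if __name__ == "__main__":
    print("write end closed:   ", reader_finishes(True))
    print("write end left open:", reader_finishes(False))
```

With the write end left open, the join only returns here because of the
timeout. A plain join would wait forever, which is consistent with a
backtrace that sits in pthread_join().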


* Re: git hangs on pthread_join
From: Ian Kumlien @ 2013-05-29  8:29 UTC
  To: Jeff King; +Cc: Martin Fick, git

On Tue, May 28, 2013 at 01:51:09PM -0400, Jeff King wrote:
> On Thu, May 23, 2013 at 03:01:43PM +0200, Ian Kumlien wrote:
> 
> > git 1.8.2.1 is started by xinetd
> > [...]
> > I have found "git receive-pack"s that has been running for days/weeks
> > without terminating....
> > 
> > Attaching gdb and doing a trace results in:
> > #0  0x0000003261207b35 in pthread_join () from /lib64/libpthread.so.0
> > #1  0x00000000004ce58b in finish_async ()
> > #2  0x000000000045744b in cmd_receive_pack ()
> > #3  0x0000000000404851 in handle_internal_command ()
> > #4  0x0000000000404c9d in main ()
> 
> I recently fixed a deadlock that could happen in receive-pack when
> clients hung up before sending a valid pack header. The fix is commit
> 49ecfa1, and it's in git v1.8.2.2.

With dodgy connections this could easily happen =)

Really nice catch!

> The stack trace for the deadlock fixed by 49ecfa1 would have
> unpack_with_sideband between #1 and #2 above, but it is entirely
> possible that it is simply inlined in your build of git, depending on
> the -O level of your build (it is a static function that is only called
> from one place). So it seems likely that it is the culprit.

Yeah, since it's a RHEL 5 machine I don't even get a debug rpm package
=P

I will upgrade all machines and keep monitoring, thanks!

> -Peff
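
When upgrading several machines, a small check can confirm that each one
reports at least the release named above as containing the fix. This is
a rough Python sketch that parses the text printed by "git --version".
The function names are made up for this example, and it assumes that
every release numbered 1.8.2.2 or higher includes 49ecfa1.

```python
import re

# Release named in this thread as containing commit 49ecfa1.
FIXED_IN = (1, 8, 2, 2)


def parse_git_version(version_output):
    """Turn text such as 'git version 1.8.2.1' into a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", version_output)
    if match is None:
        raise ValueError("no version number in: %r" % version_output)
    return tuple(int(part) for part in match.group(0).split("."))


def has_receive_pack_fix(version_output):
    """True if the reported version is at least FIXED_IN (an assumption)."""
    return parse_git_version(version_output) >= FIXED_IN


if __name__ == "__main__":
    for text in ("git version 1.8.2.1", "git version 1.8.2.2"):
        print(text, "->", has_receive_pack_fix(text))
```

The input string would come from running "git --version" on each
machine, for example through subprocess.check_output.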
