* Periodic hang during git index-pack @ 2018-12-19 22:59 Sitsofe Wheeler 2018-12-19 23:22 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Sitsofe Wheeler @ 2018-12-19 22:59 UTC (permalink / raw) To: git Hi, While using trying to use git to clone a remote repository git index-pack occasionally goes on to hang: export GIT_TRACE=1 git clone -n remotehost:diffusion/LIBEDIT/libedit 22:43:48.948378 git.c:418 trace: built-in: git clone -n remotehost:diffusion/LIBEDIT/libedit Cloning into 'libedit'... 22:43:48.951196 run-command.c:643 trace: run_command: unset GIT_DIR; ssh remotehost 'git-upload-pack '\''diffusion/LIBEDIT/libedit'\''' 22:43:49.130517 run-command.c:643 trace: run_command: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13914 on machine' --check-self-contained-and-connected remote: Enumerating objects: 178, done. 22:43:49.132535 git.c:418 trace: built-in: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13914 on machine' --check-self-contained-and-connected remote: Counting objects: 100% (178/178), done. remote: Compressing objects: 100% (81/81), done. remote: Total 178 (delta 95), reused 178 (delta 95) Receiving objects: 100% (178/178), 539.36 KiB | 16.85 MiB/s, done. Resolving deltas: 100% (95/95), done. 22:43:49.209193 run-command.c:643 trace: run_command: git rev-list --objects --stdin --not --all --quiet '--progress=Checking connectivity' 22:43:49.211276 git.c:418 trace: built-in: git rev-list --objects --stdin --not --all --quiet '--progress=Checking connectivity' Wed 19 Dec 22:43:49 GMT 2018 Wed 19 Dec 22:43:49 GMT 2018 22:43:49.220996 git.c:418 trace: built-in: git clone -n remotehost:diffusion/LIBEDIT/libedit Cloning into 'libedit'... 22:43:49.223462 run-command.c:643 trace: run_command: unset GIT_DIR; ssh remotehost 'git-upload-pack '\''diffusion/LIBEDIT/libedit'\''' 22:43:49.491004 run-command.c:643 trace: run_command: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13925 on machine' --check-self-contained-and-connected remote: Enumerating objects: 178, done. 22:43:49.492988 git.c:418 trace: built-in: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13925 on machine' --check-self-contained-and-connected remote: Counting objects: 100% (178/178), done. remote: Compressing objects: 100% (81/81), done. remote: Total 178 (delta 95), reused 178 (delta 95) Receiving objects: 100% (178/178), 539.36 KiB | 17.40 MiB/s, done. Resolving deltas: 100% (95/95), done. 22:43:49.569331 run-command.c:643 trace: run_command: git rev-list --objects --stdin --not --all --quiet '--progress=Checking connectivity' 22:43:49.571431 git.c:418 trace: built-in: git rev-list --objects --stdin --not --all --quiet '--progress=Checking connectivity' Wed 19 Dec 22:43:49 GMT 2018 Wed 19 Dec 22:43:49 GMT 2018 22:43:49.581282 git.c:418 trace: built-in: git clone -n remotehost:diffusion/LIBEDIT/libedit Cloning into 'libedit'... 22:43:49.584019 run-command.c:643 trace: run_command: unset GIT_DIR; ssh remotehost 'git-upload-pack '\''diffusion/LIBEDIT/libedit'\''' 22:43:49.818490 run-command.c:643 trace: run_command: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13936 on machine' --check-self-contained-and-connected remote: Enumerating objects: 178, done. 22:43:49.820477 git.c:418 trace: built-in: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 13936 on machine' --check-self-contained-and-connected remote: Counting objects: 100% (178/178), done. remote: Compressing objects: 100% (81/81), done. Receiving objects: 64% (114/178) Most of the time cloning complets fine but every now and then it will hang indefinitely in the receiving objects stage. This happens with git version 2.17.1 from Ubuntu 18.04 and a hand compiled git version 2.20.1. The remote end is running through Phabricator sshd scripts (both 2.17.1 and 2.20.1 have been tried on the remote end). ssh multiplexing is on: ControlMaster auto ControlPersist 120 ControlPath /tmp/ssh-%r@%h:%p ServerAliveInterval 60 Looking at where it is stuck, git is doing read of a pipe: #0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0, buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096) at ../sysdeps/unix/sysv/linux/read.c:27 #1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40 <input_buffer>, __fd=0) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 #2 xread (fd=0, buf=0x55ab81e19d40 <input_buffer>, len=4096) at wrapper.c:260 #3 0x000055ab819d1069 in fill (min=min@entry=1) at builtin/index-pack.c:255 #4 0x000055ab819d123a in unpack_entry_data (offset=268408, size=size@entry=14675, type=<optimised out>, oid=oid@entry=0x55ab837cff40) at builtin/index-pack.c:445 #5 0x000055ab819d1bc7 in unpack_raw_entry (oid=0x55ab837cff40, ref_oid=0x7fffee137440, ofs_offset=0x55ab837ce350, obj=0x55ab837cff40) at builtin/index-pack.c:526 #6 parse_pack_objects (hash=0x7fffee1374f0 "\001") at builtin/index-pack.c:1113 #7 cmd_index_pack (argc=<optimised out>, argv=<optimised out>, prefix=<optimised out>) at builtin/index-pack.c:1775 #8 0x000055ab819934e1 in run_builtin (argv=<optimised out>, argc=<optimised out>, p=<optimised out>) at git.c:421 #9 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647 #10 0x000055ab81994515 in run_argv (argv=0x7fffee137690, argcp=0x7fffee13769c) at git.c:701 #11 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798 #12 0x000055ab8199318f in main (argc=7, argv=0x7fffee137908) at common-main.c:45 Does anyone know what might be wrong and/or how to debug this? -- Sitsofe | http://sucs.org/~sits/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Periodic hang during git index-pack 2018-12-19 22:59 Periodic hang during git index-pack Sitsofe Wheeler @ 2018-12-19 23:22 ` Jeff King 2018-12-20 10:03 ` Sitsofe Wheeler 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2018-12-19 23:22 UTC (permalink / raw) To: Sitsofe Wheeler; +Cc: git On Wed, Dec 19, 2018 at 10:59:30PM +0000, Sitsofe Wheeler wrote: > While using trying to use git to clone a remote repository git > index-pack occasionally goes on to hang: > [...] > Looking at where it is stuck, git is doing read of a pipe: > > #0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0, > buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40 > <input_buffer>, __fd=0) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 Index-pack is reading the pack on stdin, so it's expecting more bytes. Those bytes should be coming from the git-clone process, which is relaying the bytes from the other side. Check the backtrace of git-clone to see why it isn't feeding more data (but note that it will generally have two threads -- one processing the data from the remote, and one wait()ing for index-pack to finish). My guess, though, is that you'll find that git-clone is simply waiting on another pipe: the one from ssh. -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Periodic hang during git index-pack 2018-12-19 23:22 ` Jeff King @ 2018-12-20 10:03 ` Sitsofe Wheeler 2018-12-20 15:10 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Sitsofe Wheeler @ 2018-12-20 10:03 UTC (permalink / raw) To: Jeff King; +Cc: git On Wed, 19 Dec 2018 at 23:22, Jeff King <peff@peff.net> wrote: > > On Wed, Dec 19, 2018 at 10:59:30PM +0000, Sitsofe Wheeler wrote: > > > While using trying to use git to clone a remote repository git > > index-pack occasionally goes on to hang: > > [...] > > Looking at where it is stuck, git is doing read of a pipe: > > > > #0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0, > > buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096) > > at ../sysdeps/unix/sysv/linux/read.c:27 > > #1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40 > > <input_buffer>, __fd=0) > > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > > Index-pack is reading the pack on stdin, so it's expecting more bytes. > Those bytes should be coming from the git-clone process, which is > relaying the bytes from the other side. > > Check the backtrace of git-clone to see why it isn't feeding more data > (but note that it will generally have two threads -- one processing the > data from the remote, and one wait()ing for index-pack to finish). > > My guess, though, is that you'll find that git-clone is simply waiting > on another pipe: the one from ssh. Ok here are backtraces from another run (with SSH multiplexing disabled): (gdb) thread apply all bt Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)): #0 0x00007faafc805384 in __libc_read (fd=fd@entry=5, buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5) at ../sysdeps/unix/sysv/linux/read.c:27 #1 0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 #2 xread (fd=fd@entry=5, buf=buf@entry=0x7faafbf0ddec, len=len@entry=5) at wrapper.c:260 #3 0x000055c8ca2f5cdb in read_in_full (fd=5, buf=buf@entry=0x7faafbf0bdf0, count=5, count@entry=8193) at wrapper.c:318 #4 0x000055c8ca27222b in get_packet_data (fd=fd@entry=5, src_buf=src_buf@entry=0x0, src_size=src_size@entry=0x0, dst=dst@entry=0x7faafbf0bdf0, size=size@entry=8193, options=options@entry=0) at pkt-line.c:289 #5 0x000055c8ca272ed8 in packet_read_with_status (fd=fd@entry=5, src_buffer=src_buffer@entry=0x0, src_len=src_len@entry=0x0, buffer=buffer@entry=0x7faafbf0bdf0 "\001\344\305\066JJj\341q@\243\225ժ\350\026M\fkM9:-ƀ\253\206\336\001\275\070\325\372\250\204\232aM\221\213(\320B%\a\275\233\261g\321A\245\n \247\374\326\b'\v\252\277rA\211\312l\212j\352\177\260\317j\aT\252&t2\256\254\360\002\217V\024\061k\201ڲ;;\017`\361\020:*b\n5\222\036i\272\067}\360,\323\345Y\314ir\311\034b\232F\267\364\016]", size=size@entry=65520, pktlen=pktlen@entry=0x7faafbf0bd94, options=0) at pkt-line.c:344 #6 0x000055c8ca273078 in packet_read (fd=fd@entry=5, src_buffer=src_buffer@entry=0x0, src_len=src_len@entry=0x0, buffer=buffer@entry=0x7faafbf0bdf0 "\001\344\305\066JJj\341q@\243\225ժ\350\026M\fkM9:-ƀ\253\206\336\001\275\070\325\372\250\204\232aM\221\213(\320B%\a\275\233\261g\321A\245\n \247\374\326\b'\v\252\277rA\211\312l\212j\352\177\260\317j\aT\252&t2\256\254\360\002\217V\024\061k\201ڲ;;\017`\361\020:*b\n5\222\036i\272\067}\360,\323\345Y\314ir\311\034b\232F\267\364\016]", size=size@entry=65520, options=options@entry=0) at pkt-line.c:364 #7 0x000055c8ca2cbf73 in recv_sideband ( me=me@entry=0x55c8ca3466ed "fetch-pack", in_stream=5, out=out@entry=6) at sideband.c:143 #8 0x000055c8ca22d6cb in sideband_demux (in=<optimised out>, out=6, data=<optimised out>) at fetch-pack.c:776 #9 0x000055c8ca2a8458 in run_thread (data=0x7ffda489ef90) at run-command.c:1032 #10 0x00007faafc7fb6db in start_thread (arg=0x7faafbf1c700) at pthread_create.c:463 #11 0x00007faafc31c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 Thread 1 (Thread 0x7faafce3bb80 (LWP 36584)): #0 0x00007faafc805384 in __libc_read (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, nbytes=nbytes@entry=46) at ../sysdeps/unix/sysv/linux/read.c:27 #1 0x000055c8ca2f5b23 in read (__nbytes=46, __buf=0x7ffda489ef10, __fd=7) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 #2 xread (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, len=len@entry=46) at wrapper.c:260 #3 0x000055c8ca2f5cdb in read_in_full (fd=7, buf=buf@entry=0x7ffda489ef10, count=count@entry=46) at wrapper.c:318 #4 0x000055c8ca26a54f in index_pack_lockfile (ip_out=<optimised out>) at pack-write.c:297 #5 0x000055c8ca22e18c in get_pack (args=args@entry=0x7ffda489f2e0, xd=xd@entry=0x55c8cb4966f8, pack_lockfile=pack_lockfile@entry=0x55c8cb496660) at fetch-pack.c:881 #6 0x000055c8ca22fbfb in do_fetch_pack (pack_lockfile=<optimised out>, si=0x7ffda489f1f0, nr_sought=<optimised out>, sought=<optimised out>, orig_ref=<optimised out>, fd=<optimised out>, args=0x7ffda489f2e0) at fetch-pack.c:1019 #7 fetch_pack (args=args@entry=0x7ffda489f2e0, fd=<optimised out>, conn=<optimised out>, ref=<optimised out>, dest=dest@entry=0x55c8cb498820 "remotehost:diffusion/LIBEDIT/libedit", sought=sought@entry=0x55c8cb498580, nr_sought=<optimised out>, shallow=<optimised out>, pack_lockfile=<optimised out>, version=<optimised out>) at fetch-pack.c:1649 #8 0x000055c8ca2dddf8 in fetch_refs_via_pack (transport=0x55c8cb496620, nr_heads=2, to_fetch=0x55c8cb498580) at transport.c:365 #9 0x000055c8ca2df246 in transport_fetch_refs ( transport=transport@entry=0x55c8cb496620, refs=refs@entry=0x55c8cb4959e0) at transport.c:1295 #10 0x000055c8ca1529ab in cmd_clone (argc=<optimised out>, argv=<optimised out>, prefix=<optimised out>) at builtin/clone.c:1212 #11 0x000055c8ca1374e1 in run_builtin (argv=<optimised out>, argc=<optimised out>, p=<optimised out>) at git.c:421 #12 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647 #13 0x000055c8ca138515 in run_argv (argv=0x7ffda489fa00, argcp=0x7ffda489fa0c) at git.c:701 #14 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798 #15 0x000055c8ca13718f in main (argc=4, argv=0x7ffda489fc78) at common-main.c:45 (gdb) thread apply all bt Thread 1 (Thread 0x7ff8a03dab80 (LWP 36587)): #0 0x00007ff89fda434e in __libc_read (fd=fd@entry=0, buf=buf@entry=0x5604bea43d40 <input_buffer>, nbytes=nbytes@entry=4096) at ../sysdeps/unix/sysv/linux/read.c:27 #1 0x00005604be77bb23 in read (__nbytes=4096, __buf=0x5604bea43d40 <input_buffer>, __fd=0) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 #2 xread (fd=0, buf=0x5604bea43d40 <input_buffer>, len=4096) at wrapper.c:260 #3 0x00005604be5fb069 in fill (min=min@entry=1) at builtin/index-pack.c:255 #4 0x00005604be5fb23a in unpack_entry_data (offset=528312, size=size@entry=19448, type=<optimised out>, oid=oid@entry=0x5604c0605ea0) at builtin/index-pack.c:445 #5 0x00005604be5fbbc7 in unpack_raw_entry (oid=0x5604c0605ea0, ref_oid=0x7ffc74cd73f0, ofs_offset=0x5604c0603580, obj=0x5604c0605ea0) at builtin/index-pack.c:526 #6 parse_pack_objects (hash=0x7ffc74cd74a0 "\001") at builtin/index-pack.c:1113 #7 cmd_index_pack (argc=<optimised out>, argv=<optimised out>, prefix=<optimised out>) at builtin/index-pack.c:1775 #8 0x00005604be5bd4e1 in run_builtin (argv=<optimised out>, argc=<optimised out>, p=<optimised out>) at git.c:421 #9 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647 #10 0x00005604be5be515 in run_argv (argv=0x7ffc74cd7640, argcp=0x7ffc74cd764c) at git.c:701 #11 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798 #12 0x00005604be5bd18f in main (argc=7, argv=0x7ffc74cd78b8) at common-main.c:45 As before, any hints on how to debug this gratefully accepted! -- Sitsofe | http://sucs.org/~sits/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Periodic hang during git index-pack 2018-12-20 10:03 ` Sitsofe Wheeler @ 2018-12-20 15:10 ` Jeff King 2018-12-20 16:48 ` Sitsofe Wheeler 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2018-12-20 15:10 UTC (permalink / raw) To: Sitsofe Wheeler; +Cc: git On Thu, Dec 20, 2018 at 10:03:45AM +0000, Sitsofe Wheeler wrote: > > Check the backtrace of git-clone to see why it isn't feeding more data > > (but note that it will generally have two threads -- one processing the > > data from the remote, and one wait()ing for index-pack to finish). > > > > My guess, though, is that you'll find that git-clone is simply waiting > > on another pipe: the one from ssh. > > Ok here are backtraces from another run (with SSH multiplexing disabled): OK, that's about what I expected. Here we have clone's sideband-demux thread waiting to pull more packfile data from the remote: > (gdb) thread apply all bt > > Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)): > #0 0x00007faafc805384 in __libc_read (fd=fd@entry=5, > buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > [...coming from packet_read / recv_sideband / sideband_demux...] I assume fd=5 there is a pipe connected to ssh. You could double check with "lsof" or similar, but I don't think it would ever be reading from anywhere else. And then this thread: > Thread 1 (Thread 0x7faafce3bb80 (LWP 36584)): > #0 0x00007faafc805384 in __libc_read (fd=fd@entry=7, > buf=buf@entry=0x7ffda489ef10, nbytes=nbytes@entry=46) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x000055c8ca2f5b23 in read (__nbytes=46, __buf=0x7ffda489ef10, __fd=7) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > #2 xread (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, len=len@entry=46) > at wrapper.c:260 > #3 0x000055c8ca2f5cdb in read_in_full (fd=7, buf=buf@entry=0x7ffda489ef10, > count=count@entry=46) at wrapper.c:318 > #4 0x000055c8ca26a54f in index_pack_lockfile (ip_out=<optimised out>) > at pack-write.c:297 ...is us waiting on index-pack to single completion by printing out the name of the pack's .keep file. And then this is just index-pack: > (gdb) thread apply all bt > > Thread 1 (Thread 0x7ff8a03dab80 (LWP 36587)): > #0 0x00007ff89fda434e in __libc_read (fd=fd@entry=0, > buf=buf@entry=0x5604bea43d40 <input_buffer>, nbytes=nbytes@entry=4096) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x00005604be77bb23 in read (__nbytes=4096, > __buf=0x5604bea43d40 <input_buffer>, __fd=0) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > #2 xread (fd=0, buf=0x5604bea43d40 <input_buffer>, len=4096) at wrapper.c:260 > #3 0x00005604be5fb069 in fill (min=min@entry=1) at builtin/index-pack.c:255 ...waiting for more data. So though it goes back and forth between two processes, the sequence of blocking data is linear due to the threads: ssh -> git-clone -> index-pack -> git-clone (sideband_demux) (index_pack_lockfile) with each blocking on read() from its predecessor. So you need to find out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the likely cause is either: 1. The git-upload-pack on the remote side stopped generating data for some reason. You may or may not have access on the remotehost to dig into that. It's certainly possible there's a deadlock bug between the server and client side of a Git conversation. But I'd find it extremely unlikely to find such a deadlock bug at this point in the conversation, because at this point the client side has nothing left to say to the server. The server should just be streaming out the packfile bytes and then closing the descriptor. You mentioned "Phabricator sshd scripts" running on the server. I don't know what Phabricator might be sticking in the middle of the connection, but that could be the source of the stall. 2. It's possible the network connection dropped but ssh did not notice. Maybe try turning on ssh keepalives and seeing if it eventually times out? -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Periodic hang during git index-pack 2018-12-20 15:10 ` Jeff King @ 2018-12-20 16:48 ` Sitsofe Wheeler 2019-02-03 8:16 ` Sitsofe Wheeler 0 siblings, 1 reply; 6+ messages in thread From: Sitsofe Wheeler @ 2018-12-20 16:48 UTC (permalink / raw) To: Jeff King; +Cc: git On Thu, 20 Dec 2018 at 15:11, Jeff King <peff@peff.net> wrote: > > OK, that's about what I expected. Here we have clone's sideband-demux > thread waiting to pull more packfile data from the remote: > > > (gdb) thread apply all bt > > > > Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)): > > #0 0x00007faafc805384 in __libc_read (fd=fd@entry=5, > > buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5) > > at ../sysdeps/unix/sysv/linux/read.c:27 > > #1 0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5) > > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > > [...coming from packet_read / recv_sideband / sideband_demux...] > > I assume fd=5 there is a pipe connected to ssh. You could double check > with "lsof" or similar, but I don't think it would ever be reading from > anywhere else. I checked and in all cases git was reading from a pipe. [snip] > with each blocking on read() from its predecessor. So you need to find > out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the > likely cause is either: > > 1. The git-upload-pack on the remote side stopped generating data for > some reason. You may or may not have access on the remotehost to > dig into that. > > It's certainly possible there's a deadlock bug between the server > and client side of a Git conversation. But I'd find it extremely > unlikely to find such a deadlock bug at this point in the > conversation, because at this point the client side has nothing > left to say to the server. The server should just be streaming out > the packfile bytes and then closing the descriptor. I think it's highly unlikely too given how many good runs we generally have. > You mentioned "Phabricator sshd scripts" running on the server. > I don't know what Phabricator might be sticking in the middle of > the connection, but that could be the source of the stall. I think you're right. I set up a seperate sshd on a different port on the same machine where there were no Phabricator callouts and the problem never manifested... > 2. It's possible the network connection dropped but ssh did not > notice. Maybe try turning on ssh keepalives and seeing if it > eventually times out? I had already done this but the problem still manifested. I went so far as checking if the problem would happen over the loopback device on the same machine (via localhost) and the problem continued happening so I'm fairly sure that rules out networking issues. -- Sitsofe | http://sucs.org/~sits/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Periodic hang during git index-pack 2018-12-20 16:48 ` Sitsofe Wheeler @ 2019-02-03 8:16 ` Sitsofe Wheeler 0 siblings, 0 replies; 6+ messages in thread From: Sitsofe Wheeler @ 2019-02-03 8:16 UTC (permalink / raw) To: Jeff King; +Cc: git On Wed, 19 Dec 2018 at 22:59, Sitsofe Wheeler <sitsofe@gmail.com> wrote: > > While using trying to use git to clone a remote repository git > index-pack occasionally goes on to hang: On Thu, 20 Dec 2018 at 16:48, Sitsofe Wheeler <sitsofe@gmail.com> wrote: > > On Thu, 20 Dec 2018 at 15:11, Jeff King <peff@peff.net> wrote: > > [snip] > > > > with each blocking on read() from its predecessor. So you need to find > > out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the > > likely cause is either: > > > > 1. The git-upload-pack on the remote side stopped generating data for > > some reason. You may or may not have access on the remotehost to > > dig into that. > > > > It's certainly possible there's a deadlock bug between the server > > and client side of a Git conversation. But I'd find it extremely > > unlikely to find such a deadlock bug at this point in the > > conversation, because at this point the client side has nothing > > left to say to the server. The server should just be streaming out > > the packfile bytes and then closing the descriptor. > > I think it's highly unlikely too given how many good runs we generally have. > > > You mentioned "Phabricator sshd scripts" running on the server. > > I don't know what Phabricator might be sticking in the middle of > > the connection, but that could be the source of the stall. > > I think you're right. I set up a seperate sshd on a different port on > the same machine where there were no Phabricator callouts and the > problem never manifested... Just to finally follow up - this was confirmed to be a bug in the Phabricator server side git+ssh wrapper and the issue exists in at least Phabricator versions between 2016 Week 28 - 2019 Week 3. The Phabricator ssh-exec PHP script was doing a select() that would very occasionally fail with errno set to EINTR and then going on to cause subsequent git data to not be sent to the git client. It's unclear why this started happening more frequently for us but we had recently changed to a faster/more cores/larger AWS instance type and we also switched to a later Ubuntu 14.04 kernel... See https://discourse.phabricator-community.org/t/sporadic-git-cloning-hang-over-ssh/2233 for a longer description of the issue that was reported to the Phabricator developers. That link contains a patch by myself that resolves the issue, a link to an upstream Phabricator task made by the Phabricator devs and a link to an upstream Phabricator diff that works around the issue in a looser manner than my final patch. Thanks for your rapid reply and guidance Jeff! -- Sitsofe | http://sucs.org/~sits/ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-02-03 8:17 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-19 22:59 Periodic hang during git index-pack Sitsofe Wheeler 2018-12-19 23:22 ` Jeff King 2018-12-20 10:03 ` Sitsofe Wheeler 2018-12-20 15:10 ` Jeff King 2018-12-20 16:48 ` Sitsofe Wheeler 2019-02-03 8:16 ` Sitsofe Wheeler
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).