* t0008 hang on streaming test (OS X) @ 2013-07-10 16:36 Brian Gernhardt 2013-07-10 20:35 ` Antoine Pelisse 2013-07-11 13:34 ` Jeff King 0 siblings, 2 replies; 10+ messages in thread From: Brian Gernhardt @ 2013-07-10 16:36 UTC (permalink / raw) To: git@vger.kernel.org List The newest test in t0008 "streaming support for --stdin", seems to hang sporadically on my MacBook Pro (running 10.8.4). The hang seems to be related to running it in parallel with other tests, as I can only reliably cause it by running with prove and -j 3. However, once that has hung I am able to semi-reliably have it occur by running the test separately (with the test hung in the background, using separate trash directories via the --root option). Experimentation has led me to find that it is hanging when trying to read the 2nd response from check-ignore. I am somewhat stuck on how to fix it. Any ideas? ~~ Brian Gernhardt ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: t0008 hang on streaming test (OS X) 2013-07-10 16:36 t0008 hang on streaming test (OS X) Brian Gernhardt @ 2013-07-10 20:35 ` Antoine Pelisse 2013-07-11 16:14 ` Brian Gernhardt 2013-07-11 13:34 ` Jeff King 1 sibling, 1 reply; 10+ messages in thread From: Antoine Pelisse @ 2013-07-10 20:35 UTC (permalink / raw) To: Brian Gernhardt; +Cc: git@vger.kernel.org List On Wed, Jul 10, 2013 at 6:36 PM, Brian Gernhardt <brian@gernhardtsoftware.com> wrote: > I am somewhat stuck on how to fix it. Any ideas? I don't have anything to reproduce here, but usually I start investigating this kind of problems by attaching the hung process with gdb to see the current state (if it's stuck in a specific state), or to investigate the end-less loop. That usually help finding a good starting point. Cheers, Antoine ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: t0008 hang on streaming test (OS X) 2013-07-10 20:35 ` Antoine Pelisse @ 2013-07-11 16:14 ` Brian Gernhardt 0 siblings, 0 replies; 10+ messages in thread From: Brian Gernhardt @ 2013-07-11 16:14 UTC (permalink / raw) To: Antoine Pelisse; +Cc: git@vger.kernel.org List On Jul 10, 2013, at 4:35 PM, Antoine Pelisse <apelisse@gmail.com> wrote: > On Wed, Jul 10, 2013 at 6:36 PM, Brian Gernhardt > <brian@gernhardtsoftware.com> wrote: >> I am somewhat stuck on how to fix it. Any ideas? > > I don't have anything to reproduce here, but usually I start > investigating this kind of problems by attaching the hung process with > gdb to see the current state (if it's stuck in a specific state), or > to investigate the end-less loop. > That usually help finding a good starting point. Unfortunately, the hung process is /bin/sh (aka bash). Using the "Sample" function of Activity Monitor gave me that 100% of the time was spent in libc's _open... Which enlightens me not at all. ~~ Brian Gernhardt ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: t0008 hang on streaming test (OS X) 2013-07-10 16:36 t0008 hang on streaming test (OS X) Brian Gernhardt 2013-07-10 20:35 ` Antoine Pelisse @ 2013-07-11 13:34 ` Jeff King 2013-07-11 16:09 ` Junio C Hamano 2013-07-11 16:13 ` t0008 hang on streaming test (OS X) Brian Gernhardt 1 sibling, 2 replies; 10+ messages in thread From: Jeff King @ 2013-07-11 13:34 UTC (permalink / raw) To: Brian Gernhardt; +Cc: git@vger.kernel.org List On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote: > The newest test in t0008 "streaming support for --stdin", seems to > hang sporadically on my MacBook Pro (running 10.8.4). The hang seems > to be related to running it in parallel with other tests, as I can > only reliably cause it by running with prove and -j 3. However, once > that has hung I am able to semi-reliably have it occur by running the > test separately (with the test hung in the background, using separate > trash directories via the --root option). I can't replicate the hang here (on Linux) doing: for i in `seq 1 30`; do ./t0008-ignores.sh --root=/tmp/foo/$i & done Do you know which test it is hanging on? You mentioned that you can replicate it outside of "prove"; what does running with "-v" say? The last test in t0008, with the fifos, would make me the most suspicious. The way we do it _should_ be fine, but I'm wondering if the shell is blocking in exec here: mkfifo in out && (git check-ignore -n -v --stdin <in >out &) && exec 9>in && That is, if the fifo is not opened for some reason by the backgrounded process (there's a race, of course, but the outer shell should just block until the sub-shell actually opens it). I wonder if the descriptor-opening behavior of: cmd <in >out & is different between shells (that is, if it backgrounds the opening of in and out on some shells, but not on others). But then I would expect it to fail consistently. Just for fun, does switching the middle line there to: (sh -c "git check-ignore -n -v --stdin <in >out" &) && have any effect? -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: t0008 hang on streaming test (OS X) 2013-07-11 13:34 ` Jeff King @ 2013-07-11 16:09 ` Junio C Hamano 2013-07-12 10:35 ` [PATCH] t0008: avoid SIGPIPE race condition on fifo Jeff King 2013-07-11 16:13 ` t0008 hang on streaming test (OS X) Brian Gernhardt 1 sibling, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2013-07-11 16:09 UTC (permalink / raw) To: Jeff King; +Cc: Brian Gernhardt, git@vger.kernel.org List Jeff King <peff@peff.net> writes: > On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote: > >> The newest test in t0008 "streaming support for --stdin", seems to >> hang sporadically on my MacBook Pro (running 10.8.4). The hang seems >> to be related to running it in parallel with other tests, as I can >> only reliably cause it by running with prove and -j 3. However, once >> that has hung I am able to semi-reliably have it occur by running the >> test separately (with the test hung in the background, using separate >> trash directories via the --root option). > > I can't replicate the hang here (on Linux) doing: > > for i in `seq 1 30`; do > ./t0008-ignores.sh --root=/tmp/foo/$i & > done It seems to hang on me with bash, but not dash, here. > Do you know which test it is hanging on? You mentioned that you can > replicate it outside of "prove"; what does running with "-v" say? > > The last test in t0008, with the fifos, would make me the most > suspicious. The way we do it _should_ be fine, but I'm wondering if the > shell is blocking in exec here: > > mkfifo in out && > (git check-ignore -n -v --stdin <in >out &) && > exec 9>in && > > That is, if the fifo is not opened for some reason by the backgrounded > process (there's a race, of course, but the outer shell should just > block until the sub-shell actually opens it). I wonder if the > descriptor-opening behavior of: > > cmd <in >out & > > is different between shells (that is, if it backgrounds the opening of > in and out on some shells, but not on others). But then I would expect > it to fail consistently. > > Just for fun, does switching the middle line there to: > > (sh -c "git check-ignore -n -v --stdin <in >out" &) && > > have any effect? > > -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] t0008: avoid SIGPIPE race condition on fifo 2013-07-11 16:09 ` Junio C Hamano @ 2013-07-12 10:35 ` Jeff King 2013-07-12 16:23 ` Junio C Hamano 2013-07-12 16:42 ` Brian Gernhardt 0 siblings, 2 replies; 10+ messages in thread From: Jeff King @ 2013-07-12 10:35 UTC (permalink / raw) To: Junio C Hamano; +Cc: Brian Gernhardt, git@vger.kernel.org List On Thu, Jul 11, 2013 at 09:09:55AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote: > > > >> The newest test in t0008 "streaming support for --stdin", seems to > >> hang sporadically on my MacBook Pro (running 10.8.4). The hang seems > >> to be related to running it in parallel with other tests, as I can > >> only reliably cause it by running with prove and -j 3. However, once > >> that has hung I am able to semi-reliably have it occur by running the > >> test separately (with the test hung in the background, using separate > >> trash directories via the --root option). > > > > I can't replicate the hang here (on Linux) doing: > > > > for i in `seq 1 30`; do > > ./t0008-ignores.sh --root=/tmp/foo/$i & > > done > > It seems to hang on me with bash, but not dash, here. Thanks, I was able to replicate it with bash, and like Brian, I saw it hanging in the second read. strace revealed that we were blocked on open("out"). The patch below should fix it. I'm still not sure why the choice of shell matters; it may simply be a timing fluke that bash is more likely to hit for some reason. -- >8 -- Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo To test check-ignore's --stdin feature, we use two fifos to send and receive data. We carefully keep a descriptor to its input open so that it does not receive EOF between input lines. However, we do not do the same for its output. That means there is a potential race condition in which check-ignore has opened the output pipe once (when we read the first line), and then writes the second line before we have re-opened the pipe. In that case, check-ignore gets a SIGPIPE and dies. The outer shell then tries to open the output fifo but blocks indefinitely, because there is no writer. We can fix it by keeping a descriptor open through the whole procedure. This should also help if check-ignore dies for any other reason (we would already have opened the fifo and would therefore not block, but just get EOF on read). However, we are technically still susceptible to check-ignore dying early, before we have opened the fifo. This is an unlikely race and shouldn't generally happen in practice, though, so we can hopefully ignore it. Signed-off-by: Jeff King <peff@peff.net> --- t/t0008-ignores.sh | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh index a56db80..c29342d 100755 --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -697,13 +697,21 @@ test_expect_success PIPE 'streaming support for --stdin' ' # shell, and then echo to the fd. We make sure to close it at # the end, so that the subprocess does get EOF and dies # properly. + # + # Similarly, we must keep "out" open so that check-ignore does + # not ever get SIGPIPE trying to write to us. Not only would that + # produce incorrect results, but then there would be no writer on the + # other end of the pipe, and we would potentially block forever trying + # to open it. exec 9>in && + exec 8<out && test_when_finished "exec 9>&-" && + test_when_finished "exec 8<&-" && echo >&9 one && - read response <out && + read response <&8 && echo "$response" | grep "^\.gitignore:1:one one" && echo >&9 two && - read response <out && + read response <&8 && echo "$response" | grep "^:: two" ' -- 1.8.3.rc1.30.gff0fb75 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo 2013-07-12 10:35 ` [PATCH] t0008: avoid SIGPIPE race condition on fifo Jeff King @ 2013-07-12 16:23 ` Junio C Hamano 2013-07-12 20:42 ` Jeff King 2013-07-12 16:42 ` Brian Gernhardt 1 sibling, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2013-07-12 16:23 UTC (permalink / raw) To: Jeff King; +Cc: Brian Gernhardt, git@vger.kernel.org List Jeff King <peff@peff.net> writes: > Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo > > To test check-ignore's --stdin feature, we use two fifos to > send and receive data. We carefully keep a descriptor to its > input open so that it does not receive EOF between input > lines. However, we do not do the same for its output. That > means there is a potential race condition in which > check-ignore has opened the output pipe once (when we read > the first line), and then writes the second line before we > have re-opened the pipe. > > In that case, check-ignore gets a SIGPIPE and dies. The > outer shell then tries to open the output fifo but blocks > indefinitely, because there is no writer. We can fix it by > keeping a descriptor open through the whole procedure. Ahh, figures. I wish I were smart enough to figure that out immediately after seeing the test that does funny things to "in" with "9". Thanks. > This should also help if check-ignore dies for any other > reason (we would already have opened the fifo and would > therefore not block, but just get EOF on read). > > However, we are technically still susceptible to > check-ignore dying early, before we have opened the fifo. > This is an unlikely race and shouldn't generally happen in > practice, though, so we can hopefully ignore it. > > Signed-off-by: Jeff King <peff@peff.net> > --- > t/t0008-ignores.sh | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh > index a56db80..c29342d 100755 > --- a/t/t0008-ignores.sh > +++ b/t/t0008-ignores.sh > @@ -697,13 +697,21 @@ test_expect_success PIPE 'streaming support for --stdin' ' > # shell, and then echo to the fd. We make sure to close it at > # the end, so that the subprocess does get EOF and dies > # properly. > + # > + # Similarly, we must keep "out" open so that check-ignore does > + # not ever get SIGPIPE trying to write to us. Not only would that > + # produce incorrect results, but then there would be no writer on the > + # other end of the pipe, and we would potentially block forever trying > + # to open it. > exec 9>in && > + exec 8<out && > test_when_finished "exec 9>&-" && > + test_when_finished "exec 8<&-" && > echo >&9 one && > - read response <out && > + read response <&8 && > echo "$response" | grep "^\.gitignore:1:one one" && > echo >&9 two && > - read response <out && > + read response <&8 && > echo "$response" | grep "^:: two" > ' ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo 2013-07-12 16:23 ` Junio C Hamano @ 2013-07-12 20:42 ` Jeff King 0 siblings, 0 replies; 10+ messages in thread From: Jeff King @ 2013-07-12 20:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Brian Gernhardt, git@vger.kernel.org List On Fri, Jul 12, 2013 at 09:23:54AM -0700, Junio C Hamano wrote: > > In that case, check-ignore gets a SIGPIPE and dies. The > > outer shell then tries to open the output fifo but blocks > > indefinitely, because there is no writer. We can fix it by > > keeping a descriptor open through the whole procedure. > > Ahh, figures. > > I wish I were smart enough to figure that out immediately after > seeing the test that does funny things to "in" with "9". I wish I were smart enough to have figured it out when I was writing the funny stuff with "in" and "9" in the first place. :) -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo 2013-07-12 10:35 ` [PATCH] t0008: avoid SIGPIPE race condition on fifo Jeff King 2013-07-12 16:23 ` Junio C Hamano @ 2013-07-12 16:42 ` Brian Gernhardt 1 sibling, 0 replies; 10+ messages in thread From: Brian Gernhardt @ 2013-07-12 16:42 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git@vger.kernel.org List On Jul 12, 2013, at 6:35 AM, Jeff King <peff@peff.net> wrote: > Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo Was able to complete a prove run with this patch. Many thanks. ~~ Brian ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: t0008 hang on streaming test (OS X) 2013-07-11 13:34 ` Jeff King 2013-07-11 16:09 ` Junio C Hamano @ 2013-07-11 16:13 ` Brian Gernhardt 1 sibling, 0 replies; 10+ messages in thread From: Brian Gernhardt @ 2013-07-11 16:13 UTC (permalink / raw) To: Jeff King; +Cc: git@vger.kernel.org List On Jul 11, 2013, at 9:34 AM, Jeff King <peff@peff.net> wrote: > On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote: > >> The newest test in t0008 "streaming support for --stdin", >> Experimentation has led me to find that it is hanging when trying to read the 2nd response from check-ignore. > Do you know which test it is hanging on? You mentioned that you can > replicate it outside of "prove"; what does running with "-v" say? > > The last test in t0008, with the fifos, would make me the most > suspicious. The 2nd `read response <out` line is where it was hanging, based on a variety of echos added to the test. ~~ Brian ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-07-12 20:42 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-07-10 16:36 t0008 hang on streaming test (OS X) Brian Gernhardt 2013-07-10 20:35 ` Antoine Pelisse 2013-07-11 16:14 ` Brian Gernhardt 2013-07-11 13:34 ` Jeff King 2013-07-11 16:09 ` Junio C Hamano 2013-07-12 10:35 ` [PATCH] t0008: avoid SIGPIPE race condition on fifo Jeff King 2013-07-12 16:23 ` Junio C Hamano 2013-07-12 20:42 ` Jeff King 2013-07-12 16:42 ` Brian Gernhardt 2013-07-11 16:13 ` t0008 hang on streaming test (OS X) Brian Gernhardt
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).