* [PATCH 0/2] Fix early EOF with GfW daemon @ 2018-04-12 21:07 Kim Gybels 2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels 2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels 0 siblings, 2 replies; 15+ messages in thread From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels It has been reported [1] that cloning from a Git-for-Windows daemon would sometimes fail with an early EOF error: $ git clone git://server/test Cloning into 'test'... remote: Counting objects: 36, done. remote: Compressing objects: 100% (24/24), done. fatal: read error: Invalid argument fatal: early EOF fatal: index-pack failed These patches solve the issue by only changing git-daemon, its child processes can remain unaware that stdin/stdout are actually network connections. [1] https://github.com/git-for-windows/git/issues/304 Kim Gybels (2): daemon: use timeout for uninterruptible poll daemon: graceful shutdown of client connection daemon.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) -- 2.17.0.windows.1 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels @ 2018-04-12 21:07 ` Kim Gybels 2018-04-13 12:36 ` Johannes Schindelin 2018-04-15 21:54 ` Junio C Hamano 2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels 1 sibling, 2 replies; 15+ messages in thread From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels The poll provided in compat/poll.c is not interrupted by receiving SIGCHLD. Use a timeout for cleaning up dead children in a timely manner. Signed-off-by: Kim Gybels <kgybels@infogroep.be> --- daemon.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/daemon.c b/daemon.c index fe833ea7de..6dc95c1b2f 100644 --- a/daemon.c +++ b/daemon.c @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist) { struct pollfd *pfd; int i; + int poll_timeout = -1; pfd = xcalloc(socklist->nr, sizeof(struct pollfd)); @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist) int i; check_dead_children(); - - if (poll(pfd, socklist->nr, -1) < 0) { +#ifdef NO_POLL + poll_timeout = live_children ? 100 : -1; +#endif + int ret = poll(pfd, socklist->nr, poll_timeout); + if (ret == 0) { + continue; + } else if (ret < 0) { if (errno != EINTR) { logerror("Poll failed, resuming: %s", strerror(errno)); -- 2.17.0.windows.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels @ 2018-04-13 12:36 ` Johannes Schindelin 2018-04-15 17:08 ` Kim Gybels 2018-04-15 21:54 ` Junio C Hamano 1 sibling, 1 reply; 15+ messages in thread From: Johannes Schindelin @ 2018-04-13 12:36 UTC (permalink / raw) To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King Hi Kim, On Thu, 12 Apr 2018, Kim Gybels wrote: > The poll provided in compat/poll.c is not interrupted by receiving > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner. Maybe say "When using this poll emulation, use a timeout ..."? > diff --git a/daemon.c b/daemon.c > index fe833ea7de..6dc95c1b2f 100644 > --- a/daemon.c > +++ b/daemon.c > @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist) > { > struct pollfd *pfd; > int i; > + int poll_timeout = -1; Just reuse the line above: int poll_timeout = -1, i; > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist) > int i; > > check_dead_children(); > - > - if (poll(pfd, socklist->nr, -1) < 0) { > +#ifdef NO_POLL > + poll_timeout = live_children ? 100 : -1; > +#endif > + int ret = poll(pfd, socklist->nr, poll_timeout); > + if (ret == 0) { > + continue; > + } else if (ret < 0) { I would find it a bit easier on the eyes if this did not use curlies, and dropped the unnecessary `else` (`continue` will take care of that): if (!ret) continue; if (ret < 0) [...] Thank you for working on this! Ciao, Dscho ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-13 12:36 ` Johannes Schindelin @ 2018-04-15 17:08 ` Kim Gybels 2018-04-18 21:16 ` Johannes Schindelin 0 siblings, 1 reply; 15+ messages in thread From: Kim Gybels @ 2018-04-15 17:08 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Junio C Hamano, Jeff King On (13/04/18 14:36), Johannes Schindelin wrote: > > The poll provided in compat/poll.c is not interrupted by receiving > > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner. > > Maybe say "When using this poll emulation, use a timeout ..."? I will rewrite the commit message when I reroll the patch. Calling the poll "uninterruptible" might be wrong as well, although the poll doesn't return with EINTR when a child process terminates, it might still be interruptible in other ways. On a related note, the handler for SIGCHLD is simply not called in Git-for-Windows' daemon. > > diff --git a/daemon.c b/daemon.c > > index fe833ea7de..6dc95c1b2f 100644 > > --- a/daemon.c > > +++ b/daemon.c > > @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist) > > { > > struct pollfd *pfd; > > int i; > > + int poll_timeout = -1; > > Just reuse the line above: > > int poll_timeout = -1, i; Sure. > > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist) > > int i; > > > > check_dead_children(); > > - > > - if (poll(pfd, socklist->nr, -1) < 0) { > > +#ifdef NO_POLL > > + poll_timeout = live_children ? 100 : -1; > > +#endif > > + int ret = poll(pfd, socklist->nr, poll_timeout); > > + if (ret == 0) { > > + continue; > > + } else if (ret < 0) { > > I would find it a bit easier on the eyes if this did not use curlies, and > dropped the unnecessary `else` (`continue` will take care of that): > > if (!ret) > continue; > if (ret < 0) > [...] Funny, that's how I would normally write it, if I wasn't so focused on trying to follow the coding quidelines. While I'm at it, I will also fix that sneaky double space after the if. Is it ok to add the timeout for all platforms using the poll emulation, since I only tested for Windows? Best regards, Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-15 17:08 ` Kim Gybels @ 2018-04-18 21:16 ` Johannes Schindelin 0 siblings, 0 replies; 15+ messages in thread From: Johannes Schindelin @ 2018-04-18 21:16 UTC (permalink / raw) To: Kim Gybels; +Cc: Randall S. Becker, git, Junio C Hamano, Jeff King Hi Kim, On Sun, 15 Apr 2018, Kim Gybels wrote: > On (13/04/18 14:36), Johannes Schindelin wrote: > > > The poll provided in compat/poll.c is not interrupted by receiving > > > SIGCHLD. Use a timeout for cleaning up dead children in a timely > > > manner. > > > > Maybe say "When using this poll emulation, use a timeout ..."? > > I will rewrite the commit message when I reroll the patch. Calling the > poll "uninterruptible" might be wrong as well, although the poll > doesn't return with EINTR when a child process terminates, it might > still be interruptible in other ways. On a related note, the handler > for SIGCHLD is simply not called in Git-for-Windows' daemon. Right. There is no signal infrastructure on Windows that is an exact equivalent of what Junio desires. > > > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist) > > > int i; > > > > > > check_dead_children(); > > > - > > > - if (poll(pfd, socklist->nr, -1) < 0) { > > > +#ifdef NO_POLL > > > + poll_timeout = live_children ? 100 : -1; > > > +#endif > > > + int ret = poll(pfd, socklist->nr, poll_timeout); > > > + if (ret == 0) { > > > + continue; > > > + } else if (ret < 0) { > > > > I would find it a bit easier on the eyes if this did not use curlies, and > > dropped the unnecessary `else` (`continue` will take care of that): > > > > if (!ret) > > continue; > > if (ret < 0) > > [...] > > Funny, that's how I would normally write it, if I wasn't so focused on > trying to follow the coding quidelines. While I'm at it, I will also > fix that sneaky double space after the if. :-) > Is it ok to add the timeout for all platforms using the poll > emulation, since I only tested for Windows? From my reading of the patch, it changes only one thing, and only in the case that the developer asked to build with NO_POLL (which means that the platform does not have a native poll()): instead of waiting indefinitely, the poll() call is interrupted in regular intervals to give reap_dead_children() a chance to clean up. And that's all it does. So it is a simply heartbeat for platforms that require it, and that heartbeat would not even hurt any platform that would *not* require it. In short: from my point of view, it is fine to add the timeout for all NO_POLL platforms, even if it was only tested on Windows. Of course, we *do* know that there is one other user of NO_POLL: the NonStop platform. Randall, would you mind testing these two patches on NonStop? Thanks, Johannes ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels 2018-04-13 12:36 ` Johannes Schindelin @ 2018-04-15 21:54 ` Junio C Hamano 2018-04-15 22:16 ` Junio C Hamano 2018-04-18 21:07 ` Johannes Schindelin 1 sibling, 2 replies; 15+ messages in thread From: Junio C Hamano @ 2018-04-15 21:54 UTC (permalink / raw) To: Kim Gybels; +Cc: git, Johannes Schindelin, Jeff King Kim Gybels <kgybels@infogroep.be> writes: > The poll provided in compat/poll.c is not interrupted by receiving > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner. I think you identified the problem and diagnosed it correctly, but I find that the change proposed here introduces a severe layering violation. The code is still calling what is called poll(), which should not have such a broken semantics. The ideal solution would be to fix the emulation so that it also properly works for reaping a dead child process, but if that is not possible, another solution that does not break the API layering would probably be to introduce our own version of something similar to poll() that helps various platforms that cannot implement the real poll() faithfully for whatever reason. Such an xpoll() API function we introduce (and implement in compat/poll.c) may take, in addition to the usual parameters to reall poll(), the value of live_children we have at this call site. With that - On platforms whose poll() does work correctly for culling dead children will just ignore the live_children paramater in its implementation of xpoll() - On other platforms, it will shorten the timeout depending on the need to cull dead children, just like your patch did. Thanks. > > Signed-off-by: Kim Gybels <kgybels@infogroep.be> > --- > daemon.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/daemon.c b/daemon.c > index fe833ea7de..6dc95c1b2f 100644 > --- a/daemon.c > +++ b/daemon.c > @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist) > { > struct pollfd *pfd; > int i; > + int poll_timeout = -1; > > pfd = xcalloc(socklist->nr, sizeof(struct pollfd)); > > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist) > int i; > > check_dead_children(); > - > - if (poll(pfd, socklist->nr, -1) < 0) { > +#ifdef NO_POLL > + poll_timeout = live_children ? 100 : -1; > +#endif > + int ret = poll(pfd, socklist->nr, poll_timeout); > + if (ret == 0) { > + continue; > + } else if (ret < 0) { > if (errno != EINTR) { > logerror("Poll failed, resuming: %s", > strerror(errno)); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-15 21:54 ` Junio C Hamano @ 2018-04-15 22:16 ` Junio C Hamano 2018-04-18 21:07 ` Johannes Schindelin 1 sibling, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2018-04-15 22:16 UTC (permalink / raw) To: Kim Gybels; +Cc: git, Johannes Schindelin, Jeff King Junio C Hamano <gitster@pobox.com> writes: > I think you identified the problem and diagnosed it correctly, but I > find that the change proposed here introduces a severe layering > violation. The code is still calling what is called poll(), which > should not have such a broken semantics. I only mentioned a piece of fact (i.e. "the code calls poll() after the patch"), but I guess I should have made it clear what makes that a bad thing. Future readers of the code in daemon.c are required to be aware of the limitation of some poll() emulation; they cannot "optimize" out and made the code unware of the (non-)existence of remaining children, for example. When the callsite uses poll(), those who know how poll() ought to work won't be. The reason why the xpoll() I mentioned as a possible alternative would be better is because they will learn why we do not use normal poll() there and why we maintain and pass live_children (and those who cut and paste without understanding the existing code _will_ copy the calling site of xpoll(), which will automatically copy the need to maintain the number of remaining children ;-). ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-15 21:54 ` Junio C Hamano 2018-04-15 22:16 ` Junio C Hamano @ 2018-04-18 21:07 ` Johannes Schindelin 2018-04-18 21:51 ` Junio C Hamano 1 sibling, 1 reply; 15+ messages in thread From: Johannes Schindelin @ 2018-04-18 21:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: Kim Gybels, git, Jeff King Hi Junio, On Mon, 16 Apr 2018, Junio C Hamano wrote: > Kim Gybels <kgybels@infogroep.be> writes: > > > The poll provided in compat/poll.c is not interrupted by receiving > > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner. > > I think you identified the problem and diagnosed it correctly, but I > find that the change proposed here introduces a severe layering > violation. The code is still calling what is called poll(), which > should not have such a broken semantics. While I have sympathy for your desire to apply pure POSIX functionality, the reality is that we do not have this luxury. Not if we want to support Git on the still most prevalent development platform: Windows. On Windows, you simply do not have that poll() that you are looking for. In particular, there is no signal handling of the type you seem to want to require. As to the layering violation you mention, first a HN quote, just to loosen the mood, and to at least partially ease the blow delivered by your mail: There is no such thing as a layering violation. You should be immediately suspicious of anyone who claims that there are such things. ;-) Seriously again. If you care to have a look at the patch, you will see that the loop (which will now benefit from Kim's timeout on platforms without POSIX signal handling) *already* contains that call to reap_dead_children(). In other words, you scolded Kim for something that this patch did not introduce, but which was already there. Unless I am misunderstanding violently what you say, that is, in which case I would like to ask for a clarification why this patch (which does not change a thing unless NO_POLL is defined!) must be rejected, and while at it, I would like to ask you how introducing a layer of indirection with a full new function that is at least moderately misleading (as it would be named xpoll() despite your desire that it should do things that poll() does *not* do) would be preferable to this here patch that changes but a few lines to introduce a regular heartbeat check for platforms that require it? Thank you, Dscho ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-18 21:07 ` Johannes Schindelin @ 2018-04-18 21:51 ` Junio C Hamano 2018-04-19 21:33 ` Kim Gybels 0 siblings, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2018-04-18 21:51 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Kim Gybels, git, Jeff King Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Unless I am misunderstanding violently what you say, that is, in which > case I would like to ask for a clarification why this patch (which does > not change a thing unless NO_POLL is defined!) must be rejected, and while > at it, I would like to ask you how introducing a layer of indirection with > a full new function that is at least moderately misleading (as it would be > named xpoll() despite your desire that it should do things that poll() > does *not* do) would be preferable to this here patch that changes but a > few lines to introduce a regular heartbeat check for platforms that Our xwrite() and other xfoo() are to "fix" undesirable aspect of the underlying pure POSIX API to make it more suitable for our codebase. When pure POSIX poll() that requires the implementing or emulating platform pays attention to the children being waited on is not appropriate for the codepath we are using (i.e. the place where the patch is touching), it would be in line to introduce a "fixed" API that allows us to pass that information, so that we can build on top of that abstraction that is *not* pure POSIX abstraction, no? After all, you are the one who constantly whine that Git is implemented on POSIX API and it is inconvenient for other platforms. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-18 21:51 ` Junio C Hamano @ 2018-04-19 21:33 ` Kim Gybels 2018-04-19 23:18 ` Junio C Hamano 0 siblings, 1 reply; 15+ messages in thread From: Kim Gybels @ 2018-04-19 21:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git, Jeff King On (19/04/18 06:51), Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > In other words, you scolded Kim for something that this patch did not > > introduce, but which was already there. I didn't feel scolded, just Junio raising a concern about maintainability of the code. > > Unless I am misunderstanding violently what you say, that is, in which > > case I would like to ask for a clarification why this patch (which does > > not change a thing unless NO_POLL is defined!) must be rejected, and while > > at it, I would like to ask you how introducing a layer of indirection with > > a full new function that is at least moderately misleading (as it would be > > named xpoll() despite your desire that it should do things that poll() > > does *not* do) would be preferable to this here patch that changes but a > > few lines to introduce a regular heartbeat check for platforms that > > Our xwrite() and other xfoo() are to "fix" undesirable aspect of the > underlying pure POSIX API to make it more suitable for our codebase. > When pure POSIX poll() that requires the implementing or emulating > platform pays attention to the children being waited on is not > appropriate for the codepath we are using (i.e. the place where the > patch is touching), it would be in line to introduce a "fixed" API > that allows us to pass that information, so that we can build on top > of that abstraction that is *not* pure POSIX abstraction, no? After > all, you are the one who constantly whine that Git is implemented on > POSIX API and it is inconvenient for other platforms. There is another issue with the existing code that this new "xpoll" will need to take into account. If a SIGCHLD arrives between the call to check_dead_children and poll, the poll will not be interupted by it, resulting in the child not being reaped until another child terminates or a client connects. Currently, the effect is just a zombie process for a longer time, however, the proposed patch (daemon: graceful shutdown of client connection) relies on the cleanup to close the client connection. When I have time, I will reroll including a change to ppoll. -Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll 2018-04-19 21:33 ` Kim Gybels @ 2018-04-19 23:18 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2018-04-19 23:18 UTC (permalink / raw) To: Kim Gybels; +Cc: Johannes Schindelin, git, Jeff King Kim Gybels <kgybels@infogroep.be> writes: >> > In other words, you scolded Kim for something that this patch did not >> > introduce, but which was already there. > > I didn't feel scolded, just Junio raising a concern about maintainability of > the code. FWIW, I didn't mean to scold, either. Rather I was pointing out that the code already maintains the number of remaining children, which means that a more portable abstraction than poll(), if we desired to have one, would merely be one step away, as we already know that at least need that information to help Windows. > There is another issue with the existing code that this new > "xpoll" will need to take into account. If a SIGCHLD arrives > between the call to check_dead_children and poll, the poll will > not be interupted by it, resulting in the child not being reaped > until another child terminates or a client connects. Currently, > the effect is just a zombie process for a longer time, however, > the proposed patch (daemon: graceful shutdown of client > connection) relies on the cleanup to close the client connection. Good analysis. That consideration may mean that xpoll() as I suggested is useless as a possible more portable abstraction (or, "an abstraction that is implementable easily in both POSIX and non-POSIX world"), but I suspect we would still want to have an internal "portable" API that serves the purpose similar to how we wanted POSIX poll() to serve. The place the patch is touching is not the only place poll() is used in the codebase, and other places (and future ones we would add) may benefit from having one. Thanks for being constructive. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] daemon: graceful shutdown of client connection 2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels 2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels @ 2018-04-12 21:07 ` Kim Gybels 2018-04-13 13:03 ` Johannes Schindelin 1 sibling, 1 reply; 15+ messages in thread From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels On Windows, a connection is shutdown when the last open handle to it is closed. When that last open handle is stdout of our child process, an abortive shutdown is triggered when said process exits. Ensure a graceful shutdown of the client connection by keeping an open handle until we detect our child process has finished. This allows all the data to be sent to the client, instead of being discarded. Fixes https://github.com/git-for-windows/git/issues/304 Signed-off-by: Kim Gybels <kgybels@infogroep.be> --- daemon.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/daemon.c b/daemon.c index 6dc95c1b2f..97fadd62d1 100644 --- a/daemon.c +++ b/daemon.c @@ -834,9 +834,10 @@ static struct child { struct child *next; struct child_process cld; struct sockaddr_storage address; + int connection; } *firstborn; -static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen) +static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen, int connection) { struct child *newborn, **cradle; @@ -844,6 +845,7 @@ static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_ live_children++; memcpy(&newborn->cld, cld, sizeof(*cld)); memcpy(&newborn->address, addr, addrlen); + newborn->connection = connection; for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next) if (!addrcmp(&(*cradle)->address, &newborn->address)) break; @@ -888,6 +890,7 @@ static void check_dead_children(void) *cradle = blanket->next; live_children--; child_process_clear(&blanket->cld); + close(blanket->connection); free(blanket); } else cradle = &blanket->next; @@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen) } cld.argv = cld_argv.argv; - cld.in = incoming; + cld.in = dup(incoming); cld.out = dup(incoming); if (start_command(&cld)) logerror("unable to fork"); else - add_child(&cld, addr, addrlen); + add_child(&cld, addr, addrlen, incoming); } static void child_handler(int signo) -- 2.17.0.windows.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] daemon: graceful shutdown of client connection 2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels @ 2018-04-13 13:03 ` Johannes Schindelin 2018-04-15 20:21 ` Kim Gybels 0 siblings, 1 reply; 15+ messages in thread From: Johannes Schindelin @ 2018-04-13 13:03 UTC (permalink / raw) To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King Hi Kim, On Thu, 12 Apr 2018, Kim Gybels wrote: > On Windows, a connection is shutdown when the last open handle to it is > closed. When that last open handle is stdout of our child process, an > abortive shutdown is triggered when said process exits. Ensure a > graceful shutdown of the client connection by keeping an open handle > until we detect our child process has finished. This allows all the data > to be sent to the client, instead of being discarded. Nice explanation! > @@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen) > } > > cld.argv = cld_argv.argv; > - cld.in = incoming; > + cld.in = dup(incoming); At first I was worried that somebody might want to remove this in the future, but then I saw this line (which also calls dup()): > cld.out = dup(incoming); > > if (start_command(&cld)) > logerror("unable to fork"); > else > - add_child(&cld, addr, addrlen); > + add_child(&cld, addr, addrlen, incoming); > } > > static void child_handler(int signo) Nice work! I wonder whether you found a reliable way to trigger this? It would be nice to have a regression test for this. Ciao, Dscho ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] daemon: graceful shutdown of client connection 2018-04-13 13:03 ` Johannes Schindelin @ 2018-04-15 20:21 ` Kim Gybels 2018-04-18 21:48 ` Johannes Schindelin 0 siblings, 1 reply; 15+ messages in thread From: Kim Gybels @ 2018-04-15 20:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Junio C Hamano, Jeff King, Oleg Gubanov On (13/04/18 15:03), Johannes Schindelin wrote: > I wonder whether you found a reliable way to trigger this? It would be > nice to have a regression test for this. On my system, it reproduced reliably using Oleg's example [1], below is my bash version of it. Script to generate repository with some history: $ cat example.sh #!/bin/bash git init example cd example git --help > foo.txt for i in $(seq 1 12); do cat foo.txt foo.txt > bar.txt mv bar.txt foo.txt git add foo.txt git commit -m v$i done $ ./example.sh Initialized empty Git repository in C:/git/bug/example/.git/ [master (root-commit) 2e44b4a] v1 1 file changed, 84 insertions(+) create mode 100644 foo.txt [master 9791332] v2 1 file changed, 84 insertions(+) [master 524e672] v3 1 file changed, 168 insertions(+) [master afec6ef] v4 1 file changed, 336 insertions(+) [master 1bcd9cc] v5 1 file changed, 672 insertions(+) [master 2f38a8e] v6 1 file changed, 1344 insertions(+) [master 33382fe] v7 1 file changed, 2688 insertions(+) [master 6c2cbd6] v8 1 file changed, 5376 insertions(+) [master 8d0770f] v9 1 file changed, 10752 insertions(+) [master 517d650] v10 1 file changed, 21504 insertions(+) [master 9e12406] v11 1 file changed, 43008 insertions(+) [master 4c4f600] v12 1 file changed, 86016 insertions(+) Server side: $ git daemon --verbose --reuseaddr --base-path=$(pwd) --export-all [4760] Ready to rumble [696] Connection from 127.0.0.1:2054 [696] unable to set SO_KEEPALIVE on socket: No such file or directory [696] Extended attribute "host": 127.0.0.1 [696] Request upload-pack for '/example' Client side: $ git clone git://127.0.0.1/example Cloning into 'example'... remote: Counting objects: 36, done. remote: Compressing objects: 100% (24/24), done. fatal: read error: Invalid argument fatal: early EOF fatal: index-pack failed System information: $ git --version --build-options git version 2.17.0.windows.1 cpu: x86_64 built from commit: e7621d891d081acff6acd1f0ba6ae0adce06dd09 sizeof-long: 4 $ cmd.exe /c ver Microsoft Windows [Version 10.0.16299.371] Best regards, Kim [1] https://github.com/git-for-windows/git/issues/304#issuecomment-274266897 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] daemon: graceful shutdown of client connection 2018-04-15 20:21 ` Kim Gybels @ 2018-04-18 21:48 ` Johannes Schindelin 0 siblings, 0 replies; 15+ messages in thread From: Johannes Schindelin @ 2018-04-18 21:48 UTC (permalink / raw) To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King, Oleg Gubanov Hi Kim, On Sun, 15 Apr 2018, Kim Gybels wrote: > On (13/04/18 15:03), Johannes Schindelin wrote: > > I wonder whether you found a reliable way to trigger this? It would be > > nice to have a regression test for this. > > On my system, it reproduced reliably using Oleg's example [1], below is > my bash version of it. Okay. > Script to generate repository with some history: > > $ cat example.sh > #!/bin/bash > > git init example > cd example > > git --help > foo.txt > > for i in $(seq 1 12); do > cat foo.txt foo.txt > bar.txt > mv bar.txt foo.txt > git add foo.txt > git commit -m v$i > done Okay, so this sets up a minimal repository with a moderate size, inflating foo.txt from the initial Git help text by factor 2**12 = 4096. The help text is around 2kB, so we end up with an ~8MB large file in the end that grew exponentially. > $ ./example.sh > Initialized empty Git repository in C:/git/bug/example/.git/ > [master (root-commit) 2e44b4a] v1 > 1 file changed, 84 insertions(+) > create mode 100644 foo.txt > [master 9791332] v2 > 1 file changed, 84 insertions(+) > [master 524e672] v3 > 1 file changed, 168 insertions(+) > [master afec6ef] v4 > 1 file changed, 336 insertions(+) > [master 1bcd9cc] v5 > 1 file changed, 672 insertions(+) > [master 2f38a8e] v6 > 1 file changed, 1344 insertions(+) > [master 33382fe] v7 > 1 file changed, 2688 insertions(+) > [master 6c2cbd6] v8 > 1 file changed, 5376 insertions(+) > [master 8d0770f] v9 > 1 file changed, 10752 insertions(+) > [master 517d650] v10 > 1 file changed, 21504 insertions(+) > [master 9e12406] v11 > 1 file changed, 43008 insertions(+) > [master 4c4f600] v12 > 1 file changed, 86016 insertions(+) > > Server side: > > $ git daemon --verbose --reuseaddr --base-path=$(pwd) --export-all > [4760] Ready to rumble > [696] Connection from 127.0.0.1:2054 > [696] unable to set SO_KEEPALIVE on socket: No such file or directory > [696] Extended attribute "host": 127.0.0.1 > [696] Request upload-pack for '/example' I guess apart from the generated repo size (which would make this an expensive test in and of itself), the fact that we have to run the daemon (and that lib-git-daemon.sh requires the PIPE prerequisite which we disabled on Windows) makes it hard to turn this into a regular regression test in t/. Although we *could* of course introduce a test that does not use lib-git-daemon.sh and is hidden behind the EXPENSIVE prerequisite or some such. BTW I just tested this, and it indeed fixes the problem, so I am eager to ship it to the Git for Windows users. Let's see whether we can convince Junio that your quite small and easy-to-review patches are not too much of a maintenance burden, and the fact that they fix a long-standing bug should help. BTW I had to apply this to build with DEVELOPER=1: diff --git a/daemon.c b/daemon.c index 97fadd62d10..1cc901e9739 100644 --- a/daemon.c +++ b/daemon.c @@ -1162,13 +1162,13 @@ static int service_loop(struct socketlist *socklist) signal(SIGCHLD, child_handler); for (;;) { - int i; + int i, ret; check_dead_children(); #ifdef NO_POLL poll_timeout = live_children ? 100 : -1; #endif - int ret = poll(pfd, socklist->nr, poll_timeout); + ret = poll(pfd, socklist->nr, poll_timeout); if (ret == 0) { continue; } else if (ret < 0) { Would you mind squashing that in? Thanks, Dscho ^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-04-19 23:18 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels 2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels 2018-04-13 12:36 ` Johannes Schindelin 2018-04-15 17:08 ` Kim Gybels 2018-04-18 21:16 ` Johannes Schindelin 2018-04-15 21:54 ` Junio C Hamano 2018-04-15 22:16 ` Junio C Hamano 2018-04-18 21:07 ` Johannes Schindelin 2018-04-18 21:51 ` Junio C Hamano 2018-04-19 21:33 ` Kim Gybels 2018-04-19 23:18 ` Junio C Hamano 2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels 2018-04-13 13:03 ` Johannes Schindelin 2018-04-15 20:21 ` Kim Gybels 2018-04-18 21:48 ` Johannes Schindelin
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).