git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Fix early EOF with GfW daemon
@ 2018-04-12 21:07 Kim Gybels
  2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels
  2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels
  0 siblings, 2 replies; 15+ messages in thread
From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels

It has been reported [1] that cloning from a Git-for-Windows daemon would
sometimes fail with an early EOF error:

  $ git clone git://server/test
  Cloning into 'test'...
  remote: Counting objects: 36, done.
  remote: Compressing objects: 100% (24/24), done.
  fatal: read error: Invalid argument
  fatal: early EOF
  fatal: index-pack failed

These patches solve the issue by only changing git-daemon, its child processes
can remain unaware that stdin/stdout are actually network connections.

[1] https://github.com/git-for-windows/git/issues/304

Kim Gybels (2):
  daemon: use timeout for uninterruptible poll
  daemon: graceful shutdown of client connection

 daemon.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

-- 
2.17.0.windows.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels
@ 2018-04-12 21:07 ` Kim Gybels
  2018-04-13 12:36   ` Johannes Schindelin
  2018-04-15 21:54   ` Junio C Hamano
  2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels
  1 sibling, 2 replies; 15+ messages in thread
From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels

The poll provided in compat/poll.c is not interrupted by receiving
SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

Signed-off-by: Kim Gybels <kgybels@infogroep.be>
---
 daemon.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/daemon.c b/daemon.c
index fe833ea7de..6dc95c1b2f 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
 {
 	struct pollfd *pfd;
 	int i;
+	int poll_timeout = -1;
 
 	pfd = xcalloc(socklist->nr, sizeof(struct pollfd));
 
@@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
 		int i;
 
 		check_dead_children();
-
-		if (poll(pfd, socklist->nr, -1) < 0) {
+#ifdef NO_POLL
+		poll_timeout = live_children ? 100 : -1;
+#endif
+		int ret = poll(pfd, socklist->nr, poll_timeout);
+		if  (ret == 0) {
+			continue;
+		} else if (ret < 0) {
 			if (errno != EINTR) {
 				logerror("Poll failed, resuming: %s",
 				      strerror(errno));
-- 
2.17.0.windows.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] daemon: graceful shutdown of client connection
  2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels
  2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels
@ 2018-04-12 21:07 ` Kim Gybels
  2018-04-13 13:03   ` Johannes Schindelin
  1 sibling, 1 reply; 15+ messages in thread
From: Kim Gybels @ 2018-04-12 21:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Kim Gybels

On Windows, a connection is shutdown when the last open handle to it is
closed. When that last open handle is stdout of our child process, an
abortive shutdown is triggered when said process exits. Ensure a
graceful shutdown of the client connection by keeping an open handle
until we detect our child process has finished. This allows all the data
to be sent to the client, instead of being discarded.

Fixes https://github.com/git-for-windows/git/issues/304

Signed-off-by: Kim Gybels <kgybels@infogroep.be>
---
 daemon.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/daemon.c b/daemon.c
index 6dc95c1b2f..97fadd62d1 100644
--- a/daemon.c
+++ b/daemon.c
@@ -834,9 +834,10 @@ static struct child {
 	struct child *next;
 	struct child_process cld;
 	struct sockaddr_storage address;
+	int connection;
 } *firstborn;
 
-static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen)
+static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_t addrlen, int connection)
 {
 	struct child *newborn, **cradle;
 
@@ -844,6 +845,7 @@ static void add_child(struct child_process *cld, struct sockaddr *addr, socklen_
 	live_children++;
 	memcpy(&newborn->cld, cld, sizeof(*cld));
 	memcpy(&newborn->address, addr, addrlen);
+	newborn->connection = connection;
 	for (cradle = &firstborn; *cradle; cradle = &(*cradle)->next)
 		if (!addrcmp(&(*cradle)->address, &newborn->address))
 			break;
@@ -888,6 +890,7 @@ static void check_dead_children(void)
 			*cradle = blanket->next;
 			live_children--;
 			child_process_clear(&blanket->cld);
+			close(blanket->connection);
 			free(blanket);
 		} else
 			cradle = &blanket->next;
@@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 	}
 
 	cld.argv = cld_argv.argv;
-	cld.in = incoming;
+	cld.in = dup(incoming);
 	cld.out = dup(incoming);
 
 	if (start_command(&cld))
 		logerror("unable to fork");
 	else
-		add_child(&cld, addr, addrlen);
+		add_child(&cld, addr, addrlen, incoming);
 }
 
 static void child_handler(int signo)
-- 
2.17.0.windows.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels
@ 2018-04-13 12:36   ` Johannes Schindelin
  2018-04-15 17:08     ` Kim Gybels
  2018-04-15 21:54   ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Schindelin @ 2018-04-13 12:36 UTC (permalink / raw)
  To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King

Hi Kim,

On Thu, 12 Apr 2018, Kim Gybels wrote:

> The poll provided in compat/poll.c is not interrupted by receiving
> SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

Maybe say "When using this poll emulation, use a timeout ..."?

> diff --git a/daemon.c b/daemon.c
> index fe833ea7de..6dc95c1b2f 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
>  {
>  	struct pollfd *pfd;
>  	int i;
> +	int poll_timeout = -1;

Just reuse the line above:

	int poll_timeout = -1, i;

> @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
>  		int i;
>  
>  		check_dead_children();
> -
> -		if (poll(pfd, socklist->nr, -1) < 0) {
> +#ifdef NO_POLL
> +		poll_timeout = live_children ? 100 : -1;
> +#endif
> +		int ret = poll(pfd, socklist->nr, poll_timeout);
> +		if  (ret == 0) {
> +			continue;
> +		} else if (ret < 0) {

I would find it a bit easier on the eyes if this did not use curlies, and
dropped the unnecessary `else` (`continue` will take care of that):

		if (!ret)
			continue;
		if (ret < 0)
			[...]

Thank you for working on this!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] daemon: graceful shutdown of client connection
  2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels
@ 2018-04-13 13:03   ` Johannes Schindelin
  2018-04-15 20:21     ` Kim Gybels
  0 siblings, 1 reply; 15+ messages in thread
From: Johannes Schindelin @ 2018-04-13 13:03 UTC (permalink / raw)
  To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King

Hi Kim,

On Thu, 12 Apr 2018, Kim Gybels wrote:

> On Windows, a connection is shutdown when the last open handle to it is
> closed. When that last open handle is stdout of our child process, an
> abortive shutdown is triggered when said process exits. Ensure a
> graceful shutdown of the client connection by keeping an open handle
> until we detect our child process has finished. This allows all the data
> to be sent to the client, instead of being discarded.

Nice explanation!

> @@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
>  	}
>  
>  	cld.argv = cld_argv.argv;
> -	cld.in = incoming;
> +	cld.in = dup(incoming);

At first I was worried that somebody might want to remove this in the
future, but then I saw this line (which also calls dup()):

>  	cld.out = dup(incoming);
>  
>  	if (start_command(&cld))
>  		logerror("unable to fork");
>  	else
> -		add_child(&cld, addr, addrlen);
> +		add_child(&cld, addr, addrlen, incoming);
>  }
>  
>  static void child_handler(int signo)

Nice work!

I wonder whether you found a reliable way to trigger this? It would be
nice to have a regression test for this.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-13 12:36   ` Johannes Schindelin
@ 2018-04-15 17:08     ` Kim Gybels
  2018-04-18 21:16       ` Johannes Schindelin
  0 siblings, 1 reply; 15+ messages in thread
From: Kim Gybels @ 2018-04-15 17:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Jeff King

On (13/04/18 14:36), Johannes Schindelin wrote:
> > The poll provided in compat/poll.c is not interrupted by receiving
> > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.
> 
> Maybe say "When using this poll emulation, use a timeout ..."?

I will rewrite the commit message when I reroll the patch. Calling the
poll "uninterruptible" might be wrong as well, although the poll
doesn't return with EINTR when a child process terminates, it might
still be interruptible in other ways. On a related note, the handler
for SIGCHLD is simply not called in Git-for-Windows' daemon.

> > diff --git a/daemon.c b/daemon.c
> > index fe833ea7de..6dc95c1b2f 100644
> > --- a/daemon.c
> > +++ b/daemon.c
> > @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
> >  {
> >  	struct pollfd *pfd;
> >  	int i;
> > +	int poll_timeout = -1;
> 
> Just reuse the line above:
> 
> 	int poll_timeout = -1, i;

Sure.

> > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
> >  		int i;
> >  
> >  		check_dead_children();
> > -
> > -		if (poll(pfd, socklist->nr, -1) < 0) {
> > +#ifdef NO_POLL
> > +		poll_timeout = live_children ? 100 : -1;
> > +#endif
> > +		int ret = poll(pfd, socklist->nr, poll_timeout);
> > +		if  (ret == 0) {
> > +			continue;
> > +		} else if (ret < 0) {
> 
> I would find it a bit easier on the eyes if this did not use curlies, and
> dropped the unnecessary `else` (`continue` will take care of that):
> 
> 		if (!ret)
> 			continue;
> 		if (ret < 0)
> 			[...]

Funny, that's how I would normally write it, if I wasn't so focused on
trying to follow the coding quidelines. While I'm at it, I will also
fix that sneaky double space after the if.

Is it ok to add the timeout for all platforms using the poll
emulation, since I only tested for Windows?

Best regards,
Kim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] daemon: graceful shutdown of client connection
  2018-04-13 13:03   ` Johannes Schindelin
@ 2018-04-15 20:21     ` Kim Gybels
  2018-04-18 21:48       ` Johannes Schindelin
  0 siblings, 1 reply; 15+ messages in thread
From: Kim Gybels @ 2018-04-15 20:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Jeff King, Oleg Gubanov

On (13/04/18 15:03), Johannes Schindelin wrote:
> I wonder whether you found a reliable way to trigger this? It would be
> nice to have a regression test for this.

On my system, it reproduced reliably using Oleg's example [1], below is my bash
version of it.

Script to generate repository with some history:

  $ cat example.sh
  #!/bin/bash
  
  git init example
  cd example
  
  git --help > foo.txt
  
  for i in $(seq 1 12); do
      cat foo.txt foo.txt > bar.txt
      mv bar.txt foo.txt
      git add foo.txt
      git commit -m v$i
  done
  
  $ ./example.sh
  Initialized empty Git repository in C:/git/bug/example/.git/
  [master (root-commit) 2e44b4a] v1
   1 file changed, 84 insertions(+)
   create mode 100644 foo.txt
  [master 9791332] v2
   1 file changed, 84 insertions(+)
  [master 524e672] v3
   1 file changed, 168 insertions(+)
  [master afec6ef] v4
   1 file changed, 336 insertions(+)
  [master 1bcd9cc] v5
   1 file changed, 672 insertions(+)
  [master 2f38a8e] v6
   1 file changed, 1344 insertions(+)
  [master 33382fe] v7
   1 file changed, 2688 insertions(+)
  [master 6c2cbd6] v8
   1 file changed, 5376 insertions(+)
  [master 8d0770f] v9
   1 file changed, 10752 insertions(+)
  [master 517d650] v10
   1 file changed, 21504 insertions(+)
  [master 9e12406] v11
   1 file changed, 43008 insertions(+)
  [master 4c4f600] v12
   1 file changed, 86016 insertions(+)

Server side:

  $ git daemon --verbose --reuseaddr --base-path=$(pwd) --export-all
  [4760] Ready to rumble
  [696] Connection from 127.0.0.1:2054
  [696] unable to set SO_KEEPALIVE on socket: No such file or directory
  [696] Extended attribute "host": 127.0.0.1
  [696] Request upload-pack for '/example'

Client side:

  $ git clone git://127.0.0.1/example
  Cloning into 'example'...
  remote: Counting objects: 36, done.
  remote: Compressing objects: 100% (24/24), done.
  fatal: read error: Invalid argument
  fatal: early EOF
  fatal: index-pack failed

System information:

  $ git --version --build-options
  git version 2.17.0.windows.1
  cpu: x86_64
  built from commit: e7621d891d081acff6acd1f0ba6ae0adce06dd09
  sizeof-long: 4
  
  $ cmd.exe /c ver
  
  Microsoft Windows [Version 10.0.16299.371]

Best regards,
Kim

[1] https://github.com/git-for-windows/git/issues/304#issuecomment-274266897

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels
  2018-04-13 12:36   ` Johannes Schindelin
@ 2018-04-15 21:54   ` Junio C Hamano
  2018-04-15 22:16     ` Junio C Hamano
  2018-04-18 21:07     ` Johannes Schindelin
  1 sibling, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2018-04-15 21:54 UTC (permalink / raw)
  To: Kim Gybels; +Cc: git, Johannes Schindelin, Jeff King

Kim Gybels <kgybels@infogroep.be> writes:

> The poll provided in compat/poll.c is not interrupted by receiving
> SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

I think you identified the problem and diagnosed it correctly, but I
find that the change proposed here introduces a severe layering
violation.  The code is still calling what is called poll(), which
should not have such a broken semantics.

The ideal solution would be to fix the emulation so that it also
properly works for reaping a dead child process, but if that is not
possible, another solution that does not break the API layering
would probably be to introduce our own version of something similar
to poll() that helps various platforms that cannot implement the
real poll() faithfully for whatever reason.  Such an xpoll() API
function we introduce (and implement in compat/poll.c) may take, in
addition to the usual parameters to reall poll(), the value of
live_children we have at this call site.  With that

 - On platforms whose poll() does work correctly for culling dead
   children will just ignore the live_children paramater in its
   implementation of xpoll()

 - On other platforms, it will shorten the timeout depending on the
   need to cull dead children, just like your patch did.

Thanks.


>
> Signed-off-by: Kim Gybels <kgybels@infogroep.be>
> ---
>  daemon.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/daemon.c b/daemon.c
> index fe833ea7de..6dc95c1b2f 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
>  {
>  	struct pollfd *pfd;
>  	int i;
> +	int poll_timeout = -1;
>  
>  	pfd = xcalloc(socklist->nr, sizeof(struct pollfd));
>  
> @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
>  		int i;
>  
>  		check_dead_children();
> -
> -		if (poll(pfd, socklist->nr, -1) < 0) {
> +#ifdef NO_POLL
> +		poll_timeout = live_children ? 100 : -1;
> +#endif
> +		int ret = poll(pfd, socklist->nr, poll_timeout);
> +		if  (ret == 0) {
> +			continue;
> +		} else if (ret < 0) {
>  			if (errno != EINTR) {
>  				logerror("Poll failed, resuming: %s",
>  				      strerror(errno));

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-15 21:54   ` Junio C Hamano
@ 2018-04-15 22:16     ` Junio C Hamano
  2018-04-18 21:07     ` Johannes Schindelin
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2018-04-15 22:16 UTC (permalink / raw)
  To: Kim Gybels; +Cc: git, Johannes Schindelin, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> I think you identified the problem and diagnosed it correctly, but I
> find that the change proposed here introduces a severe layering
> violation.  The code is still calling what is called poll(), which
> should not have such a broken semantics.

I only mentioned a piece of fact (i.e. "the code calls poll() after
the patch"), but I guess I should have made it clear what makes that
a bad thing.  Future readers of the code in daemon.c are required to
be aware of the limitation of some poll() emulation; they cannot
"optimize" out and made the code unware of the (non-)existence of
remaining children, for example.  When the callsite uses poll(),
those who know how poll() ought to work won't be.  The reason why
the xpoll() I mentioned as a possible alternative would be better is
because they will learn why we do not use normal poll() there and
why we maintain and pass live_children (and those who cut and paste
without understanding the existing code _will_ copy the calling site
of xpoll(), which will automatically copy the need to maintain the
number of remaining children ;-).



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-15 21:54   ` Junio C Hamano
  2018-04-15 22:16     ` Junio C Hamano
@ 2018-04-18 21:07     ` Johannes Schindelin
  2018-04-18 21:51       ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Schindelin @ 2018-04-18 21:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kim Gybels, git, Jeff King

Hi Junio,

On Mon, 16 Apr 2018, Junio C Hamano wrote:

> Kim Gybels <kgybels@infogroep.be> writes:
> 
> > The poll provided in compat/poll.c is not interrupted by receiving
> > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.
> 
> I think you identified the problem and diagnosed it correctly, but I
> find that the change proposed here introduces a severe layering
> violation.  The code is still calling what is called poll(), which
> should not have such a broken semantics.

While I have sympathy for your desire to apply pure POSIX functionality,
the reality is that we do not have this luxury. Not if we want to support
Git on the still most prevalent development platform: Windows. On Windows,
you simply do not have that poll() that you are looking for.

In particular, there is no signal handling of the type you seem to want to
require.

As to the layering violation you mention, first a HN quote, just to loosen
the mood, and to at least partially ease the blow delivered by your mail:

	There is no such thing as a layering violation. You should be
	immediately suspicious of anyone who claims that there are such
	things.

;-)

Seriously again. If you care to have a look at the patch, you will see
that the loop (which will now benefit from Kim's timeout on platforms
without POSIX signal handling) *already* contains that call to
reap_dead_children().

In other words, you scolded Kim for something that this patch did not
introduce, but which was already there.

Unless I am misunderstanding violently what you say, that is, in which
case I would like to ask for a clarification why this patch (which does
not change a thing unless NO_POLL is defined!) must be rejected, and while
at it, I would like to ask you how introducing a layer of indirection with
a full new function that is at least moderately misleading (as it would be
named xpoll() despite your desire that it should do things that poll()
does *not* do) would be preferable to this here patch that changes but a
few lines to introduce a regular heartbeat check for platforms that
require it?

Thank you,
Dscho

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-15 17:08     ` Kim Gybels
@ 2018-04-18 21:16       ` Johannes Schindelin
  0 siblings, 0 replies; 15+ messages in thread
From: Johannes Schindelin @ 2018-04-18 21:16 UTC (permalink / raw)
  To: Kim Gybels; +Cc: Randall S. Becker, git, Junio C Hamano, Jeff King

Hi Kim,

On Sun, 15 Apr 2018, Kim Gybels wrote:

> On (13/04/18 14:36), Johannes Schindelin wrote:
> > > The poll provided in compat/poll.c is not interrupted by receiving
> > > SIGCHLD. Use a timeout for cleaning up dead children in a timely
> > > manner.
> > 
> > Maybe say "When using this poll emulation, use a timeout ..."?
> 
> I will rewrite the commit message when I reroll the patch. Calling the
> poll "uninterruptible" might be wrong as well, although the poll
> doesn't return with EINTR when a child process terminates, it might
> still be interruptible in other ways. On a related note, the handler
> for SIGCHLD is simply not called in Git-for-Windows' daemon.

Right. There is no signal infrastructure on Windows that is an exact
equivalent of what Junio desires.

> > > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
> > >  		int i;
> > >  
> > >  		check_dead_children();
> > > -
> > > -		if (poll(pfd, socklist->nr, -1) < 0) {
> > > +#ifdef NO_POLL
> > > +		poll_timeout = live_children ? 100 : -1;
> > > +#endif
> > > +		int ret = poll(pfd, socklist->nr, poll_timeout);
> > > +		if  (ret == 0) {
> > > +			continue;
> > > +		} else if (ret < 0) {
> > 
> > I would find it a bit easier on the eyes if this did not use curlies, and
> > dropped the unnecessary `else` (`continue` will take care of that):
> > 
> > 		if (!ret)
> > 			continue;
> > 		if (ret < 0)
> > 			[...]
> 
> Funny, that's how I would normally write it, if I wasn't so focused on
> trying to follow the coding quidelines. While I'm at it, I will also
> fix that sneaky double space after the if.

:-)

> Is it ok to add the timeout for all platforms using the poll
> emulation, since I only tested for Windows?

From my reading of the patch, it changes only one thing, and only in the
case that the developer asked to build with NO_POLL (which means that the
platform does not have a native poll()): instead of waiting indefinitely,
the poll() call is interrupted in regular intervals to give
reap_dead_children() a chance to clean up.

And that's all it does.

So it is a simply heartbeat for platforms that require it, and that
heartbeat would not even hurt any platform that would *not* require it.

In short: from my point of view, it is fine to add the timeout for all
NO_POLL platforms, even if it was only tested on Windows.

Of course, we *do* know that there is one other user of NO_POLL: the
NonStop platform.

Randall, would you mind testing these two patches on NonStop?

Thanks,
Johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] daemon: graceful shutdown of client connection
  2018-04-15 20:21     ` Kim Gybels
@ 2018-04-18 21:48       ` Johannes Schindelin
  0 siblings, 0 replies; 15+ messages in thread
From: Johannes Schindelin @ 2018-04-18 21:48 UTC (permalink / raw)
  To: Kim Gybels; +Cc: git, Junio C Hamano, Jeff King, Oleg Gubanov

Hi Kim,

On Sun, 15 Apr 2018, Kim Gybels wrote:

> On (13/04/18 15:03), Johannes Schindelin wrote:
> > I wonder whether you found a reliable way to trigger this? It would be
> > nice to have a regression test for this.
> 
> On my system, it reproduced reliably using Oleg's example [1], below is
> my bash version of it.

Okay.

> Script to generate repository with some history:
> 
>   $ cat example.sh
>   #!/bin/bash
>   
>   git init example
>   cd example
>   
>   git --help > foo.txt
>   
>   for i in $(seq 1 12); do
>       cat foo.txt foo.txt > bar.txt
>       mv bar.txt foo.txt
>       git add foo.txt
>       git commit -m v$i
>   done

Okay, so this sets up a minimal repository with a moderate size, inflating
foo.txt from the initial Git help text by factor 2**12 = 4096. The help
text is around 2kB, so we end up with an ~8MB large file in the end that
grew exponentially.

>   $ ./example.sh
>   Initialized empty Git repository in C:/git/bug/example/.git/
>   [master (root-commit) 2e44b4a] v1
>    1 file changed, 84 insertions(+)
>    create mode 100644 foo.txt
>   [master 9791332] v2
>    1 file changed, 84 insertions(+)
>   [master 524e672] v3
>    1 file changed, 168 insertions(+)
>   [master afec6ef] v4
>    1 file changed, 336 insertions(+)
>   [master 1bcd9cc] v5
>    1 file changed, 672 insertions(+)
>   [master 2f38a8e] v6
>    1 file changed, 1344 insertions(+)
>   [master 33382fe] v7
>    1 file changed, 2688 insertions(+)
>   [master 6c2cbd6] v8
>    1 file changed, 5376 insertions(+)
>   [master 8d0770f] v9
>    1 file changed, 10752 insertions(+)
>   [master 517d650] v10
>    1 file changed, 21504 insertions(+)
>   [master 9e12406] v11
>    1 file changed, 43008 insertions(+)
>   [master 4c4f600] v12
>    1 file changed, 86016 insertions(+)
> 
> Server side:
> 
>   $ git daemon --verbose --reuseaddr --base-path=$(pwd) --export-all
>   [4760] Ready to rumble
>   [696] Connection from 127.0.0.1:2054
>   [696] unable to set SO_KEEPALIVE on socket: No such file or directory
>   [696] Extended attribute "host": 127.0.0.1
>   [696] Request upload-pack for '/example'

I guess apart from the generated repo size (which would make this an
expensive test in and of itself), the fact that we have to run the daemon
(and that lib-git-daemon.sh requires the PIPE prerequisite which we
disabled on Windows) makes it hard to turn this into a regular regression
test in t/.

Although we *could* of course introduce a test that does not use
lib-git-daemon.sh and is hidden behind the EXPENSIVE prerequisite or some
such.

BTW I just tested this, and it indeed fixes the problem, so I am eager to
ship it to the Git for Windows users. Let's see whether we can convince
Junio that your quite small and easy-to-review patches are not too much of
a maintenance burden, and the fact that they fix a long-standing bug
should help.

BTW I had to apply this to build with DEVELOPER=1:

diff --git a/daemon.c b/daemon.c
index 97fadd62d10..1cc901e9739 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1162,13 +1162,13 @@ static int service_loop(struct socketlist
*socklist)
 	signal(SIGCHLD, child_handler);
 
 	for (;;) {
-		int i;
+		int i, ret;
 
 		check_dead_children();
 #ifdef NO_POLL
 		poll_timeout = live_children ? 100 : -1;
 #endif
-		int ret = poll(pfd, socklist->nr, poll_timeout);
+		ret = poll(pfd, socklist->nr, poll_timeout);
 		if  (ret == 0) {
 			continue;
 		} else if (ret < 0) {

Would you mind squashing that in?

Thanks,
Dscho

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-18 21:07     ` Johannes Schindelin
@ 2018-04-18 21:51       ` Junio C Hamano
  2018-04-19 21:33         ` Kim Gybels
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2018-04-18 21:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Kim Gybels, git, Jeff King

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Unless I am misunderstanding violently what you say, that is, in which
> case I would like to ask for a clarification why this patch (which does
> not change a thing unless NO_POLL is defined!) must be rejected, and while
> at it, I would like to ask you how introducing a layer of indirection with
> a full new function that is at least moderately misleading (as it would be
> named xpoll() despite your desire that it should do things that poll()
> does *not* do) would be preferable to this here patch that changes but a
> few lines to introduce a regular heartbeat check for platforms that

Our xwrite() and other xfoo() are to "fix" undesirable aspect of the
underlying pure POSIX API to make it more suitable for our codebase.
When pure POSIX poll() that requires the implementing or emulating
platform pays attention to the children being waited on is not
appropriate for the codepath we are using (i.e. the place where the
patch is touching), it would be in line to introduce a "fixed" API
that allows us to pass that information, so that we can build on top
of that abstraction that is *not* pure POSIX abstraction, no?  After
all, you are the one who constantly whine that Git is implemented on
POSIX API and it is inconvenient for other platforms.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-18 21:51       ` Junio C Hamano
@ 2018-04-19 21:33         ` Kim Gybels
  2018-04-19 23:18           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Kim Gybels @ 2018-04-19 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git, Jeff King

On (19/04/18 06:51), Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> > In other words, you scolded Kim for something that this patch did not
> > introduce, but which was already there.

I didn't feel scolded, just Junio raising a concern about maintainability of
the code.

> > Unless I am misunderstanding violently what you say, that is, in which
> > case I would like to ask for a clarification why this patch (which does
> > not change a thing unless NO_POLL is defined!) must be rejected, and while
> > at it, I would like to ask you how introducing a layer of indirection with
> > a full new function that is at least moderately misleading (as it would be
> > named xpoll() despite your desire that it should do things that poll()
> > does *not* do) would be preferable to this here patch that changes but a
> > few lines to introduce a regular heartbeat check for platforms that
> 
> Our xwrite() and other xfoo() are to "fix" undesirable aspect of the
> underlying pure POSIX API to make it more suitable for our codebase.
> When pure POSIX poll() that requires the implementing or emulating
> platform pays attention to the children being waited on is not
> appropriate for the codepath we are using (i.e. the place where the
> patch is touching), it would be in line to introduce a "fixed" API
> that allows us to pass that information, so that we can build on top
> of that abstraction that is *not* pure POSIX abstraction, no?  After
> all, you are the one who constantly whine that Git is implemented on
> POSIX API and it is inconvenient for other platforms.

There is another issue with the existing code that this new "xpoll" will need
to take into account. If a SIGCHLD arrives between the call to
check_dead_children and poll, the poll will not be interupted by it, resulting
in the child not being reaped until another child terminates or a client
connects. Currently, the effect is just a zombie process for a longer time,
however, the proposed patch (daemon: graceful shutdown of client connection)
relies on the cleanup to close the client connection.

When I have time, I will reroll including a change to ppoll.

-Kim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll
  2018-04-19 21:33         ` Kim Gybels
@ 2018-04-19 23:18           ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2018-04-19 23:18 UTC (permalink / raw)
  To: Kim Gybels; +Cc: Johannes Schindelin, git, Jeff King

Kim Gybels <kgybels@infogroep.be> writes:

>> > In other words, you scolded Kim for something that this patch did not
>> > introduce, but which was already there.
>
> I didn't feel scolded, just Junio raising a concern about maintainability of
> the code.

FWIW, I didn't mean to scold, either.

Rather I was pointing out that the code already maintains the number
of remaining children, which means that a more portable abstraction
than poll(), if we desired to have one, would merely be one step
away, as we already know that at least need that information to help
Windows.

> There is another issue with the existing code that this new
> "xpoll" will need to take into account. If a SIGCHLD arrives
> between the call to check_dead_children and poll, the poll will
> not be interupted by it, resulting in the child not being reaped
> until another child terminates or a client connects. Currently,
> the effect is just a zombie process for a longer time, however,
> the proposed patch (daemon: graceful shutdown of client
> connection) relies on the cleanup to close the client connection.

Good analysis.  That consideration may mean that xpoll() as I
suggested is useless as a possible more portable abstraction (or,
"an abstraction that is implementable easily in both POSIX and
non-POSIX world"), but I suspect we would still want to have an
internal "portable" API that serves the purpose similar to how we
wanted POSIX poll() to serve.  The place the patch is touching is
not the only place poll() is used in the codebase, and other places
(and future ones we would add) may benefit from having one.

Thanks for being constructive.



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-04-19 23:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-12 21:07 [PATCH 0/2] Fix early EOF with GfW daemon Kim Gybels
2018-04-12 21:07 ` [PATCH 1/2] daemon: use timeout for uninterruptible poll Kim Gybels
2018-04-13 12:36   ` Johannes Schindelin
2018-04-15 17:08     ` Kim Gybels
2018-04-18 21:16       ` Johannes Schindelin
2018-04-15 21:54   ` Junio C Hamano
2018-04-15 22:16     ` Junio C Hamano
2018-04-18 21:07     ` Johannes Schindelin
2018-04-18 21:51       ` Junio C Hamano
2018-04-19 21:33         ` Kim Gybels
2018-04-19 23:18           ` Junio C Hamano
2018-04-12 21:07 ` [PATCH 2/2] daemon: graceful shutdown of client connection Kim Gybels
2018-04-13 13:03   ` Johannes Schindelin
2018-04-15 20:21     ` Kim Gybels
2018-04-18 21:48       ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).