git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCHv2] Improve use of select in http backend
@ 2011-11-02 20:21 Mika Fischer
  2011-11-02 20:21 ` [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw
  To: git; +Cc: gitster, daniel

I've split my previous patch into two, since the two parts actually do
different things.

The first patch uses curl_multi_fdset to use the file descriptors of the http
connections in the call to select, in effect waking up immediately when new
data arrives.

The second patch uses curl_multi_timeout (if available) to get a better
recommendation for how long to sleep from curl instead of always sleeping
for 50ms.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 20:21 [PATCHv2] Improve use of select in http backend Mika Fischer
@ 2011-11-02 20:21 ` Mika Fischer
  2011-11-02 20:32   ` Jeff King
  2011-11-02 20:21 ` [PATCH 2/2] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, Mika Fischer

Instead of sleeping unconditionally for a 50ms, when no data can be read
from the http connection(s), use curl_multi_fdset to obtain the actual
file descriptors of the open connections and use them in the select call.
This way, the 50ms sleep is interrupted when new data arrives.
---
 http.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/http.c b/http.c
index a4bc770..ae92318 100644
--- a/http.c
+++ b/http.c
@@ -664,14 +664,14 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
-			max_fd = 0;
+			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
+			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
 			select_timeout.tv_sec = 0;
 			select_timeout.tv_usec = 50000;
-			select(max_fd, &readfds, &writefds,
-			       &excfds, &select_timeout);
+			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
 #else
-- 
1.7.8.rc0.33.g09c6f.dirty

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/2] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-02 20:21 [PATCHv2] Improve use of select in http backend Mika Fischer
  2011-11-02 20:21 ` [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
@ 2011-11-02 20:21 ` Mika Fischer
  2011-11-03 23:14 ` [PATCHv2] Improve use of select in http backend Junio C Hamano
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
  3 siblings, 0 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, Mika Fischer

Recent versions of curl can suggest a period of time the library user
should sleep and try again, when curl is blocked on reading or writing
(or connecting). Use this timeout instead of always sleeping for 50ms.
---
 http.c |   22 ++++++++++++++++++++--
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index ae92318..e91a2ab 100644
--- a/http.c
+++ b/http.c
@@ -649,6 +649,9 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
+#if LIBCURL_VERSION_NUM >= 0x070f04
+	long curl_timeout;
+#endif
 	int finished = 0;
 
 	slot->finished = &finished;
@@ -664,13 +667,28 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
+#if LIBCURL_VERSION_NUM >= 0x070f04
+			curl_multi_timeout(curlm, &curl_timeout);
+			if (curl_timeout == 0) {
+				continue;
+			} else if (curl_timeout == -1) {
+				select_timeout.tv_sec  = 0;
+				select_timeout.tv_usec = 50000;
+			} else {
+				select_timeout.tv_sec  =  curl_timeout / 1000;
+				select_timeout.tv_usec = (curl_timeout % 1000) * 1000;
+			}
+#else
+			select_timeout.tv_sec  = 0;
+			select_timeout.tv_usec = 50000;
+#endif
+
 			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
 			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
-			select_timeout.tv_sec = 0;
-			select_timeout.tv_usec = 50000;
+
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
-- 
1.7.8.rc0.33.g09c6f.dirty

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 20:21 ` [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
@ 2011-11-02 20:32   ` Jeff King
  2011-11-02 20:35     ` Jeff King
  2011-11-02 22:22     ` Mika Fischer
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff King @ 2011-11-02 20:32 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, gitster, daniel

On Wed, Nov 02, 2011 at 09:21:27PM +0100, Mika Fischer wrote:

> Instead of sleeping unconditionally for a 50ms, when no data can be read
> from the http connection(s), use curl_multi_fdset to obtain the actual
> file descriptors of the open connections and use them in the select call.
> This way, the 50ms sleep is interrupted when new data arrives.
> ---
>  http.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/http.c b/http.c
> index a4bc770..ae92318 100644
> --- a/http.c
> +++ b/http.c
> @@ -664,14 +664,14 @@ void run_active_slot(struct active_request_slot *slot)
>  		}
>  
>  		if (slot->in_use && !data_received) {
> -			max_fd = 0;
> +			max_fd = -1;
>  			FD_ZERO(&readfds);
>  			FD_ZERO(&writefds);
>  			FD_ZERO(&excfds);
> +			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
>  			select_timeout.tv_sec = 0;
>  			select_timeout.tv_usec = 50000;
> -			select(max_fd, &readfds, &writefds,
> -			       &excfds, &select_timeout);
> +			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}

Do we still need to care about data_received?

My understanding was that the code was originally trying to do:

  1. Call curl, maybe get some data.

  2. If we got data, then ask curl against immediately for some data.

  3. Otherwise, sleep 50ms and then ask curl again.

But now that we are actually selecting on the proper descriptors, it
should now be safe to just do:

  1. Call curl, maybe get some data.

  2. Call select, which will wake immediately if curl is going to get
     data.

At least that's my reading. I am working on unrelated patches that clean
up the handling of data_received, but if it could go away altogether,
that would be even simpler.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 20:32   ` Jeff King
@ 2011-11-02 20:35     ` Jeff King
  2011-11-02 21:26       ` Junio C Hamano
  2011-11-02 22:22     ` Mika Fischer
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2011-11-02 20:35 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, gitster, daniel

On Wed, Nov 02, 2011 at 04:32:21PM -0400, Jeff King wrote:

> At least that's my reading. I am working on unrelated patches that clean
> up the handling of data_received, but if it could go away altogether,
> that would be even simpler.

That patch, btw, looks like this:

-- >8 --
Subject: [PATCH] http: remove "local" member from slot struct

The curl-multi http code does something like this:

  while (!finished) {
	  try_to_read_from_slots();
	  if (!data_received)
		  wait_for_50_ms();
  }

Which is horrible enough in itself, because of the
hard-coded 50ms wait.

But there's some additional complexity: the method for
finding whether we received data is to actually run ftell()
on the file we are writing to to see if it got anything. So
the "local" member of the slot struct contains the FILE
pointer. Except that sometimes we don't have a file, because
we're writing to a strbuf. In that case, since curl calls
our custom callback, we just increment the data_received
flag when curl gives data to our callback.

Let's do the same thing for the write-to-file case as we do
for the write-to-strbuf case: use a thin wrapper callback
and increment the received flag. This makes both methods
consistent with each other, and saves us from managing the
"local" struct member at all, reducing the code size.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c |   23 +++++++----------------
 http.h |    1 -
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/http.c b/http.c
index a4bc770..99386ef 100644
--- a/http.c
+++ b/http.c
@@ -93,6 +93,12 @@ curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp)
 }
 #endif
 
+size_t fwrite_file(char *ptr, size_t eltsize, size_t nmemb, void *handle)
+{
+	data_received++;
+	return fwrite(ptr, eltsize, nmemb, handle);
+}
+
 size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -538,7 +544,6 @@ struct active_request_slot *get_active_slot(void)
 
 	active_requests++;
 	slot->in_use = 1;
-	slot->local = NULL;
 	slot->results = NULL;
 	slot->finished = NULL;
 	slot->callback_data = NULL;
@@ -642,8 +647,6 @@ void step_active_slots(void)
 void run_active_slot(struct active_request_slot *slot)
 {
 #ifdef USE_CURL_MULTI
-	long last_pos = 0;
-	long current_pos;
 	fd_set readfds;
 	fd_set writefds;
 	fd_set excfds;
@@ -656,13 +659,6 @@ void run_active_slot(struct active_request_slot *slot)
 		data_received = 0;
 		step_active_slots();
 
-		if (!data_received && slot->local != NULL) {
-			current_pos = ftell(slot->local);
-			if (current_pos > last_pos)
-				data_received++;
-			last_pos = current_pos;
-		}
-
 		if (slot->in_use && !data_received) {
 			max_fd = 0;
 			FD_ZERO(&readfds);
@@ -818,13 +814,12 @@ static int http_request(const char *url, void *result, int target, int options)
 		if (target == HTTP_REQUEST_FILE) {
 			long posn = ftell(result);
 			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-					 fwrite);
+					 fwrite_file);
 			if (posn > 0) {
 				strbuf_addf(&buf, "Range: bytes=%ld-", posn);
 				headers = curl_slist_append(headers, buf.buf);
 				strbuf_reset(&buf);
 			}
-			slot->local = result;
 		} else
 			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
 					 fwrite_buffer);
@@ -871,7 +866,6 @@ static int http_request(const char *url, void *result, int target, int options)
 		ret = HTTP_START_FAILED;
 	}
 
-	slot->local = NULL;
 	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 
@@ -1066,7 +1060,6 @@ void release_http_pack_request(struct http_pack_request *preq)
 	if (preq->packfile != NULL) {
 		fclose(preq->packfile);
 		preq->packfile = NULL;
-		preq->slot->local = NULL;
 	}
 	if (preq->range_header != NULL) {
 		curl_slist_free_all(preq->range_header);
@@ -1088,7 +1081,6 @@ int finish_http_pack_request(struct http_pack_request *preq)
 
 	fclose(preq->packfile);
 	preq->packfile = NULL;
-	preq->slot->local = NULL;
 
 	lst = preq->lst;
 	while (*lst != p)
@@ -1157,7 +1149,6 @@ struct http_pack_request *new_http_pack_request(
 	}
 
 	preq->slot = get_active_slot();
-	preq->slot->local = preq->packfile;
 	curl_easy_setopt(preq->slot->curl, CURLOPT_FILE, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
diff --git a/http.h b/http.h
index 3c332a9..7429381 100644
--- a/http.h
+++ b/http.h
@@ -49,7 +49,6 @@ struct slot_results {
 
 struct active_request_slot {
 	CURL *curl;
-	FILE *local;
 	int in_use;
 	CURLcode curl_result;
 	long http_code;
-- 
1.7.7.rc3.12.g571d67

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 20:35     ` Jeff King
@ 2011-11-02 21:26       ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-11-02 21:26 UTC (permalink / raw
  To: Jeff King; +Cc: Mika Fischer, git, gitster, daniel

Jeff King <peff@peff.net> writes:

> On Wed, Nov 02, 2011 at 04:32:21PM -0400, Jeff King wrote:
>
>> At least that's my reading. I am working on unrelated patches that clean
>> up the handling of data_received, but if it could go away altogether,
>> that would be even simpler.
>
> That patch, btw, looks like this:
>
> -- >8 --
> Subject: [PATCH] http: remove "local" member from slot struct
>
> The curl-multi http code does something like this:
>
>   while (!finished) {
> 	  try_to_read_from_slots();
> 	  if (!data_received)
> 		  wait_for_50_ms();
>   }
>
> ...
> Let's do the same thing for the write-to-file case as we do
> for the write-to-strbuf case: use a thin wrapper callback
> and increment the received flag. This makes both methods
> consistent with each other, and saves us from managing the
> "local" struct member at all, reducing the code size.

Looks very sensible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 20:32   ` Jeff King
  2011-11-02 20:35     ` Jeff King
@ 2011-11-02 22:22     ` Mika Fischer
  2011-11-02 22:40       ` Daniel Stenberg
  1 sibling, 1 reply; 19+ messages in thread
From: Mika Fischer @ 2011-11-02 22:22 UTC (permalink / raw
  To: Jeff King; +Cc: git, gitster, daniel

On Wed, Nov 2, 2011 at 21:32, Jeff King <peff@peff.net> wrote:
> Do we still need to care about data_received?
>
> My understanding was that the code was originally trying to do:
>
>  1. Call curl, maybe get some data.
>
>  2. If we got data, then ask curl against immediately for some data.
>
>  3. Otherwise, sleep 50ms and then ask curl again.

Yes, that's exactly what it did.

> But now that we are actually selecting on the proper descriptors, it
> should now be safe to just do:
>
>  1. Call curl, maybe get some data.
>
>  2. Call select, which will wake immediately if curl is going to get
>     data.

The only problem I can see is that curl_multi_fdset is not guaranteed
to return any fds. So in theory it could be possible that we don't get
fds, but we're actually reading stuff. In this case things would get
slow, because we would sleep for 50ms after every read...

However, I don't know if this is a case that actually comes up in the
real world. Maybe Daniel has some advice on this.

Best,
 Mika

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-02 22:22     ` Mika Fischer
@ 2011-11-02 22:40       ` Daniel Stenberg
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Stenberg @ 2011-11-02 22:40 UTC (permalink / raw
  To: Mika Fischer; +Cc: Jeff King, git, gitster

On Wed, 2 Nov 2011, Mika Fischer wrote:

> The only problem I can see is that curl_multi_fdset is not guaranteed to 
> return any fds. So in theory it could be possible that we don't get fds, but 
> we're actually reading stuff. In this case things would get slow, because we 
> would sleep for 50ms after every read...
>
> However, I don't know if this is a case that actually comes up in the real 
> world. Maybe Daniel has some advice on this.

It doesn't really happen so it should be safe.

The case where no fds are returned is when libcurl cannot return a socket to 
wait for during name resolving (if your particular libcurl is built to use 
such a resolver backend - libcurl has several different ones). And during name 
resolving there won't be any data to read for the libcurl-app anyway.

-- 

  / daniel.haxx.se

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2] Improve use of select in http backend
  2011-11-02 20:21 [PATCHv2] Improve use of select in http backend Mika Fischer
  2011-11-02 20:21 ` [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
  2011-11-02 20:21 ` [PATCH 2/2] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
@ 2011-11-03 23:14 ` Junio C Hamano
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
  3 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-11-03 23:14 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, Jeff King, daniel

I re-read this thread once again, and my understanding of the current
situation on these two patch series is the following. Please correct me if
I am wrong.

 * This is not a regression fix, this is not a correctness fix, but it is
   a performance improvement;

 * Jeff gave an idea for improvement around the use of (rather, not having
   to use) data_received; and

 * Mika understood Jeff's suggestion, but was hesitant due to one
   potential issue around curl_multi_fdset() and asked Daniel's opinion,
   to which Daniel responded that the worrysome situation would not
   happen.

It appears to me that the next step is for Mika to decide either (1) we go
ahead with the original patch and leave the improvement for later, or (2)
update the patch as Jeff suggested and we review it again.

I can go either way, but whichever way you choose, I would want to see the
patches properly signed-off.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 0/3] Improve use of select in http backend
  2011-11-02 20:21 [PATCHv2] Improve use of select in http backend Mika Fischer
                   ` (2 preceding siblings ...)
  2011-11-03 23:14 ` [PATCHv2] Improve use of select in http backend Junio C Hamano
@ 2011-11-04 14:19 ` Mika Fischer
  2011-11-04 14:19   ` [PATCH v3 1/3] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
                     ` (3 more replies)
  3 siblings, 4 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 14:19 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, peff, Mika Fischer

Changes since v2:
- Properly signed off
- Incorporated Jeff's suggestion of getting rid of data_received
  altogether

And yes, this is just a performance improvement, not a bugfix.

Mika Fischer (3):
  http.c: Use curl_multi_fdset to select on curl fds instead of just
    sleeping
  http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  http.c: Rely on select instead of tracking whether data was received

 http.c |   42 +++++++++++++++++++++++-------------------
 http.h |    1 -
 2 files changed, 23 insertions(+), 20 deletions(-)

-- 
1.7.8.rc0.35.gd9f16.dirty

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 1/3] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
@ 2011-11-04 14:19   ` Mika Fischer
  2011-11-04 14:19   ` [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 14:19 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, peff, Mika Fischer

Instead of sleeping unconditionally for a 50ms, when no data can be read
from the http connection(s), use curl_multi_fdset to obtain the actual
file descriptors of the open connections and use them in the select call.
This way, the 50ms sleep is interrupted when new data arrives.

Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
---
 http.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/http.c b/http.c
index a4bc770..ae92318 100644
--- a/http.c
+++ b/http.c
@@ -664,14 +664,14 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
-			max_fd = 0;
+			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
+			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
 			select_timeout.tv_sec = 0;
 			select_timeout.tv_usec = 50000;
-			select(max_fd, &readfds, &writefds,
-			       &excfds, &select_timeout);
+			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
 #else
-- 
1.7.8.rc0.35.gd9f16.dirty

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
  2011-11-04 14:19   ` [PATCH v3 1/3] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
@ 2011-11-04 14:19   ` Mika Fischer
  2011-11-04 17:13     ` Junio C Hamano
  2011-11-04 14:19   ` [PATCH v3 3/3] http.c: Rely on select instead of tracking whether data was received Mika Fischer
  2011-11-04 17:53   ` [PATCH v3 0/3] Improve use of select in http backend Jeff King
  3 siblings, 1 reply; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 14:19 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, peff, Mika Fischer

Recent versions of curl can suggest a period of time the library user
should sleep and try again, when curl is blocked on reading or writing
(or connecting). Use this timeout instead of always sleeping for 50ms.

Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
---
 http.c |   22 ++++++++++++++++++++--
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index ae92318..e91a2ab 100644
--- a/http.c
+++ b/http.c
@@ -649,6 +649,9 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
+#if LIBCURL_VERSION_NUM >= 0x070f04
+	long curl_timeout;
+#endif
 	int finished = 0;
 
 	slot->finished = &finished;
@@ -664,13 +667,28 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
+#if LIBCURL_VERSION_NUM >= 0x070f04
+			curl_multi_timeout(curlm, &curl_timeout);
+			if (curl_timeout == 0) {
+				continue;
+			} else if (curl_timeout == -1) {
+				select_timeout.tv_sec  = 0;
+				select_timeout.tv_usec = 50000;
+			} else {
+				select_timeout.tv_sec  =  curl_timeout / 1000;
+				select_timeout.tv_usec = (curl_timeout % 1000) * 1000;
+			}
+#else
+			select_timeout.tv_sec  = 0;
+			select_timeout.tv_usec = 50000;
+#endif
+
 			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
 			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
-			select_timeout.tv_sec = 0;
-			select_timeout.tv_usec = 50000;
+
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
-- 
1.7.8.rc0.35.gd9f16.dirty

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 3/3] http.c: Rely on select instead of tracking whether data was received
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
  2011-11-04 14:19   ` [PATCH v3 1/3] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
  2011-11-04 14:19   ` [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
@ 2011-11-04 14:19   ` Mika Fischer
  2011-11-04 17:53   ` [PATCH v3 0/3] Improve use of select in http backend Jeff King
  3 siblings, 0 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 14:19 UTC (permalink / raw
  To: git; +Cc: gitster, daniel, peff, Mika Fischer

Since now select is used with the file descriptors of the http connections,
tracking whether data was received recently (and trying to read more in
that case) is no longer necessary. Instead, always call select and rely on
it to return as soon as new data can be read.

Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
---
 http.c |   16 +---------------
 http.h |    1 -
 2 files changed, 1 insertions(+), 16 deletions(-)

diff --git a/http.c b/http.c
index e91a2ab..3c6a00b 100644
--- a/http.c
+++ b/http.c
@@ -4,7 +4,6 @@
 #include "run-command.h"
 #include "url.h"
 
-int data_received;
 int active_requests;
 int http_is_verbose;
 size_t http_post_buffer = 16 * LARGE_PACKET_MAX;
@@ -99,13 +98,11 @@ size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	struct strbuf *buffer = buffer_;
 
 	strbuf_add(buffer, ptr, size);
-	data_received++;
 	return size;
 }
 
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
-	data_received++;
 	return eltsize * nmemb;
 }
 
@@ -642,8 +639,6 @@ void step_active_slots(void)
 void run_active_slot(struct active_request_slot *slot)
 {
 #ifdef USE_CURL_MULTI
-	long last_pos = 0;
-	long current_pos;
 	fd_set readfds;
 	fd_set writefds;
 	fd_set excfds;
@@ -656,17 +651,9 @@ void run_active_slot(struct active_request_slot *slot)
 
 	slot->finished = &finished;
 	while (!finished) {
-		data_received = 0;
 		step_active_slots();
 
-		if (!data_received && slot->local != NULL) {
-			current_pos = ftell(slot->local);
-			if (current_pos > last_pos)
-				data_received++;
-			last_pos = current_pos;
-		}
-
-		if (slot->in_use && !data_received) {
+		if (slot->in_use) {
 #if LIBCURL_VERSION_NUM >= 0x070f04
 			curl_multi_timeout(curlm, &curl_timeout);
 			if (curl_timeout == 0) {
@@ -1232,7 +1219,6 @@ static size_t fwrite_sha1_file(char *ptr, size_t eltsize, size_t nmemb,
 		git_SHA1_Update(&freq->c, expn,
 				sizeof(expn) - freq->stream.avail_out);
 	} while (freq->stream.avail_in && freq->zret == Z_OK);
-	data_received++;
 	return size;
 }
 
diff --git a/http.h b/http.h
index 3c332a9..71bdf58 100644
--- a/http.h
+++ b/http.h
@@ -89,7 +89,6 @@ extern void step_active_slots(void);
 extern void http_init(struct remote *remote, const char *url);
 extern void http_cleanup(void);
 
-extern int data_received;
 extern int active_requests;
 extern int http_is_verbose;
 extern size_t http_post_buffer;
-- 
1.7.8.rc0.35.gd9f16.dirty

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 14:19   ` [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
@ 2011-11-04 17:13     ` Junio C Hamano
  2011-11-04 17:47       ` Mika Fischer
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2011-11-04 17:13 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, daniel, peff

Mika Fischer <mika.fischer@zoopnet.de> writes:

> Recent versions of curl can suggest a period of time the library user
> should sleep and try again, when curl is blocked on reading or writing
> (or connecting). Use this timeout instead of always sleeping for 50ms.
>
> Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>

Thanks.

I'm inclined to squash in the following to narrow the scope of
curl_timeout, though.

diff --git a/http.c b/http.c
index 5cb0fb6..924be52 100644
--- a/http.c
+++ b/http.c
@@ -636,9 +636,6 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
-#if LIBCURL_VERSION_NUM >= 0x070f04
-	long curl_timeout;
-#endif
 	int finished = 0;
 
 	slot->finished = &finished;
@@ -655,6 +652,7 @@ void run_active_slot(struct active_request_slot *slot)
 
 		if (slot->in_use && !data_received) {
 #if LIBCURL_VERSION_NUM >= 0x070f04
+			long curl_timeout;
 			curl_multi_timeout(curlm, &curl_timeout);
 			if (curl_timeout == 0) {
 				continue;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 17:13     ` Junio C Hamano
@ 2011-11-04 17:47       ` Mika Fischer
  2011-11-04 17:51         ` Jeff King
  2011-11-04 18:02         ` Junio C Hamano
  0 siblings, 2 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 17:47 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, daniel, peff

On Fri, Nov 4, 2011 at 18:13, Junio C Hamano <gitster@pobox.com> wrote:
> I'm inclined to squash in the following to narrow the scope of
> curl_timeout, though.
>
> diff --git a/http.c b/http.c
> index 5cb0fb6..924be52 100644
> --- a/http.c
> +++ b/http.c
> @@ -636,9 +636,6 @@ void run_active_slot(struct active_request_slot *slot)
>        fd_set excfds;
>        int max_fd;
>        struct timeval select_timeout;
> -#if LIBCURL_VERSION_NUM >= 0x070f04
> -       long curl_timeout;
> -#endif
>        int finished = 0;
>
>        slot->finished = &finished;
> @@ -655,6 +652,7 @@ void run_active_slot(struct active_request_slot *slot)
>
>                if (slot->in_use && !data_received) {
>  #if LIBCURL_VERSION_NUM >= 0x070f04
> +                       long curl_timeout;
>                        curl_multi_timeout(curlm, &curl_timeout);
>                        if (curl_timeout == 0) {
>                                continue;

Ah yes, that's good. I would have done it this way in C++, but I
wasn't sure whether C99 is OK for git.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 17:47       ` Mika Fischer
@ 2011-11-04 17:51         ` Jeff King
  2011-11-04 17:55           ` Mika Fischer
  2011-11-04 18:02         ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2011-11-04 17:51 UTC (permalink / raw
  To: Mika Fischer; +Cc: Junio C Hamano, git, daniel

On Fri, Nov 04, 2011 at 06:47:44PM +0100, Mika Fischer wrote:

> >                if (slot->in_use && !data_received) {
> >  #if LIBCURL_VERSION_NUM >= 0x070f04
> > +                       long curl_timeout;
> >                        curl_multi_timeout(curlm, &curl_timeout);
> >                        if (curl_timeout == 0) {
> >                                continue;
> 
> Ah yes, that's good. I would have done it this way in C++, but I
> wasn't sure whether C99 is OK for git.

C99 is not OK. But this is not C99, as the conditional opens a new
block.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/3] Improve use of select in http backend
  2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
                     ` (2 preceding siblings ...)
  2011-11-04 14:19   ` [PATCH v3 3/3] http.c: Rely on select instead of tracking whether data was received Mika Fischer
@ 2011-11-04 17:53   ` Jeff King
  3 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2011-11-04 17:53 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, gitster, daniel

On Fri, Nov 04, 2011 at 03:19:24PM +0100, Mika Fischer wrote:

> Mika Fischer (3):
>   http.c: Use curl_multi_fdset to select on curl fds instead of just
>     sleeping
>   http.c: Use timeout suggested by curl instead of fixed 50ms timeout
>   http.c: Rely on select instead of tracking whether data was received

All three patches look good to me. Your 3/3 does most of the cleanup
from the other patch I posted, but we can also do this on top:

-- >8 --
Subject: [PATCH 4/3] http: drop "local" member from request struct

This is a FILE pointer in the case that we are sending our
output to a file. We originally used it to run ftell() to
determine whether data had been written to our file during
our last call to curl. However, as of the last patch, we no
longer care about that flag anymore. All uses of this struct
member are now just book-keeping that can go away.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c |    6 ------
 http.h |    1 -
 2 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/http.c b/http.c
index 3c6a00b..cfa9b07 100644
--- a/http.c
+++ b/http.c
@@ -535,7 +535,6 @@ struct active_request_slot *get_active_slot(void)
 
 	active_requests++;
 	slot->in_use = 1;
-	slot->local = NULL;
 	slot->results = NULL;
 	slot->finished = NULL;
 	slot->callback_data = NULL;
@@ -829,7 +828,6 @@ static int http_request(const char *url, void *result, int target, int options)
 				headers = curl_slist_append(headers, buf.buf);
 				strbuf_reset(&buf);
 			}
-			slot->local = result;
 		} else
 			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
 					 fwrite_buffer);
@@ -876,7 +874,6 @@ static int http_request(const char *url, void *result, int target, int options)
 		ret = HTTP_START_FAILED;
 	}
 
-	slot->local = NULL;
 	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 
@@ -1071,7 +1068,6 @@ void release_http_pack_request(struct http_pack_request *preq)
 	if (preq->packfile != NULL) {
 		fclose(preq->packfile);
 		preq->packfile = NULL;
-		preq->slot->local = NULL;
 	}
 	if (preq->range_header != NULL) {
 		curl_slist_free_all(preq->range_header);
@@ -1093,7 +1089,6 @@ int finish_http_pack_request(struct http_pack_request *preq)
 
 	fclose(preq->packfile);
 	preq->packfile = NULL;
-	preq->slot->local = NULL;
 
 	lst = preq->lst;
 	while (*lst != p)
@@ -1162,7 +1157,6 @@ struct http_pack_request *new_http_pack_request(
 	}
 
 	preq->slot = get_active_slot();
-	preq->slot->local = preq->packfile;
 	curl_easy_setopt(preq->slot->curl, CURLOPT_FILE, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
diff --git a/http.h b/http.h
index 71bdf58..ee16069 100644
--- a/http.h
+++ b/http.h
@@ -49,7 +49,6 @@ struct slot_results {
 
 struct active_request_slot {
 	CURL *curl;
-	FILE *local;
 	int in_use;
 	CURLcode curl_result;
 	long http_code;
-- 
1.7.7.2.4.gfd7b9

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 17:51         ` Jeff King
@ 2011-11-04 17:55           ` Mika Fischer
  0 siblings, 0 replies; 19+ messages in thread
From: Mika Fischer @ 2011-11-04 17:55 UTC (permalink / raw
  To: Jeff King; +Cc: git

On Fri, Nov 4, 2011 at 18:51, Jeff King <peff@peff.net> wrote:
>> Ah yes, that's good. I would have done it this way in C++, but I
>> wasn't sure whether C99 is OK for git.
>
> C99 is not OK. But this is not C99, as the conditional opens a new
> block.

Oh I see, thanks for clarifying!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
  2011-11-04 17:47       ` Mika Fischer
  2011-11-04 17:51         ` Jeff King
@ 2011-11-04 18:02         ` Junio C Hamano
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-11-04 18:02 UTC (permalink / raw
  To: Mika Fischer; +Cc: git, daniel, peff

Mika Fischer <mika.fischer@zoopnet.de> writes:

>>                if (slot->in_use && !data_received) {
>>  #if LIBCURL_VERSION_NUM >= 0x070f04
>> +                       long curl_timeout;
>>                        curl_multi_timeout(curlm, &curl_timeout);
>>                        if (curl_timeout == 0) {
>>                                continue;
>
> Ah yes, that's good. I would have done it this way in C++, but I
> wasn't sure whether C99 is OK for git.

I do not see anything C99 here.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2011-11-04 18:02 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-02 20:21 [PATCHv2] Improve use of select in http backend Mika Fischer
2011-11-02 20:21 ` [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
2011-11-02 20:32   ` Jeff King
2011-11-02 20:35     ` Jeff King
2011-11-02 21:26       ` Junio C Hamano
2011-11-02 22:22     ` Mika Fischer
2011-11-02 22:40       ` Daniel Stenberg
2011-11-02 20:21 ` [PATCH 2/2] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
2011-11-03 23:14 ` [PATCHv2] Improve use of select in http backend Junio C Hamano
2011-11-04 14:19 ` [PATCH v3 0/3] " Mika Fischer
2011-11-04 14:19   ` [PATCH v3 1/3] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping Mika Fischer
2011-11-04 14:19   ` [PATCH v3 2/3] http.c: Use timeout suggested by curl instead of fixed 50ms timeout Mika Fischer
2011-11-04 17:13     ` Junio C Hamano
2011-11-04 17:47       ` Mika Fischer
2011-11-04 17:51         ` Jeff King
2011-11-04 17:55           ` Mika Fischer
2011-11-04 18:02         ` Junio C Hamano
2011-11-04 14:19   ` [PATCH v3 3/3] http.c: Rely on select instead of tracking whether data was received Mika Fischer
2011-11-04 17:53   ` [PATCH v3 0/3] Improve use of select in http backend Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).