git@vger.kernel.org mailing list mirror (one of many)
* [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP
@ 2020-10-13 19:17 Sean McAllister
  2020-10-13 19:17 ` [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA Sean McAllister
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-13 19:17 UTC (permalink / raw)
  To: git, gitster, peff, masayasuzuki, jrnieder; +Cc: Sean McAllister

HTTP servers can sometimes throw errors, but that doesn't mean we should
give up.  If we have an error condition that we can reasonably retry on,
then we should.

This change is tricky because it requires a new CGI script to test
with, as we need to be able to instruct the server to return one or
more errors before succeeding.  We do this by encoding an error code
and an optional Retry-After value into the URL, followed by the real
endpoint:

  /error_ntime/dc724af1/<N>/429/10/smart/server

This is a bit hacky, but it is really the best we can do since HTTP is
fundamentally stateless.  The URL is made unique by a nonce, and we
leave a breadcrumb on disk so that all accesses after the first <N>
redirect to the appropriate endpoint.
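
As an illustration (a sketch, not part of the tests; it assumes the
test environment's $HTTPD_URL), a sequence of requests with <N>=3
behaves like:

  url="$HTTPD_URL/error_ntime/dc724af1/3/429/10/smart/server"
  curl -s -o /dev/null -w "%{http_code}\n" "$url"   # 429, Retry-After: 10
  curl -s -o /dev/null -w "%{http_code}\n" "$url"   # 429
  curl -s -o /dev/null -w "%{http_code}\n" "$url"   # 429
  curl -s -o /dev/null -w "%{http_code}\n" "$url"   # 302 -> /smart/server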

Signed-off-by: Sean McAllister <smcallis@google.com>
---
 t/lib-httpd.sh             |  8 ++++
 t/lib-httpd/apache.conf    |  9 +++++
 t/lib-httpd/error-ntime.sh | 80 ++++++++++++++++++++++++++++++++++++++
 t/t5601-clone.sh           |  9 +++++
 4 files changed, 106 insertions(+)
 create mode 100755 t/lib-httpd/error-ntime.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index d2edfa4c50..edc10b6f4b 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
 	install_script incomplete-length-upload-pack-v2-http.sh
 	install_script incomplete-body-upload-pack-v2-http.sh
 	install_script broken-smart-http.sh
+	install_script error-ntime.sh
 	install_script error-smart-http.sh
 	install_script error.sh
 	install_script apply-one-time-perl.sh
@@ -308,3 +309,10 @@ check_access_log() {
 		test_cmp "$1" access.log.stripped
 	fi
 }
+
+# generate a process unique one-up value
+global_counter_for_nonce=0
+gen_nonce () {
+	global_counter_for_nonce=$((global_counter_for_nonce + 1)) &&
+	echo "$global_counter_for_nonce"
+}
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index afa91e38b0..77c495e164 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -117,6 +117,12 @@ Alias /auth/dumb/ www/auth/dumb/
 	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
 	SetEnv GIT_HTTP_EXPORT_ALL
 </LocationMatch>
+
+# This may be suffixed with any path for redirection, so it should come before
+# any of the other aliases, particularly the /smart_*[^/]*/(.*) alias as that can
+# match a lot of URLs
+ScriptAlias /error_ntime/ error-ntime.sh/
+
 ScriptAlias /smart/incomplete_length/git-upload-pack incomplete-length-upload-pack-v2-http.sh/
 ScriptAlias /smart/incomplete_body/git-upload-pack incomplete-body-upload-pack-v2-http.sh/
 ScriptAliasMatch /error_git_upload_pack/(.*)/git-upload-pack error.sh/
@@ -137,6 +143,9 @@ ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1
 <Files broken-smart-http.sh>
 	Options ExecCGI
 </Files>
+<Files error-ntime.sh>
+	Options ExecCGI
+</Files>
 <Files error-smart-http.sh>
 	Options ExecCGI
 </Files>
diff --git a/t/lib-httpd/error-ntime.sh b/t/lib-httpd/error-ntime.sh
new file mode 100755
index 0000000000..64dc878746
--- /dev/null
+++ b/t/lib-httpd/error-ntime.sh
@@ -0,0 +1,80 @@
+#!/bin/sh
+
+# Script to simulate a transient error code with Retry-After header set.
+#
+# PATH_INFO must be of the form /<nonce>/<times>/<retcode>/<retry-after>/<path>
+#   (eg: /dc724af1/3/429/10/some/url)
+#
+# The <nonce> value uniquely identifies the URL.  Since we're simulating
+# a stateful operation using a stateless protocol, we need a way to "namespace"
+# URLs so that they don't step on each other.
+#
+# The first <times> times this endpoint is called, it will return the given
+# <retcode>, and if the <retry-after> is non-negative, it will set the
+# Retry-After header to that value.
+#
+# Subsequent calls will return a 302 redirect to <path>.
+#
+# Supported error codes are 429, 502, 503, and 504
+
+print_status() {
+	if [ "$1" -eq "302" ]; then
+		printf "Status: 302 Found\n"
+	elif [ "$1" -eq "429" ]; then
+		printf "Status: 429 Too Many Requests\n"
+	elif [ "$1" -eq "502" ]; then
+		printf "Status: 502 Bad Gateway\n"
+	elif [ "$1" -eq "503" ]; then
+		printf "Status: 503 Service Unavailable\n"
+	elif [ "$1" -eq "504" ]; then
+		printf "Status: 504 Gateway Timeout\n"
+	else
+		printf "Status: 500 Internal Server Error\n"
+	fi
+	printf "Content-Type: text/plain\n"
+}
+
+# set $@ to components of PATH_INFO
+IFS='/'
+set -f
+set -- $PATH_INFO
+set +f
+
+# pull out first four path components
+shift
+nonce=$1
+times=$2
+code=$3
+retry=$4
+shift 4
+
+# glue the rest back together as redirect path
+path=""
+while [ "$#" -gt "0" ]; do
+	path="${path}/$1"
+	shift
+done
+
+# leave a cookie for this request/retry count
+state_file="request_${REMOTE_ADDR}_${nonce}_${times}_${code}_${retry}"
+
+if [ ! -f "$state_file" ]; then
+	echo 0 > "$state_file"
+fi
+
+read -r cnt < "$state_file"
+if [ "$cnt" -lt "$times" ]; then
+	echo $((cnt+1)) > "$state_file"
+
+	# return error
+	print_status "$code"
+	if [ "$retry" -ge "0" ]; then
+		printf "Retry-After: %s\n" "$retry"
+	fi
+else
+	# redirect
+	print_status 302
+	printf "Location: %s?%s\n" "$path" "${QUERY_STRING}"
+fi
+
+echo
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 7df3c5373a..72aaed5a93 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -756,6 +756,15 @@ test_expect_success 'partial clone using HTTP' '
 	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
 '
 
+test_expect_success 'partial clone using HTTP with redirect' '
+	_NONCE=`gen_nonce` && export _NONCE &&
+	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
+	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
+	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
+	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"
+'
+
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
2.28.0.1011.ga647a8990f-goog


* [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 19:17 [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Sean McAllister
@ 2020-10-13 19:17 ` Sean McAllister
  2020-10-13 20:46   ` Junio C Hamano
  2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
  2020-10-13 20:40 ` [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Junio C Hamano
  2 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-13 19:17 UTC (permalink / raw)
  To: git, gitster, peff, masayasuzuki, jrnieder; +Cc: Sean McAllister

CURLOPT_FILE has been deprecated since 2003.

Signed-off-by: Sean McAllister <smcallis@google.com>
---
 http-push.c   | 6 +++---
 http-walker.c | 2 +-
 http.c        | 6 +++---
 remote-curl.c | 4 ++--
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/http-push.c b/http-push.c
index 6a4a43e07f..2e6fee3305 100644
--- a/http-push.c
+++ b/http-push.c
@@ -894,7 +894,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 	slot->results = &results;
 	curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	lock = xcalloc(1, sizeof(*lock));
 	lock->timeout = -1;
@@ -1151,7 +1151,7 @@ static void remote_ls(const char *path, int flags,
 	curl_setup_http(slot->curl, url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
@@ -1225,7 +1225,7 @@ static int locking_available(void)
 	curl_setup_http(slot->curl, repo->url, DAV_PROPFIND,
 			&out_buffer, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
 
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
diff --git a/http-walker.c b/http-walker.c
index 4fb1235cd4..6c630711d1 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -384,7 +384,7 @@ static void fetch_alternates(struct walker *walker, const char *base)
 	alt_req.walker = walker;
 	slot->callback_data = &alt_req;
 
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &buffer);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url.buf);
 
diff --git a/http.c b/http.c
index 8b23a546af..b3c1669388 100644
--- a/http.c
+++ b/http.c
@@ -1921,7 +1921,7 @@ static int http_request(const char *url,
 		curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
 	} else {
 		curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
-		curl_easy_setopt(slot->curl, CURLOPT_FILE, result);
+		curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, result);
 
 		if (target == HTTP_REQUEST_FILE) {
 			off_t posn = ftello(result);
@@ -2337,7 +2337,7 @@ struct http_pack_request *new_direct_http_pack_request(
 	}
 
 	preq->slot = get_active_slot();
-	curl_easy_setopt(preq->slot->curl, CURLOPT_FILE, preq->packfile);
+	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER,
@@ -2508,7 +2508,7 @@ struct http_object_request *new_http_object_request(const char *base_url,
 
 	freq->slot = get_active_slot();
 
-	curl_easy_setopt(freq->slot->curl, CURLOPT_FILE, freq);
+	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
 	curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr);
diff --git a/remote-curl.c b/remote-curl.c
index 32cc4a0c55..7f44fa30fe 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -847,7 +847,7 @@ static int probe_rpc(struct rpc_state *rpc, struct slot_results *results)
 	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, 4);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &buf);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buf);
 
 	err = run_slot(slot, results);
 
@@ -1012,7 +1012,7 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 	rpc_in_data.slot = slot;
 	rpc_in_data.check_pktline = stateless_connect;
 	memset(&rpc_in_data.pktline_state, 0, sizeof(rpc_in_data.pktline_state));
-	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
 
-- 
2.28.0.1011.ga647a8990f-goog


* [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 19:17 [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Sean McAllister
  2020-10-13 19:17 ` [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA Sean McAllister
@ 2020-10-13 19:17 ` Sean McAllister
  2020-10-13 21:14   ` Junio C Hamano
                     ` (2 more replies)
  2020-10-13 20:40 ` [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Junio C Hamano
  2 siblings, 3 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-13 19:17 UTC (permalink / raw)
  To: git, gitster, peff, masayasuzuki, jrnieder; +Cc: Sean McAllister

Some HTTP response codes indicate a server state that can support
retrying the request rather than immediately erroring out.  The server
can also provide information about how long to wait before retrying
via the Retry-After header.  So check the server response and retry
some reasonable number of times before erroring out, to better
accommodate transient errors.

Exiting immediately becomes irksome when pulling large multi-repo code
bases such as Android or Chromium, as often the entire fetch operation
has to be restarted from the beginning due to an error in one repo. If
we can reduce how often that occurs, then it's a big win.
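
As an illustrative sketch (not part of this patch; the repository URL
is hypothetical), the new knob can be set like any other http.* option:

  # allow up to 5 retries instead of the default 3, for one command
  git -c http.retryLimit=5 fetch https://example.com/repo.git

  # or persistently
  git config --global http.retryLimit 5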

Signed-off-by: Sean McAllister <smcallis@google.com>
---
 Documentation/config/http.txt |   5 ++
 http.c                        | 124 +++++++++++++++++++++++++++++++++-
 http.h                        |   2 +-
 remote-curl.c                 |   2 +-
 t/t5601-clone.sh              |   6 +-
 5 files changed, 133 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt
index 3968fbb697..5d5baef967 100644
--- a/Documentation/config/http.txt
+++ b/Documentation/config/http.txt
@@ -260,6 +260,11 @@ http.followRedirects::
 	the base for the follow-up requests, this is generally
 	sufficient. The default is `initial`.
 
+http.retryLimit::
+	Some HTTP status codes (eg: 429, 503) can reasonably be retried if
+	they're encountered.  This value configures the number of retry attempts
+	before giving up.  The default retry limit is 3.
+
 http.<url>.*::
 	Any of the http.* options above can be applied selectively to some URLs.
 	For a config key to match a URL, each element of the config key is
diff --git a/http.c b/http.c
index b3c1669388..f0147582f9 100644
--- a/http.c
+++ b/http.c
@@ -92,6 +92,9 @@ static const char *http_proxy_ssl_key;
 static const char *http_proxy_ssl_ca_info;
 static struct credential proxy_cert_auth = CREDENTIAL_INIT;
 static int proxy_ssl_cert_password_required;
+static int http_retry_limit = 3;
+static int http_default_delay_sec = 2;
+static int http_max_delay_sec = 60;
 
 static struct {
 	const char *name;
@@ -219,6 +222,56 @@ size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 	return nmemb;
 }
 
+
+/* return 1 for a retryable HTTP code, 0 otherwise */
+static int retryable_code(int code)
+{
+	switch (code) {
+	case 429: /* fallthrough */
+	case 502: /* fallthrough */
+	case 503: /* fallthrough */
+	case 504: return 1;
+	default:  return 0;
+	}
+}
+
+/*
+ * Query the value of an HTTP header.
+ *
+ * If the header is found, then a newly allocated string containing the value
+ * is returned.
+ *
+ * If not found, returns NULL
+ */
+static char *http_header_value(const struct strbuf headers, const char *header)
+{
+	const size_t header_len = strlen(header);
+	const char *beg = headers.buf, *end;
+	const char *ptr = strcasestr(beg, header), *eol;
+
+	while (ptr) {
+		/* headers should have no leading whitespace, and end with ':' */
+		end = ptr + header_len;
+		if ((ptr != beg && ptr[-1] != '\n') || *end != ':') {
+			ptr = strcasestr(end, header);
+			continue;
+		}
+
+		/* skip leading LWS */
+		ptr = end + 1;
+		while (*ptr && isspace(*ptr) && *ptr != '\r')
+			ptr++;
+
+		/* skip trailing LWS */
+		eol = strchrnul(ptr, '\r');
+		while (eol > ptr && isspace(eol[-1]))
+			eol--;
+
+		return xstrndup(ptr, eol-ptr);
+	}
+	return NULL;
+}
+
 static void closedown_active_slot(struct active_request_slot *slot)
 {
 	active_requests--;
@@ -452,6 +505,11 @@ static int http_options(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp("http.retrylimit", var)) {
+		http_retry_limit = git_config_int(var, value);
+		return 0;
+	}
+
 	/* Fall back on the default ones */
 	return git_default_config(var, value, cb);
 }
@@ -1668,7 +1726,7 @@ static int handle_curl_result(struct slot_results *results)
 }
 
 int run_one_slot(struct active_request_slot *slot,
-		 struct slot_results *results)
+		 struct slot_results *results, int *http_code)
 {
 	slot->results = results;
 	if (!start_active_slot(slot)) {
@@ -1678,6 +1736,8 @@ int run_one_slot(struct active_request_slot *slot,
 	}
 
 	run_active_slot(slot);
+	if (http_code)
+		*http_code = results->http_code;
 	return handle_curl_result(results);
 }
 
@@ -1903,20 +1963,58 @@ static void http_opt_request_remainder(CURL *curl, off_t pos)
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
 
+/*
+ * Check for a Retry-After header in the given headers string.  If found,
+ * honor it; otherwise do an exponential backoff, up to the max, on the
+ * current delay.
+ */
+static int http_retry_after(const struct strbuf headers, int cur_delay_sec)
+{
+	int delay_sec;
+	char *end;
+	char *value = http_header_value(headers, "retry-after");
+
+	if (value) {
+		delay_sec = strtol(value, &end, 0);
+		if (*value && *end == '\0' && delay_sec >= 0) {
+			free(value);
+			if (delay_sec > http_max_delay_sec) {
+				die(Q_("server requested retry after %d second,"
+					   " which is longer than max allowed",
+					   "server requested retry after %d seconds,"
+					   " which is longer than max allowed", delay_sec),
+					delay_sec);
+			}
+			return delay_sec;
+		}
+		free(value);
+	}
+
+	cur_delay_sec *= 2;
+	return cur_delay_sec >= http_max_delay_sec ? http_max_delay_sec : cur_delay_sec;
+}
+
 static int http_request(const char *url,
 			void *result, int target,
 			const struct http_get_options *options)
 {
 	struct active_request_slot *slot;
 	struct slot_results results;
-	struct curl_slist *headers = http_copy_default_headers();
+	struct curl_slist *headers;
 	struct strbuf buf = STRBUF_INIT;
+	struct strbuf result_headers = STRBUF_INIT;
 	const char *accept_language;
 	int ret;
+	int retry_cnt = 0;
+	int retry_delay_sec = http_default_delay_sec;
+	int http_code;
 
+retry:
 	slot = get_active_slot();
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, &result_headers);
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_buffer);
+
 	if (result == NULL) {
 		curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
 	} else {
@@ -1936,6 +2034,7 @@ static int http_request(const char *url,
 
 	accept_language = get_accept_language();
 
+	headers = http_copy_default_headers();
 	if (accept_language)
 		headers = curl_slist_append(headers, accept_language);
 
@@ -1961,7 +2060,26 @@ static int http_request(const char *url,
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
-	ret = run_one_slot(slot, &results);
+	http_code = 0;
+	ret = run_one_slot(slot, &results, &http_code);
+
+	/* remove header data fields since not all slots will use them */
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, NULL);
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, NULL);
+
+	if (ret != HTTP_OK) {
+		if (retryable_code(http_code) && retry_cnt < http_retry_limit) {
+			retry_cnt++;
+			retry_delay_sec = http_retry_after(result_headers, retry_delay_sec);
+			warning(Q_("got HTTP response %d, retrying after %d second (%d/%d)",
+					   "got HTTP response %d, retrying after %d seconds (%d/%d)",
+					   retry_delay_sec),
+				http_code, retry_delay_sec, retry_cnt, http_retry_limit);
+			sleep(retry_delay_sec);
+
+			goto retry;
+		}
+	}
 
 	if (options && options->content_type) {
 		struct strbuf raw = STRBUF_INIT;
diff --git a/http.h b/http.h
index 5de792ef3f..faf9f1060e 100644
--- a/http.h
+++ b/http.h
@@ -99,7 +99,7 @@ void finish_all_active_slots(void);
  *
  */
 int run_one_slot(struct active_request_slot *slot,
-		 struct slot_results *results);
+		 struct slot_results *results, int *http_code);
 
 #ifdef USE_CURL_MULTI
 void fill_active_slots(void);
diff --git a/remote-curl.c b/remote-curl.c
index 7f44fa30fe..2657c95bcb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -805,7 +805,7 @@ static int run_slot(struct active_request_slot *slot,
 	if (!results)
 		results = &results_buf;
 
-	err = run_one_slot(slot, results);
+	err = run_one_slot(slot, results, NULL);
 
 	if (err != HTTP_OK && err != HTTP_REAUTH) {
 		struct strbuf msg = STRBUF_INIT;
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 72aaed5a93..3965cd265d 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -757,13 +757,17 @@ test_expect_success 'partial clone using HTTP' '
 '
 
 test_expect_success 'partial clone using HTTP with redirect' '
-	_NONCE=`gen_nonce` && export _NONCE &&
+	_NONCE=$(gen_nonce) &&
 	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
 	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
 	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
 	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"
 '
 
+test_expect_success 'partial clone with retry' '
+	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/$(gen_nonce)/3/429/1/smart/server" 2>err &&
+	test_i18ngrep "got HTTP response 429" err
+'
 
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
-- 
2.28.0.1011.ga647a8990f-goog


* Re: [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP
  2020-10-13 19:17 [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Sean McAllister
  2020-10-13 19:17 ` [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA Sean McAllister
  2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
@ 2020-10-13 20:40 ` Junio C Hamano
  2020-10-14 16:56   ` Sean McAllister
  2 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2020-10-13 20:40 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, peff, masayasuzuki, jrnieder

Sean McAllister <smcallis@google.com> writes:

> +# generate a process unique one-up value
> +global_counter_for_nonce=0
> +gen_nonce () {
> +	global_counter_for_nonce=$((global_counter_for_nonce + 1)) &&
> +	echo "$global_counter_for_nonce"
> +}

This must not be called in a subprocess if the caller truly wants
uniqueness.  May want to be described in a comment.
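
A sketch of the pitfall, using the definition quoted above (purely
illustrative):

	a=$(gen_nonce)	# command substitution runs in a subshell,
	b=$(gen_nonce)	# so the parent's counter never advances
	test "$a" = "$b"	# both are "1"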

> diff --git a/t/lib-httpd/error-ntime.sh b/t/lib-httpd/error-ntime.sh
> new file mode 100755
> index 0000000000..64dc878746
> --- /dev/null
> +++ b/t/lib-httpd/error-ntime.sh
> @@ -0,0 +1,80 @@
> +#!/bin/sh
> +
> +# Script to simulate a transient error code with Retry-After header set.
> +#
> +# PATH_INFO must be of the form /<nonce>/<times>/<retcode>/<retry-after>/<path>
> +#   (eg: /dc724af1/3/429/10/some/url)
> +#
> > +# The <nonce> value uniquely identifies the URL.  Since we're simulating
> > +# a stateful operation using a stateless protocol, we need a way to "namespace"
> +# URLs so that they don't step on each other.
> +#
> +# The first <times> times this endpoint is called, it will return the given
> +# <retcode>, and if the <retry-after> is non-negative, it will set the
> > +# Retry-After header to that value.
> +#
> +# Subsequent calls will return a 302 redirect to <path>.
> +#
> > +# Supported error codes are 429, 502, 503, and 504
> +

I thought "error codes" were rephrased after the first round's
review to some other term (which I do not recall--was it status?)?

> +print_status() {
> +	if [ "$1" -eq "302" ]; then
> +		printf "Status: 302 Found\n"
> +	elif [ "$1" -eq "429" ]; then
> +		printf "Status: 429 Too Many Requests\n"
> +	elif [ "$1" -eq "502" ]; then
> +		printf "Status: 502 Bad Gateway\n"
> +	elif [ "$1" -eq "503" ]; then
> +		printf "Status: 503 Service Unavailable\n"
> +	elif [ "$1" -eq "504" ]; then
> +		printf "Status: 504 Gateway Timeout\n"
> +	else
> +		printf "Status: 500 Internal Server Error\n"
> +	fi
> +	printf "Content-Type: text/plain\n"
> +}

Style (Documentation/CodingGuidelines).

	print_status () {
		if test "$1" = "302"
		then
			printf "...";
		...
	}

but in this particular case, I do not see why we want if/else
cascade.  Perhaps

	print_status () {
		case "$1" in
		302)	printf "Status: 302 Found\n" ;;
		429)	... ;;
		...
		*)	printf "Status: 500 Internal Server Error\n" ;;
		esac
		printf "Content-Type: text/plain\n";
	}

would be more standard?

Also I am not sure why we want "printf ...\n" not "echo" here.  If
we want to talk HTTP ourselves strictly, I would understand avoiding
"echo" and doing "printf ...\r\n", though.  If we fear DOS line
endings coming out of localized "echo", and ensure we use LF line
ending even on Windows and Cygwin, it is sort of understandable but
if that is what is going on, that does not explain a lone "echo"
at the end of the caller.

Puzzled.

  +oIFS="$IFS"
> +IFS='/'
> +set -f
> +set -- $PATH_INFO
> +set +f
  +IFS="$oIFS"
> +
> +# pull out first four path components
> +shift
> +nonce=$1
> +times=$2
> +code=$3
> +retry=$4
> +shift 4
> +
> +# glue the rest back together as redirect path
> +path=""
> +while [ "$#" -gt "0" ]; do
> +	path="${path}/$1"
> +	shift
> +done

Hmph.  Would this work better, I wonder?

	path=${PATH_INFO#*/}	;# discard leading '/'
	nonce=${path%%/*}	path=${path#*/}
	times=${path%%/*}	path=${path#*/}
	code=${path%%/*}	path=${path#*/}
	retry=${path%%/*}	path=${path#*/}

At least it is shorter and easier to read.

> +# leave a cookie for this request/retry count
> +state_file="request_${REMOTE_ADDR}_${nonce}_${times}_${code}_${retry}"
> +
> +if [ ! -f "$state_file" ]; then
> +	echo 0 > "$state_file"
> +fi

Style (Documentation/CodingGuidelines, look for "For shell scripts
specifically").

 - use "test" not "[]";

 - don't write ";then" on the same line (rule of thumb. you should
   be able to write your shell scripts without semicolon except for
   double-semicolons in case/esac statements)

 - don't leave SP between redirection operator '>' and its target
   file, i.e. write 'echo 0 >"$state_file"'.

> +read -r cnt < "$state_file"
> +if [ "$cnt" -lt "$times" ]; then
> +	echo $((cnt+1)) > "$state_file"
> +
> +	# return error
> +	print_status "$code"
> +	if [ "$retry" -ge "0" ]; then
> +		printf "Retry-After: %s\n" "$retry"
> +	fi
> +else
> +	# redirect
> +	print_status 302
> +	printf "Location: %s?%s\n" "$path" "${QUERY_STRING}"
> +fi
> +
> +echo

This "echo" to the client also puzzles me further, after seeing
puzzling use of "printf ...\n" earlier.

> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index 7df3c5373a..72aaed5a93 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -756,6 +756,15 @@ test_expect_success 'partial clone using HTTP' '
>  	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
>  '
>  
> +test_expect_success 'partial clone using HTTP with redirect' '
> +	_NONCE=`gen_nonce` && export _NONCE &&
> +	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> +	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> +	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&

Why do we need to test "curl" here?  Is this a remnant of debugging
of the server side?

> +	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"
> +'
> +
> +
>  # DO NOT add non-httpd-specific tests here, because the last part of this
>  # test script is only executed when httpd is available and enabled.

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 19:17 ` [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA Sean McAllister
@ 2020-10-13 20:46   ` Junio C Hamano
  2020-10-13 20:58     ` Jeff King
                       ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Junio C Hamano @ 2020-10-13 20:46 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, peff, masayasuzuki, jrnieder

Sean McAllister <smcallis@google.com> writes:

> CURLOPT_FILE has been deprecated since 2003.

I thought that Dscho already mentioned this, but updating the above
description to mention that _WRITEDATA was introduced to overtake
_FILE as an equivalent in the same timeframe would be more helpful
to readers.


> Signed-off-by: Sean McAllister <smcallis@google.com>
> ---
>  http-push.c   | 6 +++---
>  http-walker.c | 2 +-
>  http.c        | 6 +++---
>  remote-curl.c | 4 ++--
>  4 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/http-push.c b/http-push.c
> index 6a4a43e07f..2e6fee3305 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -894,7 +894,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
>  	slot->results = &results;
>  	curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
>  	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
> -	curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
>   ...

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 20:46   ` Junio C Hamano
@ 2020-10-13 20:58     ` Jeff King
  2020-10-13 21:16       ` Daniel Stenberg
  2020-10-14 14:29     ` Sean McAllister
  2020-10-14 17:11     ` Sean McAllister
  2 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2020-10-13 20:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sean McAllister, git, masayasuzuki, jrnieder

On Tue, Oct 13, 2020 at 01:46:07PM -0700, Junio C Hamano wrote:

> Sean McAllister <smcallis@google.com> writes:
> 
> > CURLOPT_FILE has been deprecated since 2003.
> 
> I thought that Dscho already mentioned this, but updating the above
> description to mention that _WRITEDATA was introduced to overtake
> _FILE as an equivalent in the same timeframe would be more helpful
> to readers.

Yes. But more important:

  - when is _FILE going away (or has it already in some versions)?

  - when did _WRITEDATA appear?

IOW, as a reviewer I would want to make sure that we are not losing
support for any reasonable version of libcurl, or that we are at least
getting something in return (fixing an incompatibility with newer
versions).

From the link Dscho dug up it looks like the answer to the second one is
"long enough not to care", but the commit message should make that
plain.

-Peff

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
@ 2020-10-13 21:14   ` Junio C Hamano
  2020-10-14 14:28     ` Sean McAllister
  2020-10-13 21:14   ` Jeff King
  2020-10-14 19:55   ` Jeff King
  2 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2020-10-13 21:14 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, peff, masayasuzuki, jrnieder

Sean McAllister <smcallis@google.com> writes:

>  test_expect_success 'partial clone using HTTP with redirect' '
> -	_NONCE=`gen_nonce` && export _NONCE &&
> +	_NONCE=$(gen_nonce) &&
>  	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
>  	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
>  	curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
>  	partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"

It also bothers me somewhat that we waste 40 seconds waiting to
complete this test, most of the time just sleeping.

Now, do we still need gen_nonce in the test library, now there is
only one or two tests, both in a single test script?

Thanks.

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
  2020-10-13 21:14   ` Junio C Hamano
@ 2020-10-13 21:14   ` Jeff King
  2020-10-13 23:45     ` brian m. carlson
  2020-10-14 19:09     ` Sean McAllister
  2020-10-14 19:55   ` Jeff King
  2 siblings, 2 replies; 30+ messages in thread
From: Jeff King @ 2020-10-13 21:14 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, gitster, masayasuzuki, jrnieder

On Tue, Oct 13, 2020 at 01:17:29PM -0600, Sean McAllister wrote:

> Some HTTP response codes indicate a server state that can support
> retrying the request rather than immediately erroring out.  The server
> can also provide information about how long to wait before retrying
> via the Retry-After header.  So check the server response and retry
> some reasonable number of times before erroring out, to better
> accommodate transient errors.
> 
> Exiting immediately becomes irksome when pulling large multi-repo code
> bases such as Android or Chromium, as often the entire fetch operation
> has to be restarted from the beginning due to an error in one repo. If
> we can reduce how often that occurs, then it's a big win.

I had hoped that libcurl might have some retry mechanisms, since the
curl command-line tool has several --retry-* options. But it looks like
that is all only at the tool level, and the library code doesn't know
anything about it. So we are stuck driving the process ourselves.

I do think you could be leveraging CURLINFO_RETRY_AFTER rather than
implementing your own header parsing, though.
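
For illustration, a minimal sketch of that approach; it assumes a
configured CURL handle named "curl" and needs curl 7.66.0 or newer,
where CURLINFO_RETRY_AFTER first appeared:

	curl_off_t retry_after = 0;	/* stays 0 if no Retry-After came back */
	if (!curl_easy_getinfo(curl, CURLINFO_RETRY_AFTER, &retry_after) &&
	    retry_after > 0)
		sleep((unsigned int)retry_after);	/* else keep our own backoff */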

>  static int http_request(const char *url,
>  			void *result, int target,
>  			const struct http_get_options *options)
>  {

It looks like you trigger retries only from this function. But this
doesn't cover all http requests that Git makes. That might be sufficient
for your purposes (I think it would catch all of the initial contact),
but it might not (it probably doesn't cover subsequent POSTs for fetch
negotiation nor pack push; likewise I'm not sure if it covers much of
anything after v2 stateless-connect is established).

>  	struct active_request_slot *slot;
>  	struct slot_results results;
> -	struct curl_slist *headers = http_copy_default_headers();
> +	struct curl_slist *headers;

So here we stop copying the headers at the top of the function...

> [...]
> +retry:
> [...]
> +	headers = http_copy_default_headers();
>  	if (accept_language)
>  		headers = curl_slist_append(headers, accept_language);

And instead set them up totally here. Which make some sense, because we
wouldn't want to append accept_language over and over. But who frees the
old ones? There is a call to curl_slist_free_all(headers) later in the
function, but it's after your "goto retry". So I think each retry would
leak another copy of the list.

The ideal thing would probably be to create the header list once, and
then use it for each retry. That would require reordering some of the
setup. If that's too much, then it would be OK to just create a new list
from scratch on each call. Though in the latter case I suspect it may be
simpler to wrap the whole function, like:

  static int http_request(...)
  {
	int try;
	int result;
	for (try = 0; try < max_retries; try++) {
		result = http_request_try(...);
		if (...result is not retryable...)
			break;
	}
	return result;
  }

and then we'd know that the single-try function just needs to be
self-contained, without worrying about gotos jumping around in it.

-Peff

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 20:58     ` Jeff King
@ 2020-10-13 21:16       ` Daniel Stenberg
  2020-10-14 14:57         ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Stenberg @ 2020-10-13 21:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Sean McAllister, git, masayasuzuki, jrnieder

On Tue, 13 Oct 2020, Jeff King wrote:

Let me just inject myself here and comment on two curl things.

>  - when is _FILE going away (or has it already in some versions)?

It will be kept around *at least* for as long as libcurl supports the version 
7 API: for the foreseeable future.  Possibly decades.

>  - when did _WRITEDATA appear?

All curl symbols ever introduced are clearly documented, along with the
version in which they appeared.  See:

   https://curl.haxx.se/libcurl/c/symbols-in-versions.html

To map the version numbers to release dates, this second table comes handy:

   https://curl.haxx.se/docs/releases.html

CURLOPT_WRITEDATA was added in curl 7.9.7, shipped on May 10 2002.

In the curl project we often see users use *very* old versions but 18 years is 
longer than even the most conservative users I've seen...

-- 

  / daniel.haxx.se (with my curl hat on)

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 21:14   ` Jeff King
@ 2020-10-13 23:45     ` brian m. carlson
  2020-10-14 15:17       ` Jeff King
  2020-10-14 19:09     ` Sean McAllister
  1 sibling, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2020-10-13 23:45 UTC (permalink / raw)
  To: Jeff King; +Cc: Sean McAllister, git, gitster, masayasuzuki, jrnieder

On 2020-10-13 at 21:14:53, Jeff King wrote:
> On Tue, Oct 13, 2020 at 01:17:29PM -0600, Sean McAllister wrote:
> >  static int http_request(const char *url,
> >  			void *result, int target,
> >  			const struct http_get_options *options)
> >  {
> 
> It looks like you trigger retries only from this function. But this
> doesn't cover all http requests that Git makes. That might be sufficient
> for your purposes (I think it would catch all of the initial contact),
> but it might not (it probably doesn't cover subsequent POSTs for fetch
> negotiation nor pack push; likewise I'm not sure if it covers much of
> anything after v2 stateless-connect is established).

Yeah, I was about to mention the same thing.  It looks like we cover
only a subset of requests.  Moreover, I think this feature is going to
practically fail in some cases and we need to either document that
clearly or abandon this effort.

In remote-curl.c, we have post_rpc, which does a POST request to upload
data for a push.  However, if the data is larger than the buffer, we
stream it using chunked transfer-encoding.  Because we're reading from a
pipe, that data cannot be retried: the pack-objects stream will have
ended.

That's why we have code to force Expect: 100-continue for Kerberos
(Negotiate): it can require a 401 response from the server with valid
data in order to send a valid Authorization header, and without the 100
Continue response, we'd have uploaded all the data just to get the 401
response, leading to a failed push.
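
Schematically (an illustration, not a capture), the exchange looks
like:

    C: POST /project.git/git-receive-pack HTTP/1.1
    C: Expect: 100-continue
    S: HTTP/1.1 401 Unauthorized           <- no pack data sent yet
    C: POST again, now with Authorization: Negotiate ...
    S: HTTP/1.1 100 Continue               <- client streams the pack

Without the Expect header, the whole pack body would already be on the
wire before the 401 arrived.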

The only possible alternative to this is to increase the buffer size
(http.postBuffer) and I definitely don't want to encourage people to do
that.  People already get the mistaken idea that that's a magic salve
for all push problems and end up needlessly allocating gigabytes of
memory every time they push.  Encouraging people to waste memory because
the server might experience a problem puts the costs of unreliability on
the users instead of on the server operators where it belongs.

So the only sane thing to do here is to make this operation work only
for fetch requests, since they are the only thing that can be safely
retried in the general case without consuming excessive resources.  As a
result, we may want to add appropriate tests for the push case,
verifying that we don't retry those requests.
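
A rough sketch of such a test (reusing the error_ntime endpoint from
patch 1; the refs and exact assertions are assumptions, not a finished
test):

	test_expect_success 'push is not retried on transient errors' '
		test_must_fail git push \
			"$HTTPD_URL/error_ntime/$(gen_nonce)/1/502/1/smart/server" \
			HEAD 2>err &&
		test_i18ngrep ! "retrying after" err
	'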
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 21:14   ` Junio C Hamano
@ 2020-10-14 14:28     ` Sean McAllister
  2020-10-14 15:25       ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 14:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, Masaya Suzuki, jrnieder

On Tue, Oct 13, 2020 at 3:14 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Sean McAllister <smcallis@google.com> writes:
>
> >  test_expect_success 'partial clone using HTTP with redirect' '
> > -     _NONCE=`gen_nonce` && export _NONCE &&
> > +     _NONCE=$(gen_nonce) &&
> >       curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> >       curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> >       curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> >       partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"
>
> It also bothers me somewhat that we waste 40 seconds waiting to
> complete this test, most of the time just sleeping.
>
> Now, do we still need gen_nonce in the test library, now there is
> only one or two tests, both in a single test script?
>
> Thanks.

In the last patch of the series I actually knocked these down to 1
second because the delay was obnoxious; I agree.  I'll update it so
that it's done in this patch, and I'll move gen_nonce.
I'll be a little bit delayed with v3 since my workstation decided to
explode last night :(

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 20:46   ` Junio C Hamano
  2020-10-13 20:58     ` Jeff King
@ 2020-10-14 14:29     ` Sean McAllister
  2020-10-14 17:11     ` Sean McAllister
  2 siblings, 0 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 14:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, Masaya Suzuki, jrnieder

>
> Sean McAllister <smcallis@google.com> writes:
>
> > CURLOPT_FILE has been deprecated since 2003.
>
> I thought that Dscho already mentioned this, but updating the above
> description to mention that _WRITEDATA was introduced to overtake
> _FILE as an equivalent in the same timeframe would be more helpful
> to readers.
>
>
Yes he did; I'll update the commit message to give more history on the
change and why it's safe.

> > Signed-off-by: Sean McAllister <smcallis@google.com>
> > ---
> >  http-push.c   | 6 +++---
> >  http-walker.c | 2 +-
> >  http.c        | 6 +++---
> >  remote-curl.c | 4 ++--
> >  4 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/http-push.c b/http-push.c
> > index 6a4a43e07f..2e6fee3305 100644
> > --- a/http-push.c
> > +++ b/http-push.c
> > @@ -894,7 +894,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
> >       slot->results = &results;
> >       curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
> >       curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
> > -     curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
> > +     curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
> >   ...

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 21:16       ` Daniel Stenberg
@ 2020-10-14 14:57         ` Jeff King
  0 siblings, 0 replies; 30+ messages in thread
From: Jeff King @ 2020-10-14 14:57 UTC (permalink / raw)
  To: Daniel Stenberg
  Cc: Junio C Hamano, Sean McAllister, git, masayasuzuki, jrnieder

On Tue, Oct 13, 2020 at 11:16:58PM +0200, Daniel Stenberg wrote:

> Let me just inject myself here and comment on two curl things.
> 
> >  - when is _FILE going away (or has it already in some versions)?
> 
> It will be kept around *at least* for as long as libcurl supports the
> version 7 API: for the forseeable future. Posssibly decades.

Thanks. That's exactly the level of carefulness I expected from libcurl. :)

So this definitely isn't an urgent change, but given the date that
WRITEDATA was introduced, there's very little downside to using it. So
the argument for the commit message is mostly that it's a readability
improvement.

-Peff

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 23:45     ` brian m. carlson
@ 2020-10-14 15:17       ` Jeff King
  0 siblings, 0 replies; 30+ messages in thread
From: Jeff King @ 2020-10-14 15:17 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Sean McAllister, git, gitster, masayasuzuki, jrnieder

On Tue, Oct 13, 2020 at 11:45:02PM +0000, brian m. carlson wrote:

> Yeah, I was about to mention the same thing.  It looks like we cover
> only a subset of requests.  Moreover, I think this feature is going to
> practically fail in some cases and we need to either document that
> clearly or abandon this effort.
> 
> In remote-curl.c, we have post_rpc, which does a POST request to upload
> data for a push.  However, if the data is larger than the buffer, we
> stream it using chunked transfer-encoding.  Because we're reading from a
> pipe, that data cannot be retried: the pack-objects stream will have
> ended.

Right, this is the large_request code path there. We use the same
function for fetches, too, though perhaps it's less likely that a
negotiation request will exceed the post buffer size.

We do send a probe rpc in this case, which lets us handle an HTTP 401
auth request. We _could_ retry on errors to the probe rpc, but I'm not
sure if it really accomplishes that much. The interesting thing is
whether the actual request with content goes through. If retrying
magically fixes things, there's no reason to think that the actual
request is any less likely to intermittently fail than the probe request
(in fact I'd expect it to fail more).

It would be possible to restart even these large requests. Obviously we
could spool the contents to disk in order to replay it. But that carries
a cost for the common case that we either succeed or definitely fail on
the first case, and never use the spooled copy.

Another option is to simply restart the Git process that is generating
the data that we're piping. But that has to happen outside of
post_rpc(); only the caller knows the right way to invoke that again.
And we kind of already have that: just run the Git command again.
I know that sounds a little dismissive, but it has really been our
recommended retry strategy for ages[1].

So I'd wonder in Sean's use case why just restarting the whole Git
process isn't a workable solution. It wouldn't respect Retry-After, but
it might be reasonable to surface that header's value to the caller so
it can act appropriately (and I guess likewise whether we saw an error
that implies retrying might work).

All of this is written from the perspective of v1. In v2, we do a lot
more blind packet-shuffling (see remote-curl.c:stateless_connect()). I
suspect it makes any kind of retry at the level of the http code much
harder. Whereas just restarting the Git command would probably work just
fine.

-Peff

[1] I think git-submodule will retry failed clones, for example. TBH, I
    have never once seen this retry accomplish anything, and it only
    wastes time and makes the output more confusing (since we see the
    failure twice). I have to admit I'm not thrilled to see more blind
    retrying for that reason.

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 14:28     ` Sean McAllister
@ 2020-10-14 15:25       ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2020-10-14 15:25 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, peff, Masaya Suzuki, jrnieder

Sean McAllister <smcallis@google.com> writes:

> In the last patch of the series I actually knocked these down to 1
> second because the delay was obnoxious, I agree.  I'll update it so
> that it's done in this patch and I'll move gen_nonce.
> I'll be a little bit delayed with v3 since my workstation decided to
> explode last night :(

Ouch, sorry to hear that.  Be safe ;-)

We are deep in pre-release feature freeze right now anyway, so no
rush.


* Re: [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP
  2020-10-13 20:40 ` [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Junio C Hamano
@ 2020-10-14 16:56   ` Sean McAllister
  2020-10-14 18:13     ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 16:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, Masaya Suzuki, jrnieder

> Sean McAllister <smcallis@google.com> writes:
>
> > +# generate a process unique one-up value
> > +global_counter_for_nonce=0
> > +gen_nonce () {
> > +     global_counter_for_nonce=$((global_counter_for_nonce + 1)) &&
> > +     echo "$global_counter_for_nonce"
> > +}
>
> This must not be called in a subprocess if the caller truly wants
> uniqueness.  May want to be described in a comment.
>
Done.

> > diff --git a/t/lib-httpd/error-ntime.sh b/t/lib-httpd/error-ntime.sh
> > new file mode 100755
> > index 0000000000..64dc878746
> > --- /dev/null
> > +++ b/t/lib-httpd/error-ntime.sh
> > @@ -0,0 +1,80 @@
> > +#!/bin/sh
> > +
> > +# Script to simulate a transient error code with Retry-After header set.
> > +#
> > +# PATH_INFO must be of the form /<nonce>/<times>/<retcode>/<retry-after>/<path>
> > +#   (eg: /dc724af1/3/429/10/some/url)
> > +#
> > +# The <nonce> value uniquely identifies the URL.  Since we're simulating
> > +# a stateful operation using a stateless protocol, we need a way to "namespace"
> > +# URLs so that they don't step on each other.
> > +#
> > +# The first <times> times this endpoint is called, it will return the given
> > +# <retcode>, and if the <retry-after> is non-negative, it will set the
> > +# Retry-After header to that value.
> > +#
> > +# Subsequent calls will return a 302 redirect to <path>.
> > +#
> > +# Supported error codes are 429, 502, 503, and 504
> > +
>
> I thought "error codes" were rephrased after the first round's
> review to some other term (which I do not recall--was it status?)?
>
Yes you're right, fixed to reflect that and adjust formatting.

> > +print_status() {
> > +     if [ "$1" -eq "302" ]; then
> > +             printf "Status: 302 Found\n"
> > +     elif [ "$1" -eq "429" ]; then
> > +             printf "Status: 429 Too Many Requests\n"
> > +     elif [ "$1" -eq "502" ]; then
> > +             printf "Status: 502 Bad Gateway\n"
> > +     elif [ "$1" -eq "503" ]; then
> > +             printf "Status: 503 Service Unavailable\n"
> > +     elif [ "$1" -eq "504" ]; then
> > +             printf "Status: 504 Gateway Timeout\n"
> > +     else
> > +             printf "Status: 500 Internal Server Error\n"
> > +     fi
> > +     printf "Content-Type: text/plain\n"
> > +}
>
> Style (Documentation/CodingGuidelines).
>
>         print_status () {
>                 if test "$1" = "302"
>                 then
>                         printf "...";
>                 ...
>         }
>
> but in this particular case, I do not see why we want if/else
> cascade.  Perhaps
>
>         print_status () {
>                 case "$1" in
>                 302)    printf "Status: 302 Found\n" ;;
>                 429)    ... ;;
>                 ...
>                 *)      printf "Status: 500 Internal Server Error\n" ;;
>                 esac
>                 printf "Content-Type: text/plain\n";
>         }
>
> would be more standard?
>
> Also I am not sure why we want "printf ...\n" not "echo" here.  If
> we want to talk HTTP ourselves strictly, I would understand avoiding
> "echo" and doing "printf ...\r\n", though.  If we fear DOS line
> endings coming out of localized "echo", and ensure we use LF line
> ending even on Windows and Cygwin, it is sort of understandable but
> if that is what is going on, that does not explain a lone "echo"
> at the end of the caller.
>
> Puzzled.
>
I modified it to use echo as the standard; it turns out Apache handles
terminating the lines with CRLF for you.  As for the lone echo, a
double CRLF signals the end of the response header and the start of
the body.  Curl doesn't behave properly without it.
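
Schematically, the script's output for an error response is:

	Status: 429 Too Many Requests
	Content-Type: text/plain
	Retry-After: 10
	<blank line from the final echo ends the header>

and Apache translates each line ending to CRLF on the wire.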


>   +oIFS="$IFS"
> > +IFS='/'
> > +set -f
> > +set -- $PATH_INFO
> > +set +f
>   +IFS="$oIFS"
> > +
> > +# pull out first four path components
> > +shift
> > +nonce=$1
> > +times=$2
> > +code=$3
> > +retry=$4
> > +shift 4
> > +
> > +# glue the rest back together as redirect path
> > +path=""
> > +while [ "$#" -gt "0" ]; do
> > +     path="${path}/$1"
> > +     shift
> > +done
>
> Hmph.  Would this work better, I wonder?
>
>         path=${PATH_INFO#*/}    ;# discard leading '/'
>         nonce=${path%%/*}       path=${path#*/}
>         times=${path%%/*}       path=${path#*/}
>         code=${path%%/*}        path=${path#*/}
>         retry=${path%%/*}       path=${path#*/}
>
> At least it is shorter and easier to read.
>
I agree it's better. changed.


> > +# leave a cookie for this request/retry count
> > +state_file="request_${REMOTE_ADDR}_${nonce}_${times}_${code}_${retry}"
> > +
> > +if [ ! -f "$state_file" ]; then
> > +     echo 0 > "$state_file"
> > +fi
>
> Style (Documentation/CodingGuidelines, look for "For shell scripts
> specifically").
>
>  - use "test" not "[]";
>
>  - don't write ";then" on the same line (rule of thumb. you should
>    be able to write your shell scripts without semicolon except for
>    double-semicolons in case/esac statements)
>
>  - don't leave SP between redirection operator '>' and its target
>    file, i.e. write 'echo 0 >"$state_file"'.
>
Done.  I went back over the guidelines and tried to follow everything.

> > +read -r cnt < "$state_file"
> > +if [ "$cnt" -lt "$times" ]; then
> > +     echo $((cnt+1)) > "$state_file"
> > +
> > +     # return error
> > +     print_status "$code"
> > +     if [ "$retry" -ge "0" ]; then
> > +             printf "Retry-After: %s\n" "$retry"
> > +     fi
> > +else
> > +     # redirect
> > +     print_status 302
> > +     printf "Location: %s?%s\n" "$path" "${QUERY_STRING}"
> > +fi
> > +
> > +echo
>
> This "echo" to the client also puzzles me further, after seeing
> puzzling use of "printf ...\n" earlier.
>
See earlier comment, we need a second CRLF to end the HTTP header.
Before I added this, curl was very unhappy.

> > diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> > index 7df3c5373a..72aaed5a93 100755
> > --- a/t/t5601-clone.sh
> > +++ b/t/t5601-clone.sh
> > @@ -756,6 +756,15 @@ test_expect_success 'partial clone using HTTP' '
> >       partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
> >  '
> >
> > +test_expect_success 'partial clone using HTTP with redirect' '
> > +     _NONCE=`gen_nonce` && export _NONCE &&
> > +     curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> > +     curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
> > +     curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" &&
>
> Why do we need to test "curl" here?  Is this a remnant of debugging
> of the server side?
>
At this point in the patch set the retry logic isn't implemented yet,
so these calls trigger the error statuses manually; the clone then
uses the same URL to verify that cloning still works when the server
returns a 302 redirect status.
I've modified it in v3 so that it only has to make the manual call
once, though.

> > +     partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server"
> > +'
> > +
> > +
> >  # DO NOT add non-httpd-specific tests here, because the last part of this
> >  # test script is only executed when httpd is available and enabled.

* Re: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA
  2020-10-13 20:46   ` Junio C Hamano
  2020-10-13 20:58     ` Jeff King
  2020-10-14 14:29     ` Sean McAllister
@ 2020-10-14 17:11     ` Sean McAllister
  2 siblings, 0 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 17:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, Masaya Suzuki, jrnieder

>
> Sean McAllister <smcallis@google.com> writes:
>
> > CURLOPT_FILE has been deprecated since 2003.
>
> I thought that Dscho already mentioned this, but updating the above
> description to mention that _WRITEDATA was introduced to overtake
> _FILE as an equivalent in the same timeframe would be more helpful
> to readers.
>
>
I've updated the commit message in v3 to explain more about the
history and details here.

> > Signed-off-by: Sean McAllister <smcallis@google.com>
> > ---
> >  http-push.c   | 6 +++---
> >  http-walker.c | 2 +-
> >  http.c        | 6 +++---
> >  remote-curl.c | 4 ++--
> >  4 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/http-push.c b/http-push.c
> > index 6a4a43e07f..2e6fee3305 100644
> > --- a/http-push.c
> > +++ b/http-push.c
> > @@ -894,7 +894,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
> >       slot->results = &results;
> >       curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer);
> >       curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers);
> > -     curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer);
> > +     curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer);
> >   ...

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP
  2020-10-14 16:56   ` Sean McAllister
@ 2020-10-14 18:13     ` Junio C Hamano
  2020-10-14 18:21       ` Sean McAllister
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2020-10-14 18:13 UTC (permalink / raw)
  To: Sean McAllister
  Cc: Git Mailing List, Jeff King, Masaya Suzuki, Jonathan Nieder

> I modified it to use echo as the standard; it turns out Apache
> properly terminates lines with CRLF for you.  As for the lone echo, a
> double CRLF signals the end of the response header and the start of
> the body.  Curl doesn't behave properly without it.

You misunderstood me.  I wasn't questioning the need for a blank line.
I found the inconsistency puzzling: all the previous ones (above) were
done with printf, while the last one was done with echo.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP
  2020-10-14 18:13     ` Junio C Hamano
@ 2020-10-14 18:21       ` Sean McAllister
  0 siblings, 0 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 18:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Masaya Suzuki, Jonathan Nieder

On Wed, Oct 14, 2020 at 12:14 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> > I modified it to use echo as the standard; it turns out Apache
> > properly terminates lines with CRLF for you.  As for the lone echo, a
> > double CRLF signals the end of the response header and the start of
> > the body.  Curl doesn't behave properly without it.
>
> You misunderstood me.  I wasn't questioning the need for a blank line.
> I found the inconsistency puzzling: all the previous ones (above) were
> done with printf, while the last one was done with echo.

Ah yes, that was just sloppy on my part =D

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 21:14   ` Jeff King
  2020-10-13 23:45     ` brian m. carlson
@ 2020-10-14 19:09     ` Sean McAllister
  2020-10-14 19:10       ` Sean McAllister
  1 sibling, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 19:09 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Masaya Suzuki, jrnieder

On Tue, Oct 13, 2020 at 3:14 PM Jeff King <peff@peff.net> wrote:
>
> On Tue, Oct 13, 2020 at 01:17:29PM -0600, Sean McAllister wrote:
>
> > Exiting immediately becomes irksome when pulling large multi-repo code
> > bases such as Android or Chromium, as often the entire fetch operation
> > has to be restarted from the beginning due to an error in one repo. If
> > we can reduce how often that occurs, then it's a big win.
>
> I had hoped that libcurl might have some retry mechanisms, since the
> curl command-line tool has several --retry-* options. But it looks like
> that is all only at the tool level, and the library code doesn't know
> anything about it. So we are stuck driving the process ourselves.
>
> I do think you could be leveraging CURLINFO_RETRY_AFTER rather than
> implementing your own header parsing, though.
>
Ah, I didn't know about CURLINFO_RETRY_AFTER; I'll look at that and use
it if I can.

> >  static int http_request(const char *url,
> >                       void *result, int target,
> >                       const struct http_get_options *options)
> >  {
>
> It looks like you trigger retries only from this function. But this
> doesn't cover all http requests that Git makes. That might be sufficient
> for your purposes (I think it would catch all of the initial contact),
> but it might not (it probably doesn't cover subsequent POSTs for fetch
> negotiation nor pack push; likewise I'm not sure if it covers much of
> anything after v2 stateless-connect is established).
>
You're right that I only trigger retries from this function.  I've since
removed those tests in response to feedback on having too many, but I
originally tested this with:
  t5539-fetch-http-shallow.sh
  t5540-http-push-webdav.sh
  t5541-http-push-smart.sh
  t5550-http-fetch-dumb.sh
  t5551-http-fetch-smart.sh
  t5601-clone.sh

I'd have to look at the packet logs to see exactly what each of those
protocols is doing, but it seemed to cover _most_ of what they do.

Definitely open to adding retries in other places though.

> >       struct active_request_slot *slot;
> >       struct slot_results results;
> > -     struct curl_slist *headers = http_copy_default_headers();
> > +     struct curl_slist *headers;
>
> So here we stop copying the headers at the top of the function...
>
> > [...]
> > +retry:
> > [...]
> > +     headers = http_copy_default_headers();
> >       if (accept_language)
> >               headers = curl_slist_append(headers, accept_language);
>
> And instead set them up totally here. Which make some sense, because we
> wouldn't want to append accept_language over and over. But who frees the
> old ones? There is a call to curl_slist_free_all(headers) later in the
> function, but it's after your "goto retry". So I think each retry would
> leak another copy of the list.
>
> The ideal thing would probably be to create the header list once, and
> then use it for each retry. That would require reordering some of the
> setup. If that's too much, then it would be OK to just create a new list
> from scratch on each call. Though in the latter case I suspect it may be
> simpler to wrap the whole function, like:
>
>   static int http_request(...)
>   {
>         int try;
>         int result;
>         for (try = 0; try < max_retries; try++) {
>                 result = http_request_try(...);
>                 if (...result is not retryable...)
>                         break;
>         }
>         return result;
>   }
>
> and then we'd know that the single-try function just needs to be
> self-contained, without worrying about gotos jumping around in it.
>
> -Peff
I like this idea; I've refactored it to do just this.
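
Concretely, the v3 shape is roughly this (a sketch, not the actual v3
code: retryable() stands in for the check against the list of retryable
HTTP statuses, and the real retry accounting and backoff handling
differ):

 static int http_request(const char *url, void *result, int target,
 			const struct http_get_options *options)
 {
 	int ret, try;
 	int delay_sec = 1;	/* illustrative initial backoff */

 	for (try = 0; try < max_retries; try++) {
 		/*
 		 * one self-contained attempt; updates delay_sec from
 		 * Retry-After when the server provides one
 		 */
 		ret = http_request_try(url, result, target, options,
 				       &delay_sec);
 		if (!retryable(ret))
 			break;
 		sleep(delay_sec);
 	}
 	return ret;
 }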

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 19:09     ` Sean McAllister
@ 2020-10-14 19:10       ` Sean McAllister
  2020-10-14 19:34         ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 19:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

On Wed, Oct 14, 2020 at 1:09 PM Sean McAllister <smcallis@google.com> wrote:
>
> On Tue, Oct 13, 2020 at 3:14 PM Jeff King <peff@peff.net> wrote:
> >
> > I do think you could be leveraging CURLINFO_RETRY_AFTER rather than
> > implementing your own header parsing, though.
> >
> Ah, I didn't know about CURLINFO_RETRY_AFTER; I'll look at that and
> use it if I can.
>
I took a look; it looks like CURLINFO_RETRY_AFTER was only added in
curl 7.66 (September 2019), so I don't think it's reasonable to rely
on it for getting the Retry-After value in this case.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 19:10       ` Sean McAllister
@ 2020-10-14 19:34         ` Jeff King
  2020-10-14 22:38           ` Sean McAllister
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2020-10-14 19:34 UTC (permalink / raw)
  To: Sean McAllister
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

On Wed, Oct 14, 2020 at 01:10:46PM -0600, Sean McAllister wrote:

> On Wed, Oct 14, 2020 at 1:09 PM Sean McAllister <smcallis@google.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 3:14 PM Jeff King <peff@peff.net> wrote:
> > >
> > > I do think you could be leveraging CURLINFO_RETRY_AFTER rather than
> > > implementing your own header parsing, though.
> > >
> > Ah, I didn't know about CURLINFO_RETRY_AFTER; I'll look at that and
> > use it if I can.
> >
> I took a look; it looks like CURLINFO_RETRY_AFTER was only added in
> curl 7.66 (September 2019), so I don't think it's reasonable to rely
> on it for getting the Retry-After value in this case.

I agree that's pretty recent.

How important is it that we respect it? I.e., we'd have some sane retry
behavior if the header is missing anyway. On older curl versions, how
bad would it be to just use that fallback behavior all the time?

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
  2020-10-13 21:14   ` Junio C Hamano
  2020-10-13 21:14   ` Jeff King
@ 2020-10-14 19:55   ` Jeff King
  2020-10-14 22:46     ` Sean McAllister
  2 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2020-10-14 19:55 UTC (permalink / raw)
  To: Sean McAllister; +Cc: git, gitster, masayasuzuki, jrnieder

On Tue, Oct 13, 2020 at 01:17:29PM -0600, Sean McAllister wrote:

> +/*
> + * check for a retry-after header in the given headers string, if found, then
> + * honor it, otherwise do an exponential backoff up to the max on the current
> + * delay
> +*/
> +static int http_retry_after(const struct strbuf headers, int cur_delay_sec)
> +{
> +	int delay_sec;
> +	char *end;
> +	char* value = http_header_value(headers, "retry-after");
> +
> +	if (value) {
> +		delay_sec = strtol(value, &end, 0);
> +		free(value);
> +		if (*value && *end == '\0' && delay_sec >= 0) {

This looks at the contents of the just-freed "value" memory block.

> +			if (delay_sec > http_max_delay_sec) {
> +				die(Q_("server requested retry after %d second,"
> +					   " which is longer than max allowed\n",
> +					   "server requested retry after %d seconds,"
> +					   " which is longer than max allowed\n", delay_sec),
> +					delay_sec);
> +			}
> +			return delay_sec;

I guess there's no point in being gentle here. We could shrink the retry
time to our maximum allowed, but the server just told us not to bother.
But would this die() mask the actual http error we encountered, which is
surely the more interesting thing for the user?

I wonder if it needs to be returning a "do not bother retrying" value,
which presumably would cause the caller to propagate the real failure in
the usual way.

>  static int http_request(const char *url,
>  			void *result, int target,
>  			const struct http_get_options *options)
>  {
>  	struct active_request_slot *slot;
>  	struct slot_results results;
> -	struct curl_slist *headers = http_copy_default_headers();
> +	struct curl_slist *headers;
>  	struct strbuf buf = STRBUF_INIT;
> +	struct strbuf result_headers = STRBUF_INIT;

This new result_headers strbuf is filled in for every request, but I
don't see us ever releasing it (whether we retry or not).  So I think
it leaks on each request.

It sounds like you're going to rework this to put the retry loop outside
of http_request(), so it may naturally get fixed there. But I thought it
worth mentioning.

> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, &result_headers);
> +	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_buffer);

After looking at your parsing code, I wondered if there was a way to
just get a single header out of curl. But according to the documentation
for CURLOPT_HEADERFUNCTION, it will pass back individual lines anyway.
Perhaps it would be simpler to have the callback function understand
that we only care about getting "Retry-After".

The documentation says it doesn't support header folding, but that's
probably OK for our purposes. It's deprecated, and your custom parsing
doesn't handle it either. :) And most importantly, we won't misbehave
terribly if we see it in the wild (we'll just ignore that header).

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 19:34         ` Jeff King
@ 2020-10-14 22:38           ` Sean McAllister
  2020-10-15  0:04             ` Jonathan Nieder
  0 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 22:38 UTC (permalink / raw)
  To: Jeff King
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

Some large projects (Android, Chrome) use git with a distributed
backend to feed changes to automated builders and such.  We can
actually get into a case where DDoS mitigation kicks in and 429s start
going out.  In that case I think it's pretty important that we honor
the Retry-After field, so we're good citizens and whoever's running the
backend service has some options for traffic shaping to manage load.
In general you're right that it doesn't matter _that_ much, but in at
least the specific case I have in mind, it does.

On Wed, Oct 14, 2020 at 1:34 PM Jeff King <peff@peff.net> wrote:
>
> On Wed, Oct 14, 2020 at 01:10:46PM -0600, Sean McAllister wrote:
>
> > On Wed, Oct 14, 2020 at 1:09 PM Sean McAllister <smcallis@google.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 3:14 PM Jeff King <peff@peff.net> wrote:
> > > >
> > > > I do think you could be leveraging CURLINFO_RETRY_AFTER rather than
> > > > implementing your own header parsing, though.
> > > >
> > > Ah, I didn't know about CURLINFO_RETRY_AFTER; I'll look at that and
> > > use it if I can.
> > >
> > I took a look; it looks like CURLINFO_RETRY_AFTER was only added in
> > curl 7.66 (September 2019), so I don't think it's reasonable to rely
> > on it for getting the Retry-After value in this case.
>
> I agree that's pretty recent.
>
> How important is it that we respect it? I.e., we'd have some sane retry
> behavior if the header is missing anyway. On older curl versions, how
> bad would it be to just use that fallback behavior all the time?
>
> -Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 19:55   ` Jeff King
@ 2020-10-14 22:46     ` Sean McAllister
  2020-10-15 16:23       ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Sean McAllister @ 2020-10-14 22:46 UTC (permalink / raw)
  To: Jeff King
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

On Wed, Oct 14, 2020 at 1:55 PM Jeff King <peff@peff.net> wrote:
>
> On Tue, Oct 13, 2020 at 01:17:29PM -0600, Sean McAllister wrote:
>
> > +/*
> > + * check for a retry-after header in the given headers string, if found, then
> > + * honor it, otherwise do an exponential backoff up to the max on the current
> > + * delay
> > +*/
> > +static int http_retry_after(const struct strbuf headers, int cur_delay_sec)
> > +{
> > +     int delay_sec;
> > +     char *end;
> > +     char* value = http_header_value(headers, "retry-after");
> > +
> > +     if (value) {
> > +             delay_sec = strtol(value, &end, 0);
> > +             free(value);
> > +             if (*value && *end == '\0' && delay_sec >= 0) {
>
> This looks at the contents of the just-freed "value" memory block.
>
Sure does; fixed in v3.  I'm disappointed that electric fence didn't
catch this for me...

> > +                     if (delay_sec > http_max_delay_sec) {
> > +                             die(Q_("server requested retry after %d second,"
> > +                                        " which is longer than max allowed\n",
> > +                                        "server requested retry after %d seconds,"
> > +                                        " which is longer than max allowed\n", delay_sec),
> > +                                     delay_sec);
> > +                     }
> > +                     return delay_sec;
>
> I guess there's no point in being gentle here. We could shrink the retry
> time to our maximum allowed, but the server just told us not to bother.
> But would this die() mask the actual http error we encountered, which is
> surely the more interesting thing for the user?
>
> I wonder if it needs to be returning a "do not bother retrying" value,
> which presumably would cause the caller to propagate the real failure in
> the usual way.
>
I've moved this check up a couple of levels in v3, so that if we get
too large a retry value, we print this message as a warning and quit
retrying, which unmasks the underlying HTTP error.
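
Roughly, the caller now does this inside the retry loop (a sketch; the
exact v3 wording and plumbing may differ):

 	delay_sec = http_retry_after(result_headers, delay_sec);
 	if (delay_sec > http_max_delay_sec) {
 		warning("server requested retry after %d seconds, "
 			"which is longer than max allowed", delay_sec);
 		break;	/* stop retrying; surface the original HTTP error */
 	}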

> >  static int http_request(const char *url,
> >                       void *result, int target,
> >                       const struct http_get_options *options)
> >  {
> >       struct active_request_slot *slot;
> >       struct slot_results results;
> > -     struct curl_slist *headers = http_copy_default_headers();
> > +     struct curl_slist *headers;
> >       struct strbuf buf = STRBUF_INIT;
> > +     struct strbuf result_headers = STRBUF_INIT;
>
> This new result_headers strbuf is filled in for every request, but I
> don't think us ever releasing it (whether we retry or not). So I think
> it's leaking for each request.
>
> It sounds like you're going to rework this to put the retry loop outside
> of http_request(), so it may naturally get fixed there. But I thought it
> worth mentioning.
>
Yes, good catch; the new http_request_try version fixes it as a matter
of course.

> > +     curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, &result_headers);
> > +     curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_buffer);
>
> After looking at your parsing code, I wondered if there was a way to
> just get a single header out of curl. But according to the documentation
> for CURLOPT_HEADERFUNCTION, it will pass back individual lines anyway.
> Perhaps it would be simpler to have the callback function understand
> that we only care about getting "Retry-After".
>
> The documentation says it doesn't support header folding, but that's
> probably OK for our purposes. It's deprecated, and your custom parsing
> doesn't handle it either. :) And most importantly, we won't misbehave
> terribly if we see it in the wild (we'll just ignore that header).
>
I'll put this in my todo pile to think on a little; it'd be nice not to
have to expand the strbuf with every request, though it's also not a
huge overhead.

> -Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 22:38           ` Sean McAllister
@ 2020-10-15  0:04             ` Jonathan Nieder
  2020-10-15 16:14               ` Jeff King
  2020-10-15 16:24               ` Junio C Hamano
  0 siblings, 2 replies; 30+ messages in thread
From: Jonathan Nieder @ 2020-10-15  0:04 UTC (permalink / raw)
  To: Sean McAllister
  Cc: Jeff King, Git Mailing List, Junio C Hamano, Masaya Suzuki

Hi,

Sean McAllister wrote:
> On Wed, Oct 14, 2020 at 1:34 PM Jeff King <peff@peff.net> wrote:
>> On Wed, Oct 14, 2020 at 01:10:46PM -0600, Sean McAllister wrote:

>>> I took a look; it looks like CURLINFO_RETRY_AFTER was only added in
>>> curl 7.66 (September 2019), so I don't think it's reasonable to rely
>>> on it for getting the Retry-After value in this case.
>>
>> I agree that's pretty recent.
>>
>> How important is it that we respect it? I.e., we'd have some sane retry
>> behavior if the header is missing anyway. On older curl versions, how
>> bad would it be to just use that fallback behavior all the time?
>
> Some large projects (Android, Chrome) use git with a distributed
> backend to feed changes to automated builders and such.  We can
> actually get into a case where DDoS mitigation kicks in and 429s start
> going out.  In that case I think it's pretty important that we honor
> the Retry-After field, so we're good citizens and whoever's running the
> backend service has some options for traffic shaping to manage load.
> In general you're right that it doesn't matter _that_ much, but in at
> least the specific case I have in mind, it does.

I see.  With Peff's proposal, the backend service could still set
Retry-After, and *modern* machines with new enough libcurl would still
respect it, but older machines[*] would have to use some fallback
behavior.

I suppose that fallback behavior could be to not retry at all.  That
sounds appealing to me since it would be more maintainable (no custom
header parsing) and the fallback would be decreasingly relevant over
time as users upgrade to modern versions of libcurl and git.  What do
you think?

Thanks,
Jonathan

[*] This represents two cases: if CURLINFO_RETRY_AFTER is not defined,
meaning that libcurl-dev was too old at *build time*, and if getinfo
returned CURLE_UNKNOWN_OPTION, meaning that libcurl was too old at run
time.

You might wonder how we handle the too-old-at-build-time case.  The
usual trick is to provide a helper function with a stub
implementation:

 #ifndef CURLINFO_RETRY_AFTER
 static inline CURLcode retry_after(CURL *handle, curl_off_t *retry)
 {
 	return CURLE_UNKNOWN_OPTION;
 }
 #else
 static inline CURLcode retry_after(CURL *handle, curl_off_t *retry)
 {
 	return curl_easy_getinfo(handle, CURLINFO_RETRY_AFTER, retry);
 }
 #endif
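
The caller could then treat a non-CURLE_OK result (too old at build or
run time) the same as a missing header and fall back to not retrying
at all.  A sketch of what that might look like, with "ret" standing in
for the failed request's result:

 	curl_off_t delay = 0;
 	if (retry_after(slot->curl, &delay) != CURLE_OK || delay <= 0)
 		return ret;	/* no usable hint: report the error, no retry */
 	/* otherwise sleep for the (capped) delay and retry */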

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-15  0:04             ` Jonathan Nieder
@ 2020-10-15 16:14               ` Jeff King
  2020-10-15 16:24               ` Junio C Hamano
  1 sibling, 0 replies; 30+ messages in thread
From: Jeff King @ 2020-10-15 16:14 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Sean McAllister, Git Mailing List, Junio C Hamano, Masaya Suzuki

On Wed, Oct 14, 2020 at 05:04:10PM -0700, Jonathan Nieder wrote:

> > Some large projects (Android, Chrome) use git with a distributed
> > backend to feed changes to automated builders and such.  We can
> > actually get into a case where DDoS mitigation kicks in and 429s start
> > going out.  In that case I think it's pretty important that we honor
> > the Retry-After field, so we're good citizens and whoever's running the
> > backend service has some options for traffic shaping to manage load.
> > In general you're right that it doesn't matter _that_ much, but in at
> > least the specific case I have in mind, it does.
> 
> I see.  With Peff's proposal, the backend service could still set
> Retry-After, and *modern* machines with new enough libcurl would still
> respect it, but older machines[*] would have to use some fallback
> behavior.
> 
> I suppose that fallback behavior could be to not retry at all.  That
> sounds appealing to me since it would be more maintainable (no custom
> header parsing) and the fallback would be decreasingly relevant over
> time as users upgrade to modern versions of libcurl and git.  What do
> you think?

Yeah, the good-citizen behavior would be to err on the side of not
retrying. And actually, I kind of like that better anyway. The
retryable_code() list has things like 502 in it, which aren't
necessarily temporary errors. If the server didn't give us a hint of
when to retry, perhaps it's not a good idea to do so.

That's slightly orthogonal to the CURLINFO_RETRY_AFTER question. It
would mean an older Git would not retry in this case. But if you're
primarily interested in fixing automated builders overloading your
system, it's probably not that big a deal to make sure they're up to
date (after all, they need the new Git, too ;) ).

If you're hoping to help random developers on various platforms, then
making the feature work with older curl is more compelling. Many people
might upgrade their Git version there, but be stuck with an older system
libcurl.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-14 22:46     ` Sean McAllister
@ 2020-10-15 16:23       ` Jeff King
  2020-10-15 16:33         ` Sean McAllister
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2020-10-15 16:23 UTC (permalink / raw)
  To: Sean McAllister
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

On Wed, Oct 14, 2020 at 04:46:57PM -0600, Sean McAllister wrote:

> > I wonder if it needs to be returning a "do not bother retrying" value,
> > which presumably would cause the caller to propagate the real failure in
> > the usual way.
> >
> I've moved this check up a couple levels in v3, so that if we get too large
> a retry value, then we'll print this message as a warning and quit retrying,
> which will unmask the underlying HTTP error.

Thanks, that sounds much better.

> > After looking at your parsing code, I wondered if there was a way to
> > just get a single header out of curl. But according to the documentation
> > for CURLOPT_HEADERFUNCTION, it will pass back individual lines anyway.
> > Perhaps it would be simpler to have the callback function understand
> > that we only care about getting "Retry-After".
> >
> > The documentation says it doesn't support header folding, but that's
> > probably OK for our purposes. It's deprecated, and your custom parsing
> > doesn't handle it either. :) And most importantly, we won't misbehave
> > terribly if we see it in the wild (we'll just ignore that header).
> >
> I'll put this in my todo pile to think on a little, it'd be nice not
> to have expand the strbuf with every request, but also not a huge
> overhead.

I was less concerned with the overhead of the strbuf (http requests are
pretty heavyweight already) and more that it could simplify your parsing
if you could just do it left-to-right on a single line:

	char *line = xmemdupz(buffer, size);
	const char *p = line;
	if (skip_iprefix(p, "retry-after:", &p)) {
		char *end;
		while (isspace(*p))
			p++;
		opts->retry_after = strtol(p, &end, 10);
		/* if you want to be pedantic */
		if (*end && *end != '\r' && *end != '\n')
			opts->retry_after = 0; /* warn, too? */
	}

If you want to be clever, you could probably avoid the extra allocation,
but I think being able to parse with simple string functions makes it
much more obvious that we don't walk off the end of the input.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-15  0:04             ` Jonathan Nieder
  2020-10-15 16:14               ` Jeff King
@ 2020-10-15 16:24               ` Junio C Hamano
  1 sibling, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2020-10-15 16:24 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Sean McAllister, Jeff King, Git Mailing List, Masaya Suzuki

Jonathan Nieder <jrnieder@gmail.com> writes:

> I see.  With Peff's proposal, the backend service could still set
> Retry-After, and *modern* machines with new enough libcurl would still
> respect it, but older machines[*] would have to use some fallback
> behavior.
>
> I suppose that fallback behavior could be to not retry at all.

Yup, that is a quite attractive approach.  We'll let the passage of
time work for us.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/3] http: automatically retry some requests
  2020-10-15 16:23       ` Jeff King
@ 2020-10-15 16:33         ` Sean McAllister
  0 siblings, 0 replies; 30+ messages in thread
From: Sean McAllister @ 2020-10-15 16:33 UTC (permalink / raw)
  To: Jeff King
  Cc: Git Mailing List, Junio C Hamano, Masaya Suzuki, Jonathan Nieder

>
> I was less concerned with the overhead of the strbuf (http requests are
> pretty heavyweight already) and more that it could simplify your parsing
> if you could just do it left-to-right on a single line:
>
>         char *line = xmemdupz(buffer, size);
>         const char *p = line;
>         if (skip_iprefix(p, "retry-after:", &p)) {
>                 char *end;
>                 while (isspace(*p))
>                         p++;
>                 opts->retry_after = strtol(p, &end, 10);
>                 /* if you want to be pedantic */
>                 if (*end && *end != '\r' && *end != '\n')
>                         opts->retry_after = 0; /* warn, too? */
>         }
>
> If you want to be clever, you could probably avoid the extra allocation,
> but I think being able to parse with simple string functions makes it
> much more obvious that we don't walk off the end of the input.
>
The latest code effectively does this per Dscho's suggestion: no more
string splitting; it just parses the headers as a single big string, as
you mention.  It sounds like we're quickly settling on using
CURLINFO_RETRY_AFTER when available and not retrying otherwise, so this
might be OBE.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2020-10-15 16:34 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-13 19:17 [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Sean McAllister
2020-10-13 19:17 ` [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA Sean McAllister
2020-10-13 20:46   ` Junio C Hamano
2020-10-13 20:58     ` Jeff King
2020-10-13 21:16       ` Daniel Stenberg
2020-10-14 14:57         ` Jeff King
2020-10-14 14:29     ` Sean McAllister
2020-10-14 17:11     ` Sean McAllister
2020-10-13 19:17 ` [PATCH v2 3/3] http: automatically retry some requests Sean McAllister
2020-10-13 21:14   ` Junio C Hamano
2020-10-14 14:28     ` Sean McAllister
2020-10-14 15:25       ` Junio C Hamano
2020-10-13 21:14   ` Jeff King
2020-10-13 23:45     ` brian m. carlson
2020-10-14 15:17       ` Jeff King
2020-10-14 19:09     ` Sean McAllister
2020-10-14 19:10       ` Sean McAllister
2020-10-14 19:34         ` Jeff King
2020-10-14 22:38           ` Sean McAllister
2020-10-15  0:04             ` Jonathan Nieder
2020-10-15 16:14               ` Jeff King
2020-10-15 16:24               ` Junio C Hamano
2020-10-14 19:55   ` Jeff King
2020-10-14 22:46     ` Sean McAllister
2020-10-15 16:23       ` Jeff King
2020-10-15 16:33         ` Sean McAllister
2020-10-13 20:40 ` [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP Junio C Hamano
2020-10-14 16:56   ` Sean McAllister
2020-10-14 18:13     ` Junio C Hamano
2020-10-14 18:21       ` Sean McAllister

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).