[PATCH] Disable dumb HTTP fallback with GIT_CURL

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
@ 2012-09-20  2:55 Shawn O. Pearce
  2012-09-20  3:22 ` Shawn Pearce
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Shawn O. Pearce @ 2012-09-20  2:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn O. Pearce

From: "Shawn O. Pearce" <spearce@spearce.org>

If the user doesn't want to use the dumb HTTP protocol, she may
set GIT_CURL_FALLBACK=0 in the environment before invoking a Git
protocol operation. This is mostly useful when testing against
servers that are known to not support the dumb protocol. If the
smart service detection fails the client should not continue with
dumb behavior, but instead provide accurate HTTP failure data.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 remote-curl.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 4a0927e..2f91128 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -17,7 +17,8 @@ struct options {
 	unsigned progress : 1,
 		followtags : 1,
 		dry_run : 1,
-		thin : 1;
+		thin : 1,
+		fallback : 1;
 };
 static struct options options;
 
@@ -115,7 +116,8 @@ static struct discovery* discover_refs(const char *service)
 	http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
 
 	/* try again with "plain" url (no ? or & appended) */
-	if (http_ret != HTTP_OK && http_ret != HTTP_NOAUTH) {
+	if (options.fallback && http_ret != HTTP_OK
+	    && http_ret != HTTP_NOAUTH) {
 		free(refs_url);
 		strbuf_reset(&buffer);
 
@@ -868,6 +870,12 @@ int main(int argc, const char **argv)
 	options.verbosity = 1;
 	options.progress = !!isatty(2);
 	options.thin = 1;
+	options.fallback = 1;
+
+	if (getenv("GIT_CURL_FALLBACK")) {
+		char *fb = getenv("GIT_CURL_FALLBACK");
+		options.fallback = *fb != '0';
+	}
 
 	remote = remote_get(argv[1]);
 
-- 
1.7.12.1.512.g9b230e6

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  2:55 [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn O. Pearce
@ 2012-09-20  3:22 ` Shawn Pearce
  2012-09-20  3:52   ` Jeff King
  2012-09-20  3:48 ` Jeff King
  2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
  2 siblings, 1 reply; 36+ messages in thread
From: Shawn Pearce @ 2012-09-20  3:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn O. Pearce

On Wed, Sep 19, 2012 at 7:55 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> From: "Shawn O. Pearce" <spearce@spearce.org>

I can't explain why git send-email did this. I obviously didn't need
the extra From header here. format-patch didn't write it to the patch
file, it was injected by send-email. My .git/config is pretty simple,
the name/email are derived from there:

  [user]
	name = Shawn O. Pearce
	email = spearce@spearce.org

Ick. I really don't want to debug this right now so I'm just going to
pretend it wasn't written.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  2:55 [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn O. Pearce
  2012-09-20  3:22 ` Shawn Pearce
@ 2012-09-20  3:48 ` Jeff King
  2012-09-20  5:57   ` Shawn Pearce
  2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
  2 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20  3:48 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

On Wed, Sep 19, 2012 at 07:55:53PM -0700, Shawn O. Pearce wrote:

> From: "Shawn O. Pearce" <spearce@spearce.org>
> 
> If the user doesn't want to use the dumb HTTP protocol, she may
> set GIT_CURL_FALLBACK=0 in the environment before invoking a Git
> protocol operation. This is mostly useful when testing against
> servers that are known to not support the dumb protocol. If the
> smart service detection fails the client should not continue with
> dumb behavior, but instead provide accurate HTTP failure data.

I have been looking into this recently, as well. GitHub does not allow
dumb http at all these days, so transient errors on the initial smart
contact can cause us to fall back to dumb, and end up reporting a
totally useless 403 Forbidden error.  I guess Google Code has a similar
issue.

Note that it is not really do not "fall back to dumb"; we detect the
dumb nature from the response. It is really "fall back to trying the URL
without the query string, because there are some servers that cannot
handle it". With your patch, we might still end up performing a dumb
transfer.

I think what you're doing here is sane, because you have to turn it on
manually, and thus there are no possible backwards compatibility issues.
But it might be nice to make things work better out of the box. Here are
two client-side changes I've been toying with:

  1. If both smart and dumb requests fail, report the error for the
     smart request. Now that smart-http clients are common, I'd expect
     most http servers to be smart these days. Of course I don't have
     any sort of numbers to back this up (nor am I sure how to get them;
     obviously big sites like GitHub and Google Code do a lot of
     traffic, but who knows how many one-off repo-on-a-generic-web-host
     sites still exist?).

     An alternative would be to simply be more verbose, and mention that
     we tried to fallback and list both failures (or we could do this
     with just "fetch -v").

  2. Be more discerning about which errors will cause a fallback.
     Something like "504 Gateway Timeout" should not give a fallback.
     The problem is that you are really guessing at what kinds of http
     errors you are going to get from a dumb server when you try the
     smart URL. I dug back into the list thread that spawned the "retry
     without query string" patch (703e6e7).

     The thread is here:

       http://thread.gmane.org/gmane.comp.version-control.git/137609

     If you read the thread, it turns out that the problem in this case
     (which is the only reported case I could find in the archive) is
     that the server was misconfigured to treat _anything_ with a query
     string as a gitweb URL. And then it got fixed pretty much
     immediately.

     So as far as we know, there may be zero servers for which this
     fallback is actually doing anything useful.

I'm tempted to just reverse the logic. Try the request with the query
string and immediately fail if it doesn't work. For the few (if any)
people who are hitting a server that will not serve the dumb file in
that case, add a "remote.*.dumbhttp" setting that will turn off smart
completely as a workaround.

That would serve the (presumed) majority who are using smart http,
everyone using dumb http on a reasonably-configured server, and still
allow an easy workaround for people with badly configured servers.

What do you think?

> ---
>  remote-curl.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

If we do go this route, the patch itself looks fairly obvious, although:

> @@ -868,6 +870,12 @@ int main(int argc, const char **argv)
>  	options.verbosity = 1;
>  	options.progress = !!isatty(2);
>  	options.thin = 1;
> +	options.fallback = 1;
> +
> +	if (getenv("GIT_CURL_FALLBACK")) {
> +		char *fb = getenv("GIT_CURL_FALLBACK");
> +		options.fallback = *fb != '0';
> +	}

This can just be:

  options.fallback = git_env_bool("GIT_CURL_FALLBACK", 1);

Fewer lines, and you get all of the true/false parsing for free.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  3:22 ` Shawn Pearce
@ 2012-09-20  3:52   ` Jeff King
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff King @ 2012-09-20  3:52 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Wed, Sep 19, 2012 at 08:22:58PM -0700, Shawn O. Pearce wrote:

> On Wed, Sep 19, 2012 at 7:55 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> > From: "Shawn O. Pearce" <spearce@spearce.org>
> 
> I can't explain why git send-email did this. I obviously didn't need
> the extra From header here. format-patch didn't write it to the patch
> file, it was injected by send-email. My .git/config is pretty simple,
> the name/email are derived from there:
> 
>   [user]
> 	name = Shawn O. Pearce
> 	email = spearce@spearce.org
> 
> Ick. I really don't want to debug this right now so I'm just going to
> pretend it wasn't written.

I was looking at the send-email author comparison code a month or three
ago, and I remember noticing that it totally fails to canonicalize
before comparing.  Without even looking at it again, I'm fairly sure
that it thinks '"Shawn O. Pearce"' and 'Shawn O. Pearce' (i.e., with and
without the quotes) are different, and therefore author != committer.

In your case it is particularly egregious because the quotes are
introduced (correctly) by format-patch, so it is not even like you have
configured two different versions of your name. 

I think the same bug exists for different rfc2047 encodings of a name
with non-ascii characters. Fixing both would involve canonicalizing the
names before comparing. I wonder if it would be simpler to just compare
the email addresses and ignore the names entirely.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  2:55 [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn O. Pearce
  2012-09-20  3:22 ` Shawn Pearce
  2012-09-20  3:48 ` Jeff King
@ 2012-09-20  4:14 ` Junio C Hamano
  2012-09-20  4:14   ` [PATCH 1/2] Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false Junio C Hamano
  2012-09-20  4:14   ` [PATCH 2/2] remote-curl: make dumb-http fallback configurable per URL Junio C Hamano
  2 siblings, 2 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20  4:14 UTC (permalink / raw)
  To: git; +Cc: Shawn O. Pearce

"Shawn O. Pearce" <spearce@spearce.org> writes:

> From: "Shawn O. Pearce" <spearce@spearce.org>
>
> If the user doesn't want to use the dumb HTTP protocol, she may
> set GIT_CURL_FALLBACK=0 in the environment before invoking a Git
> protocol operation. This is mostly useful when testing against
> servers that are known to not support the dumb protocol. If the
> smart service detection fails the client should not continue with
> dumb behavior, but instead provide accurate HTTP failure data.
>
> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
> ---
>  remote-curl.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

I can see how this feature may be useful, but I have to say the
external interface triply sucks.

 - If it is primarily for debugging smart HTTP, a solution with an
   environment variable without more permanent configuration
   variable would be sufficient, but then the environment variable
   is better named GIT_SMART_HTTP_DEBUG, or something no?

 - If it is useful outside the context of debugging, perhaps a per
   remote configuration variable remote.$name.$variable (or
   http.$prefix_of_server_url.$variable) might be necessary?

 - I do not see this as "fallback (to) curl"; you still talk your
   smart protocol over curl library.  "fallback to dumb http" is
   more understandable.  

In any case, I think CURL_FALLBACK was named with CURL in its name
primarily because the environment applies only to remote-curl, but
that means we cannot have any fallback logic other than the current
"smart does not work, fall back on dumb" in the future.

Here is a bit of rewrite.  [1/2] is yours but with a bit more
sensible name. [2/2] is entirely optional.


Junio C Hamano (1):
  remote-curl: make dumb-http fallback configurable per URL

Shawn O. Pearce (1):
  Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false

 remote-curl.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)

-- 
1.7.12.1.389.g3dff30b

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/2] Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false
  2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
@ 2012-09-20  4:14   ` Junio C Hamano
  2012-09-20  4:14   ` [PATCH 2/2] remote-curl: make dumb-http fallback configurable per URL Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20  4:14 UTC (permalink / raw)
  To: git; +Cc: Shawn O. Pearce

From: "Shawn O. Pearce" <spearce@spearce.org>

If the user doesn't want to use the dumb HTTP protocol, she may set
GIT_DUMB_HTTP_FALLBACK=false in the environment before invoking a
Git protocol operation. This is mostly useful when testing against
servers that are known to not support the dumb protocol. If the
smart service detection fails the client should not continue with
dumb behavior, but instead provide accurate HTTP failure data.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote-curl.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 3ec474f..f25cf3c 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -17,7 +17,8 @@ struct options {
 	unsigned progress : 1,
 		followtags : 1,
 		dry_run : 1,
-		thin : 1;
+		thin : 1,
+		fallback : 1;
 };
 static struct options options;
 
@@ -115,7 +116,8 @@ static struct discovery* discover_refs(const char *service)
 	http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
 
 	/* try again with "plain" url (no ? or & appended) */
-	if (http_ret != HTTP_OK && http_ret != HTTP_NOAUTH) {
+	if (options.fallback && http_ret != HTTP_OK
+	    && http_ret != HTTP_NOAUTH) {
 		free(refs_url);
 		strbuf_reset(&buffer);
 
@@ -853,6 +855,8 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+static const char DUMB_HTTP_FALLBACK_ENV[] = "GIT_DUMB_HTTP_FALLBACK";
+
 int main(int argc, const char **argv)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -868,6 +872,12 @@ int main(int argc, const char **argv)
 	options.verbosity = 1;
 	options.progress = !!isatty(2);
 	options.thin = 1;
+	options.fallback = 1;
+
+	if (getenv(DUMB_HTTP_FALLBACK_ENV)) {
+		char *fb = getenv(DUMB_HTTP_FALLBACK_ENV);
+		options.fallback = git_config_bool(DUMB_HTTP_FALLBACK_ENV, fb);
+	}
 
 	remote = remote_get(argv[1]);
 
-- 
1.7.12.1.389.g3dff30b

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/2] remote-curl: make dumb-http fallback configurable per URL
  2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
  2012-09-20  4:14   ` [PATCH 1/2] Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false Junio C Hamano
@ 2012-09-20  4:14   ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20  4:14 UTC (permalink / raw)
  To: git

Introduce http.$url_prefix.dumbhttpfallback configuration variables
so that users do not have to set GIT_DUMB_HTTP_FALLBACK environment
depending on which remote they are talking with.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote-curl.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index f25cf3c..44544c7 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -855,6 +855,46 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+struct dumb_fallback_cb {
+	const char *url;
+	int value;
+};
+
+static int dumb_fallback_cb(const char *key, const char *value, void *cb_)
+{
+	struct dumb_fallback_cb *cb = cb_;
+	int i, len;
+
+	/* Is this "http.$url.dumbHttpFallback"? */
+	if (prefixcmp(key, "http."))
+		return 0;
+	len = strrchr(key, '.') - key;
+	if (len <= 5)
+		/* we found the dot at the end of "http." */
+		return 0;
+	key += 5; /* skip over http. part */
+	len -= 5;
+	if (strcmp(key + len, ".dumbhttpfallback"))
+		return 0;
+
+	/* Does the $url part match? */
+	for (i = 0; i < len; i++)
+		if (cb->url[i] != key[i])
+			return 0;
+	cb->value = git_config_bool(key, value);
+	return 0;
+}
+
+static int dumb_fallback_config(const char *url)
+{
+	struct dumb_fallback_cb cb;
+
+	cb.url = url;
+	cb.value = 1; /* defaults to true */
+	git_config(dumb_fallback_cb, &cb);
+	return cb.value;
+}
+
 static const char DUMB_HTTP_FALLBACK_ENV[] = "GIT_DUMB_HTTP_FALLBACK";
 
 int main(int argc, const char **argv)
@@ -872,12 +912,6 @@ int main(int argc, const char **argv)
 	options.verbosity = 1;
 	options.progress = !!isatty(2);
 	options.thin = 1;
-	options.fallback = 1;
-
-	if (getenv(DUMB_HTTP_FALLBACK_ENV)) {
-		char *fb = getenv(DUMB_HTTP_FALLBACK_ENV);
-		options.fallback = git_config_bool(DUMB_HTTP_FALLBACK_ENV, fb);
-	}
 
 	remote = remote_get(argv[1]);
 
@@ -889,6 +923,13 @@ int main(int argc, const char **argv)
 
 	url = strbuf_detach(&buf, NULL);
 
+	if (getenv(DUMB_HTTP_FALLBACK_ENV)) {
+		char *fb = getenv(DUMB_HTTP_FALLBACK_ENV);
+		options.fallback = git_config_bool(DUMB_HTTP_FALLBACK_ENV, fb);
+	} else {
+		options.fallback = dumb_fallback_config(url);
+	}
+
 	http_init(remote, url, 0);
 
 	do {
-- 
1.7.12.1.389.g3dff30b

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  3:48 ` Jeff King
@ 2012-09-20  5:57   ` Shawn Pearce
  2012-09-20  5:58     ` [PATCH] Revert "retry request without query when info/refs?query fails" Shawn O. Pearce
  2012-09-20 17:24     ` [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Jeff King
  0 siblings, 2 replies; 36+ messages in thread
From: Shawn Pearce @ 2012-09-20  5:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Wed, Sep 19, 2012 at 8:48 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Sep 19, 2012 at 07:55:53PM -0700, Shawn O. Pearce wrote:
>
>> If the user doesn't want to use the dumb HTTP protocol, she may
>> set GIT_CURL_FALLBACK=0 in the environment before invoking a Git
>> protocol operation. This is mostly useful when testing against
>> servers that are known to not support the dumb protocol. If the
>> smart service detection fails the client should not continue with
>> dumb behavior, but instead provide accurate HTTP failure data.
>
> I have been looking into this recently, as well. GitHub does not allow
> dumb http at all these days,

Interesting that GitHub doesn't support dumb transfer either.

> so transient errors on the initial smart
> contact can cause us to fall back to dumb,

Transient errors is actually what is leading me down this path. We see
about 0.0455% of our requests to the Android hosting service as these
dumb fallbacks. This means the client never got a proper smart service
reply. Server logs suggest we sent a valid response, so I am
suspecting transient proxy errors, but its hard to debug because
clients discard the error.

> and end up reporting a
> totally useless 403 Forbidden error.

Today I posted a change to JGit [1] to make this 406 Not Acceptable as
I think it better matches the description of the failure. It also no
longer reuses the common 403 Forbidden error that a server might
choose to return if a request was given with invalid credentials.

[1] https://git.eclipse.org/r/7847

>  I guess Google Code has a similar
> issue.

Yes, and {android,kernel,gerrit}.googlesource.com also only support
the smart protocol.

> Note that it is not really do not "fall back to dumb"; we detect the
> dumb nature from the response. It is really "fall back to trying the URL
> without the query string, because there are some servers that cannot
> handle it". With your patch, we might still end up performing a dumb
> transfer.

Yes, the server could ignore the parameter and return a dumb info/refs
response on the initial request, forcing the client to a dumb
transfer. I forgot this case being common on a dumb server. I have
only been working with or worrying about smart servers, whose
transient failures have this info/refs request that they always
fail... giving a poor user experience.

> I think what you're doing here is sane, because you have to turn it on
> manually, and thus there are no possible backwards compatibility issues.
> But it might be nice to make things work better out of the box. Here are
> two client-side changes I've been toying with:
>
>   1. If both smart and dumb requests fail, report the error for the
>      smart request. Now that smart-http clients are common, I'd expect
>      most http servers to be smart these days. Of course I don't have
>      any sort of numbers to back this up (nor am I sure how to get them;
>      obviously big sites like GitHub and Google Code do a lot of
>      traffic, but who knows how many one-off repo-on-a-generic-web-host
>      sites still exist?).

I suspect there are still a number of servers that rely on dumb
protocol to host repositories because its very simple to setup.
Breaking support for this wouldn't be a good idea.

kernel.org and eclipse.org both support the smart protocol, but I
think this is a common trend. Sites that host a lot of Git
repositories take the time to setup the smart protocol. Everyone else,
its hit-or-miss, mostly miss.

>      An alternative would be to simply be more verbose, and mention that
>      we tried to fallback and list both failures (or we could do this
>      with just "fetch -v").

I would bet the big hosting providers would appreciate having the
failure listed in non-verbose mode. Many Git users are going to use a
hosting service like GitHub or Google Code, or work with their
organization like kernel.org for central Git hosting. Consumers of
hosted projects that are less familiar with Git will be accessing
these sites often enough.

I considered logging this failure to stderr, but didn't.

>   2. Be more discerning about which errors will cause a fallback.
>      Something like "504 Gateway Timeout" should not give a fallback.
>      The problem is that you are really guessing at what kinds of http
>      errors you are going to get from a dumb server when you try the
>      smart URL.
>
> I dug back into the list thread that spawned the "retry
>      without query string" patch (703e6e7).
>
>      The thread is here:
>
>        http://thread.gmane.org/gmane.comp.version-control.git/137609
>
>      If you read the thread, it turns out that the problem in this case
>      (which is the only reported case I could find in the archive) is
>      that the server was misconfigured to treat _anything_ with a query
>      string as a gitweb URL. And then it got fixed pretty much
>      immediately.
>
>      So as far as we know, there may be zero servers for which this
>      fallback is actually doing anything useful.
>
> I'm tempted to just reverse the logic.

After re-reading that thread, it was a mistake to apply 703e6e76. We
should just revert it.

> Try the request with the query
> string and immediately fail if it doesn't work. For the few (if any)
> people who are hitting a server that will not serve the dumb file in
> that case, add a "remote.*.dumbhttp" setting that will turn off smart
> completely as a workaround.
>
> That would serve the (presumed) majority who are using smart http,
> everyone using dumb http on a reasonably-configured server, and still
> allow an easy workaround for people with badly configured servers.
>
> What do you think?

Yes, this is the better idea. Revert 703e6e76 and add a new feature to
disable the smart protocol entirely as an escape hatch.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] Revert "retry request without query when info/refs?query fails"
  2012-09-20  5:57   ` Shawn Pearce
@ 2012-09-20  5:58     ` Shawn O. Pearce
  2012-09-20  6:29       ` Junio C Hamano
  2012-09-20 17:24     ` [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Jeff King
  1 sibling, 1 reply; 36+ messages in thread
From: Shawn O. Pearce @ 2012-09-20  5:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git, Shawn O. Pearce

From: "Shawn O. Pearce" <spearce@spearce.org>

This reverts commit 703e6e76a14825e5b0c960d525f34e607154b4f7.

Retrying without the query parameter was added as a workaround
for a single broken HTTP server at git.debian.org[1]. The server
was misconfigured to route every request with a query parameter
into gitweb.cgi. Admins fixed the server's configuration within
16 hours of the bug report to the Git mailing list, but we still
patched Git with this fallback and have been paying for it since.

Most Git hosting services configure the smart HTTP protocol and the
retry logic confuses users when there is a transient HTTP error as
Git dropped the real error from the smart HTTP request. Removing the
retry makes root causes easier to identify.

[1] http://thread.gmane.org/gmane.comp.version-control.git/137609
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 remote-curl.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 3ec474f..2359f59 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -95,7 +95,7 @@ static struct discovery* discover_refs(const char *service)
 	struct strbuf buffer = STRBUF_INIT;
 	struct discovery *last = last_discovery;
 	char *refs_url;
-	int http_ret, is_http = 0, proto_git_candidate = 1;
+	int http_ret, is_http = 0;
 
 	if (last && !strcmp(service, last->service))
 		return last;
@@ -113,19 +113,6 @@ static struct discovery* discover_refs(const char *service)
 	refs_url = strbuf_detach(&buffer, NULL);
 
 	http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
-
-	/* try again with "plain" url (no ? or & appended) */
-	if (http_ret != HTTP_OK && http_ret != HTTP_NOAUTH) {
-		free(refs_url);
-		strbuf_reset(&buffer);
-
-		proto_git_candidate = 0;
-		strbuf_addf(&buffer, "%sinfo/refs", url);
-		refs_url = strbuf_detach(&buffer, NULL);
-
-		http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
-	}
-
 	switch (http_ret) {
 	case HTTP_OK:
 		break;
@@ -144,8 +131,7 @@ static struct discovery* discover_refs(const char *service)
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-	if (is_http && proto_git_candidate
-		&& 5 <= last->len && last->buf[4] == '#') {
+	if (is_http && 5 <= last->len && last->buf[4] == '#') {
 		/* smart HTTP response; validate that the service
 		 * pkt-line matches our request.
 		 */
-- 
1.7.12.1.512.g9b230e6

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Revert "retry request without query when info/refs?query fails"
  2012-09-20  5:58     ` [PATCH] Revert "retry request without query when info/refs?query fails" Shawn O. Pearce
@ 2012-09-20  6:29       ` Junio C Hamano
  2012-09-20  6:31         ` Junio C Hamano
  2012-09-20 16:24         ` Jeff King
  0 siblings, 2 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20  6:29 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> This reverts commit 703e6e76a14825e5b0c960d525f34e607154b4f7.
>
> Retrying without the query parameter was added as a workaround
> for a single broken HTTP server at git.debian.org[1]. The server
> was misconfigured to route every request with a query parameter
> into gitweb.cgi. Admins fixed the server's configuration within
> 16 hours of the bug report to the Git mailing list, but we still
> patched Git with this fallback and have been paying for it since.

As the consequence of the above, the only two things we know about
the servers in the wild are (1) a misconfiguration that requires
this retry was once made, so it is not very unlikely others did the
same misconfiguration, and (2) those unknown number of servers have
been happily serving the current clients because the workaround
patch have been hiding the misconfiguration ever since.

But as long as the failure diagnosis from updated clients that
revert this workaround is sufficient to allow such misconfigured
servers, I think it is OK.  We might see a large number of small
people having to run around and fix the configuration as a fallout,
though.

> Most Git hosting services configure the smart HTTP protocol and the
> retry logic confuses users when there is a transient HTTP error as
> Git dropped the real error from the smart HTTP request. Removing the
> retry makes root causes easier to identify.

Does that hold true also for dumb only small people installations?
They are the ones that need more help than the large installations
staffed sufficiently and run smart http gateway.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Revert "retry request without query when info/refs?query fails"
  2012-09-20  6:29       ` Junio C Hamano
@ 2012-09-20  6:31         ` Junio C Hamano
  2012-09-20 16:24         ` Jeff King
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20  6:31 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, git

Junio C Hamano <gitster@pobox.com> writes:

> "Shawn O. Pearce" <spearce@spearce.org> writes:
>
>> This reverts commit 703e6e76a14825e5b0c960d525f34e607154b4f7.
>>
>> Retrying without the query parameter was added as a workaround
>> for a single broken HTTP server at git.debian.org[1]. The server
>> was misconfigured to route every request with a query parameter
>> into gitweb.cgi. Admins fixed the server's configuration within
>> 16 hours of the bug report to the Git mailing list, but we still
>> patched Git with this fallback and have been paying for it since.
>
> As the consequence of the above, the only two things we know about
> the servers in the wild are (1) a misconfiguration that requires
> this retry was once made, so it is not very unlikely others did the
> same misconfiguration, and (2) those unknown number of servers have
> been happily serving the current clients because the workaround
> patch have been hiding the misconfiguration ever since.
>
> But as long as the failure diagnosis from updated clients that
> revert this workaround is sufficient to allow such misconfigured
> servers, I think it is OK.  We might see a large number of small

s/servers,/servers diagnosed,/;

> people having to run around and fix the configuration as a fallout,
> though.
>
>> Most Git hosting services configure the smart HTTP protocol and the
>> retry logic confuses users when there is a transient HTTP error as
>> Git dropped the real error from the smart HTTP request. Removing the
>> retry makes root causes easier to identify.
>
> Does that hold true also for dumb only small people installations?
> They are the ones that need more help than the large installations
> staffed sufficiently and run smart http gateway.

In any case, will queue.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Revert "retry request without query when info/refs?query fails"
  2012-09-20  6:29       ` Junio C Hamano
  2012-09-20  6:31         ` Junio C Hamano
@ 2012-09-20 16:24         ` Jeff King
  2012-09-20 16:59           ` [PATCH 0/2] smart http toggle switch fails" Jeff King
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 16:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Wed, Sep 19, 2012 at 11:29:34PM -0700, Junio C Hamano wrote:

> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> > This reverts commit 703e6e76a14825e5b0c960d525f34e607154b4f7.
> >
> > Retrying without the query parameter was added as a workaround
> > for a single broken HTTP server at git.debian.org[1]. The server
> > was misconfigured to route every request with a query parameter
> > into gitweb.cgi. Admins fixed the server's configuration within
> > 16 hours of the bug report to the Git mailing list, but we still
> > patched Git with this fallback and have been paying for it since.
> 
> As the consequence of the above, the only two things we know about
> the servers in the wild are (1) a misconfiguration that requires
> this retry was once made, so it is not very unlikely others did the
> same misconfiguration, and (2) those unknown number of servers have
> been happily serving the current clients because the workaround
> patch have been hiding the misconfiguration ever since.

The misconfiguration was pretty wild in this case. I'd be much more
worried about stupidly non-compliant servers that will not serve
"foo/bar" when asked for "foo/bar?key=value".

> But as long as the failure diagnosis from updated clients that
> revert this workaround is sufficient to allow such misconfigured
> servers, I think it is OK.  We might see a large number of small
> people having to run around and fix the configuration as a fallout,
> though.

I think Shawn's revert is the right thing to do. But it is not complete
without the manual workaround. I'm putting that patch together now and
should have it out in a few minutes.

> > Most Git hosting services configure the smart HTTP protocol and the
> > retry logic confuses users when there is a transient HTTP error as
> > Git dropped the real error from the smart HTTP request. Removing the
> > retry makes root causes easier to identify.
> 
> Does that hold true also for dumb only small people installations?
> They are the ones that need more help than the large installations
> staffed sufficiently and run smart http gateway.

For the most part, yes. They will get a useful error out of the smart
request if there is a transient error, the repo does not exist, etc.
The real fallout is the people who are hitting a broken or misconfigured
server and may get a confusing error code (in the one case we know
about, it was a 404, but it really could be anything, depending on the
exact nature of the misconfiguration).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 0/2] smart http toggle switch fails"
  2012-09-20 16:24         ` Jeff King
@ 2012-09-20 16:59           ` Jeff King
  2012-09-20 17:00             ` [PATCH 1/2] remote-curl: rename is_http variable Jeff King
  2012-09-20 17:05             ` [PATCH 2/2] remote-curl: let users turn off smart http Jeff King
  0 siblings, 2 replies; 36+ messages in thread
From: Jeff King @ 2012-09-20 16:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Thu, Sep 20, 2012 at 12:24:56PM -0400, Jeff King wrote:

> I think Shawn's revert is the right thing to do. But it is not complete
> without the manual workaround. I'm putting that patch together now and
> should have it out in a few minutes.

And here it is. This goes on top of Shawn's revert patch (it might be
nice to mention the commit id of that in the commit message of the
second patch. I couldn't do so because it is not yet in your repo).

  [1/2]: remote-curl: rename is_http variable
  [2/2]: remote-curl: let users turn off smart http

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/2] remote-curl: rename is_http variable
  2012-09-20 16:59           ` [PATCH 0/2] smart http toggle switch fails" Jeff King
@ 2012-09-20 17:00             ` Jeff King
  2012-09-20 17:05             ` [PATCH 2/2] remote-curl: let users turn off smart http Jeff King
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff King @ 2012-09-20 17:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

We don't actually care whether the connection is http or
not; what we care about is whether it might be smart http.
Rename the variable to be more accurate, which will make it
easier to later make smart-http optional.

Signed-off-by: Jeff King <peff@peff.net>
---
 remote-curl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 2359f59..c0b98cc 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -95,7 +95,7 @@ static struct discovery* discover_refs(const char *service)
 	struct strbuf buffer = STRBUF_INIT;
 	struct discovery *last = last_discovery;
 	char *refs_url;
-	int http_ret, is_http = 0;
+	int http_ret, maybe_smart = 0;
 
 	if (last && !strcmp(service, last->service))
 		return last;
@@ -103,7 +103,7 @@ static struct discovery* discover_refs(const char *service)
 
 	strbuf_addf(&buffer, "%sinfo/refs", url);
 	if (!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) {
-		is_http = 1;
+		maybe_smart = 1;
 		if (!strchr(url, '?'))
 			strbuf_addch(&buffer, '?');
 		else
@@ -131,7 +131,7 @@ static struct discovery* discover_refs(const char *service)
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-	if (is_http && 5 <= last->len && last->buf[4] == '#') {
+	if (maybe_smart && 5 <= last->len && last->buf[4] == '#') {
 		/* smart HTTP response; validate that the service
 		 * pkt-line matches our request.
 		 */
-- 
1.7.11.7.15.g085c6bd

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 16:59           ` [PATCH 0/2] smart http toggle switch fails" Jeff King
  2012-09-20 17:00             ` [PATCH 1/2] remote-curl: rename is_http variable Jeff King
@ 2012-09-20 17:05             ` Jeff King
  2012-09-20 17:53               ` Junio C Hamano
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 17:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

Usually there is no need for users to specify whether an
http remote is smart or dumb; the protocol is designed so
that a single initial request is made, and the client can
determine the server's capability from the response.

However, some misconfigured dumb-only servers may not like
the initial request by a smart client, as it contains a
query string. Until recently, commit 703e6e7 worked around
this by making a second request. However, that commit was
recently reverted due to its side effect of masking the
initial request's error code.

This patch takes a different approach to the workaround. We
assume that the common case is that the server is either
smart-http or a reasonably configured dumb-http. If that is
not the case, we provide both a per-remote config option and
an environment variable with which the user can manually
work around the issue.

Signed-off-by: Jeff King <peff@peff.net>
---
I added the config item as remote.foo.smarthttp. You could also allow
"http.$url.smart" (and just "http.smart", for that matter), which could
be more flexible if you have multiple remotes pointing to the same
broken server. However, it is also more complex to use, and is a lot
more code. Since we don't know if any such servers even exist, I tried
to give the minimal escape hatch, and we can easily build more features
on it later if people complain.

 Documentation/config.txt | 11 +++++++++++
 remote-curl.c            |  3 ++-
 remote.c                 |  3 +++
 remote.h                 |  1 +
 t/t5551-http-fetch.sh    | 17 +++++++++++++++++
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6416cae..651b23c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1871,6 +1871,17 @@ remote.<name>.uploadpack::
 	The default program to execute on the remote side when fetching.  See
 	option \--upload-pack of linkgit:git-fetch-pack[1].
 
+remote.<name>.smartHttp::
+	If true, this remote will attempt to use git's smart http
+	protocol when making remote http requests. Normally git sends an
+	initial smart-http request, and falls back to the older "dumb"
+	protocol if the server does not claim to support the smart
+	protocol. However, some misconfigured dumb-only servers may
+	produce confusing results for the initial request. Setting this
+	option to false disables the initial smart request, which can
+	workaround problems with such servers. You should not generally
+	need to set this. Defaults to `true`.
+
 remote.<name>.tagopt::
 	Setting this value to \--no-tags disables automatic tag following when
 	fetching from remote <name>. Setting it to \--tags will fetch every
diff --git a/remote-curl.c b/remote-curl.c
index c0b98cc..8829bfb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -102,7 +102,8 @@ static struct discovery* discover_refs(const char *service)
 	free_discovery(last);
 
 	strbuf_addf(&buffer, "%sinfo/refs", url);
-	if (!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) {
+	if ((!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) &&
+	     git_env_bool("GIT_SMART_HTTP", remote->smart_http)) {
 		maybe_smart = 1;
 		if (!strchr(url, '?'))
 			strbuf_addch(&buffer, '?');
diff --git a/remote.c b/remote.c
index 04fd9ea..a334390 100644
--- a/remote.c
+++ b/remote.c
@@ -152,6 +152,7 @@ static struct remote *make_remote(const char *name, int len)
 		ret->name = xstrndup(name, len);
 	else
 		ret->name = xstrdup(name);
+	ret->smart_http = 1;
 	return ret;
 }
 
@@ -453,6 +454,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 					 key, value);
 	} else if (!strcmp(subkey, ".vcs")) {
 		return git_config_string(&remote->foreign_vcs, key, value);
+	} else if (!strcmp(subkey, ".smarthttp")) {
+		remote->smart_http = git_config_bool(key, value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index 251d8fd..9031d18 100644
--- a/remote.h
+++ b/remote.h
@@ -40,6 +40,7 @@ struct remote {
 	int fetch_tags;
 	int skip_default_update;
 	int mirror;
+	int smart_http;
 
 	const char *receivepack;
 	const char *uploadpack;
diff --git a/t/t5551-http-fetch.sh b/t/t5551-http-fetch.sh
index 2db5c35..48173ed 100755
--- a/t/t5551-http-fetch.sh
+++ b/t/t5551-http-fetch.sh
@@ -129,6 +129,23 @@ test_expect_success 'clone from auth-only-for-push repository' '
 	test_cmp expect actual
 '
 
+test_expect_success 'disable dumb http on server' '
+	git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+		config http.getanyfile false
+'
+
+test_expect_success 'GIT_SMART_HTTP can disable smart http' '
+	(GIT_SMART_HTTP=0 &&
+	 export GIT_SMART_HTTP &&
+	 cd clone &&
+	 test_must_fail git fetch)
+'
+
+test_expect_success 'remote.*.smartHTTP can disable smart http' '
+	(cd clone &&
+	 test_must_fail git -c remote.origin.smartHTTP=false fetch)
+'
+
 test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE
 
 test_expect_success EXPENSIVE 'create 50,000 tags in the repo' '
-- 
1.7.11.7.15.g085c6bd

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20  5:57   ` Shawn Pearce
  2012-09-20  5:58     ` [PATCH] Revert "retry request without query when info/refs?query fails" Shawn O. Pearce
@ 2012-09-20 17:24     ` Jeff King
  2012-09-20 23:05       ` Shawn Pearce
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 17:24 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Wed, Sep 19, 2012 at 10:57:35PM -0700, Shawn O. Pearce wrote:

> > I have been looking into this recently, as well. GitHub does not allow
> > dumb http at all these days,
> 
> Interesting that GitHub doesn't support dumb transfer either.

Our objects are still in regular repos, and is served by fairly stock
git. The main show-stopper is that we share objects via alternates, and
we aggressively repack the alternates repos. So a dumb client would end
up being quite inefficient.

We also toyed with having mixed public/private fork networks at one
point, which would obviously necessitate disabling dumb access. But we
gave it up as too risky, as you can do all sorts of weird tricks to
convince git to disclose information about what's in the private repos
(e.g., you can reliably find out which sha1s exist, and you can lie
about which sha1s you have to ask git to create deltas against them).

Doing a shared-object store right would require marking which repo
"knows" which objects (I assume that's what Google Code does, but I
don't remember if Dave talked about that aspect at last year's
GitTogether).

> > so transient errors on the initial smart
> > contact can cause us to fall back to dumb,
> 
> Transient errors is actually what is leading me down this path. We see
> about 0.0455% of our requests to the Android hosting service as these
> dumb fallbacks. This means the client never got a proper smart service
> reply. Server logs suggest we sent a valid response, so I am
> suspecting transient proxy errors, but its hard to debug because
> clients discard the error.

Yup, we see the same thing. I've tracked a few down manually to actual
things like gateway timeouts on our reverse proxy.

> > and end up reporting a
> > totally useless 403 Forbidden error.
> 
> Today I posted a change to JGit [1] to make this 406 Not Acceptable as
> I think it better matches the description of the failure. It also no
> longer reuses the common 403 Forbidden error that a server might
> choose to return if a request was given with invalid credentials.

That might be worth doing for git-http-backend, too. It might even make
sense for the git client to recognize the 406 (and possibly the 403) and
print a more useful message.

> >   1. If both smart and dumb requests fail, report the error for the
> >      smart request. Now that smart-http clients are common, I'd expect
> >      most http servers to be smart these days. Of course I don't have
> >      any sort of numbers to back this up (nor am I sure how to get them;
> >      obviously big sites like GitHub and Google Code do a lot of
> >      traffic, but who knows how many one-off repo-on-a-generic-web-host
> >      sites still exist?).
> 
> I suspect there are still a number of servers that rely on dumb
> protocol to host repositories because its very simple to setup.
> Breaking support for this wouldn't be a good idea.

I don't think it would break on most servers, though. Even for a dumb
server, the initial error will be a useful one. It's only the
weirdly-configured ones where you get wildly different results depending
on whether the query string is there.

In other words, it is really no worse than reverting 703e6e76, and it
might be better. In the common case, you get a better error message. In
the broken-server case, we still try the fallback. So it will keep
working on a broken server without any manual intervention, whereas
reverting and adding a manual escape hatch means the user has to do
something.

BUT.

I still think it's better to revert 703e6e76, because I really do think
the broken-server case is an extreme enough minority that it is not even
worth wasting the time of clients to make a second request (especially
because the first request may very well have failed because of a network
error that causes a long timeout, and the user then has to wait double
to find out what the error is).

Of course, I have no actual data aside from reading the original thread
that led to 703e6e76, and the fact that nobody else mentioned it (not
during the time when it was broken, and not even after, when people
often still complain because they haven't upgraded yet). But who knows?
Maybe I will eat my words, and we will end up getting that data in the
form of complaints. :)

We can always switch to fallback-but-prefer-the-initial-error then. And
we'll have more data on exactly how the misconfigured servers behave.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 17:05             ` [PATCH 2/2] remote-curl: let users turn off smart http Jeff King
@ 2012-09-20 17:53               ` Junio C Hamano
  2012-09-20 18:12                 ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20 17:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, git

Jeff King <peff@peff.net> writes:

> I added the config item as remote.foo.smarthttp. You could also allow
> "http.$url.smart" (and just "http.smart", for that matter), which could
> be more flexible if you have multiple remotes pointing to the same
> broken server.

What would the user experience be when we introduce "even smarter"
http server protocol extension?  Will we add remote.foo.starterhttp?

Perhaps

    remote.$name.httpvariants = [smart] [dumb]

to allow users to say "smart only", "dumb only", or "smart and/or
dumb" might be more code but less burden on the users.

The code obviously looks correct, and the documentation reads fine.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 17:53               ` Junio C Hamano
@ 2012-09-20 18:12                 ` Jeff King
  2012-09-20 18:36                   ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 18:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Thu, Sep 20, 2012 at 10:53:15AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I added the config item as remote.foo.smarthttp. You could also allow
> > "http.$url.smart" (and just "http.smart", for that matter), which could
> > be more flexible if you have multiple remotes pointing to the same
> > broken server.
> 
> What would the user experience be when we introduce "even smarter"
> http server protocol extension?  Will we add remote.foo.starterhttp?

I would hope that it would actually be negotiated reliably at the
protocol level so we do not have to deal with this mess again.

> Perhaps
> 
>     remote.$name.httpvariants = [smart] [dumb]
> 
> to allow users to say "smart only", "dumb only", or "smart and/or
> dumb" might be more code but less burden on the users.

I don't mind that format if we are going that direction, but is there
anybody who actually wants to say "smart only?"

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 18:12                 ` Jeff King
@ 2012-09-20 18:36                   ` Junio C Hamano
  2012-09-20 20:51                     ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20 18:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, git

Jeff King <peff@peff.net> writes:

> On Thu, Sep 20, 2012 at 10:53:15AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > I added the config item as remote.foo.smarthttp. You could also allow
>> > "http.$url.smart" (and just "http.smart", for that matter), which could
>> > be more flexible if you have multiple remotes pointing to the same
>> > broken server.
>> 
>> What would the user experience be when we introduce "even smarter"
>> http server protocol extension?  Will we add remote.foo.starterhttp?
>
> I would hope that it would actually be negotiated reliably at the
> protocol level so we do not have to deal with this mess again.

The original dumb vs smart was supposed to be "negotiated reliably
at the protocol level", no?  Yet we need this band-aid, so...

>> Perhaps
>> 
>>     remote.$name.httpvariants = [smart] [dumb]
>> 
>> to allow users to say "smart only", "dumb only", or "smart and/or
>> dumb" might be more code but less burden on the users.
>
> I don't mind that format if we are going that direction, but is there
> anybody who actually wants to say "smart only?"

With 703e6e7 reverted, we take a failure from the initial smart
request to mean the server is simply not serving, so "smart only" to
fail quickly without trying dumb fallback is not needed.  "smart
only" to say "I wouldn't want to talk to dumb-only server---I do not
have infinite amount of time, and I'd rather try another server" is
still a possibility, but likely not worth supporting.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 18:36                   ` Junio C Hamano
@ 2012-09-20 20:51                     ` Jeff King
  2012-09-20 21:15                       ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Thu, Sep 20, 2012 at 11:36:34AM -0700, Junio C Hamano wrote:

> >> What would the user experience be when we introduce "even smarter"
> >> http server protocol extension?  Will we add remote.foo.starterhttp?
> >
> > I would hope that it would actually be negotiated reliably at the
> > protocol level so we do not have to deal with this mess again.
> 
> The original dumb vs smart was supposed to be "negotiated reliably
> at the protocol level", no?  Yet we need this band-aid, so...

I started to write much more in my original response, but deleted it as
being too wordy. I guess I will have to rewrite it now. :)

The difference is that jumping from dumb to smart had to give context
clues at the HTTP layer. That is, by sending a query string, the client
sends a single bit that tells the server "I understand smart http", and
the server responds with output that indicates it also understands. We
had to embed this in the HTTP layer, because the previous iteration
wasn't running any custom git code at all.

Whereas if we were to enhance the protocol again, it would probably
_still_ begin with the same type of initial query, but we would
negotiate more at the git-protocol level. And there we are in charge of
how the implementation responds and handles backwards compatibility.

This has already happened to some degree. We have added new capabilities
at the git-protocol level, and it worked without these problems. It's
not a "new protocol", but it is a backwards-compatible enhancement. And
it's the likely mode for new enhancements in the future.

It's possible we could have something drastically different in the
future that does not even start with the same initial git conversation.
But even then, I think we'd do it with a new "git-upload-pack2" service
tag, or git:// and ssh access would be left behind.

> >> Perhaps
> >> 
> >>     remote.$name.httpvariants = [smart] [dumb]
> >> 
> >> to allow users to say "smart only", "dumb only", or "smart and/or
> >> dumb" might be more code but less burden on the users.
> >
> > I don't mind that format if we are going that direction, but is there
> > anybody who actually wants to say "smart only?"
> 
> With 703e6e7 reverted, we take a failure from the initial smart
> request to mean the server is simply not serving, so "smart only" to
> fail quickly without trying dumb fallback is not needed.  "smart
> only" to say "I wouldn't want to talk to dumb-only server---I do not
> have infinite amount of time, and I'd rather try another server" is
> still a possibility, but likely not worth supporting.

Yes. I do still need to resurrect my fetch-a-bundle-by-http code, which
could also be covered by such a switch. But I guess I am just not sure
if there is any point in spending effort to implement toggles that
nobody has actually asked for.

I'm also a little iffy on it because we would be inventing new config
syntax.  I don't think we want to split the list across multiple config
items (which makes our usual later-config-overwrites-earlier rules
behave badly). So what is the value format? Is it a whitespace-delimited
case-insensitive list completely specifying the transports allowed? What
happens if a new value is added. Do people who have said "smart" not get
the new value, even though all they really wanted to say was "not dumb"?
What about people who write "bundle smart" because their new
version of git understands it, but then have old versions of git barf on
it?

Most of our current config is very toggle-oriented, and I'm not sure
there is precedent for an option exactly like this. We can try to come
up with answers to those questions, but I don't think doing it is as
simple as just changing a few lines of code to support !dumb and !smart
modes.

I'm half-tempted to just drop the config entirely, leave
GIT_SMART_HTTP=false as an escape hatch, and see if anybody even cares.
At least then we're not promising support for a config option that we
may want to change later.

What do you want to do?

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 20:51                     ` Jeff King
@ 2012-09-20 21:15                       ` Junio C Hamano
  2012-09-20 21:30                         ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2012-09-20 21:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, git

Jeff King <peff@peff.net> writes:

> I'm half-tempted to just drop the config entirely, leave
> GIT_SMART_HTTP=false as an escape hatch, and see if anybody even cares.

Sounds like a very attractive minimalistic way to go forward.  We
can always add per-remote configuration when we find it necessary,
but once we add support, we cannot easily yank it out.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 21:15                       ` Junio C Hamano
@ 2012-09-20 21:30                         ` Jeff King
  2012-09-21 17:34                           ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-20 21:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Thu, Sep 20, 2012 at 02:15:20PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'm half-tempted to just drop the config entirely, leave
> > GIT_SMART_HTTP=false as an escape hatch, and see if anybody even cares.
> 
> Sounds like a very attractive minimalistic way to go forward.  We
> can always add per-remote configuration when we find it necessary,
> but once we add support, we cannot easily yank it out.

Like this?

-- >8 --
Subject: [PATCH] remote-curl: let users turn off smart http

Usually there is no need for users to specify whether an
http remote is smart or dumb; the protocol is designed so
that a single initial request is made, and the client can
determine the server's capability from the response.

However, some misconfigured dumb-only servers may not like
the initial request by a smart client, as it contains a
query string. Until recently, commit 703e6e7 worked around
this by making a second request. However, that commit was
recently reverted due to its side effect of masking the
initial request's error code.

Since git has had that workaround for several years, we
don't know exactly how many such misconfigured servers are
out there. The reversion of 703e6e7 assumes they are rare
enough not to worry about. Still, that reversion leaves
somebody who does run into such a server with no escape
hatch at all. Let's give them an environment variable they
can tweak to perform the "dumb" request.

This is intentionally not a documented interface. It's
overly simple and is really there for debugging in case
somebody does complain about git not working with their
server. A real user-facing interface would entail a
per-remote or per-URL config variable.

Signed-off-by: Jeff King <peff@peff.net>
---
 remote-curl.c         |  3 ++-
 t/t5551-http-fetch.sh | 12 ++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index c0b98cc..7b19ebb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -102,7 +102,8 @@ static struct discovery* discover_refs(const char *service)
 	free_discovery(last);

 	strbuf_addf(&buffer, "%sinfo/refs", url);
-	if (!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) {
+	if ((!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) &&
+	     git_env_bool("GIT_SMART_HTTP", 1)) {
 		maybe_smart = 1;
 		if (!strchr(url, '?'))
 			strbuf_addch(&buffer, '?');
diff --git a/t/t5551-http-fetch.sh b/t/t5551-http-fetch.sh
index 2db5c35..8427943 100755
--- a/t/t5551-http-fetch.sh
+++ b/t/t5551-http-fetch.sh
@@ -129,6 +129,18 @@ test_expect_success 'clone from auth-only-for-push repository' '
 	test_cmp expect actual
 '

+test_expect_success 'disable dumb http on server' '
+	git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+		config http.getanyfile false
+'
+
+test_expect_success 'GIT_SMART_HTTP can disable smart http' '
+	(GIT_SMART_HTTP=0 &&
+	 export GIT_SMART_HTTP &&
+	 cd clone &&
+	 test_must_fail git fetch)
+'
+
 test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE

 test_expect_success EXPENSIVE 'create 50,000 tags in the repo' '
-- 
1.7.11.7.15.g085c6bd

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20 17:24     ` [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Jeff King
@ 2012-09-20 23:05       ` Shawn Pearce
  2012-09-21  5:26         ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Shawn Pearce @ 2012-09-20 23:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Thu, Sep 20, 2012 at 10:24 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Sep 19, 2012 at 10:57:35PM -0700, Shawn O. Pearce wrote:
>
>> > so transient errors on the initial smart
>> > contact can cause us to fall back to dumb,
>>
>> Transient errors is actually what is leading me down this path. We see
>> about 0.0455% of our requests to the Android hosting service as these
>> dumb fallbacks. This means the client never got a proper smart service
>> reply. Server logs suggest we sent a valid response, so I am
>> suspecting transient proxy errors, but its hard to debug because
>> clients discard the error.
>
> Yup, we see the same thing. I've tracked a few down manually to actual
> things like gateway timeouts on our reverse proxy.

Our reverse proxies also fail sometimes. :-)

But right now I am seeing failures in libcurl's SSL connection that
may also be causing the smart connection failures. For example this
trace, where libcurl was just not able to connect to respond to the
401 with a password. I suspect what is happening is the SSL session
dropped out of cache on our servers, and libcurl couldn't reuse the
existing SSL session. Instead of discarding the bad session and
retrying, Git aborts. I'm willing to bet modern browsers just discard
the bad session and start a new one, because clients can't assume the
remote server will be able to remember their session forever.

* Couldn't find host android.googlesource.com in the .netrc file; using defaults
* About to connect() to android.googlesource.com port 443 (#0)
*   Trying 2607:f8b0:400e:c02::52... * Connected to
android.googlesource.com (2607:f8b0:400e:c02::52) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSL connection using RC4-SHA
* Server certificate:
*        subject: C=US; ST=California; L=Mountain View; O=Google Inc;
CN=*.googlecode.com
*        start date: 2012-08-16 12:25:39 GMT
*        expire date: 2013-06-07 19:43:27 GMT
*        subjectAltName: android.googlesource.com matched
*        issuer: C=US; O=Google Inc; CN=Google Internet Authority
*        SSL certificate verify ok.
> GET /a/platform/tools/build/info/refs?service=git-upload-pack HTTP/1.1
User-Agent: git/1.7.12.1.1.g9b7ccb3
Host: android.googlesource.com
Accept: */*
Pragma: no-cache

* The requested URL returned error: 401
* Closing connection #0
* Couldn't find host android.googlesource.com in the .netrc file; using defaults
* About to connect() to android.googlesource.com port 443 (#0)
*   Trying 2607:f8b0:400e:c02::52... * Connected to
android.googlesource.com (2607:f8b0:400e:c02::52) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSL re-using session ID
* Unknown SSL protocol error in connection to android.googlesource.com:443
* Expire cleared
* Closing connection #0
error: Unknown SSL protocol error in connection to
android.googlesource.com:443  while accessing
https://android.googlesource.com/a/platform/tools/build/info/refs?service=git-upload-pack
fatal: HTTP request failed

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-20 23:05       ` Shawn Pearce
@ 2012-09-21  5:26         ` Jeff King
  2012-09-21 14:19           ` Shawn Pearce
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-09-21  5:26 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Thu, Sep 20, 2012 at 04:05:03PM -0700, Shawn O. Pearce wrote:

> But right now I am seeing failures in libcurl's SSL connection that
> may also be causing the smart connection failures. For example this
> trace, where libcurl was just not able to connect to respond to the
> 401 with a password. I suspect what is happening is the SSL session
> dropped out of cache on our servers, and libcurl couldn't reuse the
> existing SSL session. Instead of discarding the bad session and
> retrying, Git aborts. I'm willing to bet modern browsers just discard
> the bad session and start a new one, because clients can't assume the
> remote server will be able to remember their session forever.

That's something I haven't seen. But then, I don't usually see the
client side; I just see the fallback dumb fetch in our logs, and
have occasionally followed up.

Is there a long pause while the user is typing their password?

> * SSL re-using session ID
> * Unknown SSL protocol error in connection to android.googlesource.com:443
> * Expire cleared
> * Closing connection #0
> error: Unknown SSL protocol error in connection to
> android.googlesource.com:443  while accessing
> https://android.googlesource.com/a/platform/tools/build/info/refs?service=git-upload-pack
> fatal: HTTP request failed

You could try turning off CURLOPT_SSL_SESSIONID_CACHE and seeing if that
improves it. Of course, it is probably hard to reproduce, so it would be
tough to know if that helped or not. It would also be nice if you could
dump more information on the error from the ssl library (I typically
build curl against openssl; I wonder if it could be related to using
gnutls or something).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
  2012-09-21  5:26         ` Jeff King
@ 2012-09-21 14:19           ` Shawn Pearce
  2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
  0 siblings, 1 reply; 36+ messages in thread
From: Shawn Pearce @ 2012-09-21 14:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Thu, Sep 20, 2012 at 10:26 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Sep 20, 2012 at 04:05:03PM -0700, Shawn O. Pearce wrote:
>
>> But right now I am seeing failures in libcurl's SSL connection that
>> may also be causing the smart connection failures. For example this
>> trace, where libcurl was just not able to connect to respond to the
>> 401 with a password. I suspect what is happening is the SSL session
>> dropped out of cache on our servers, and libcurl couldn't reuse the
>> existing SSL session. Instead of discarding the bad session and
>> retrying, Git aborts. I'm willing to bet modern browsers just discard
>> the bad session and start a new one, because clients can't assume the
>> remote server will be able to remember their session forever.
>
> That's something I haven't seen. But then, I don't usually see the
> client side; I just see the fallback dumb fetch in our logs, and
> have occasionally followed up.

I hadn't seen this either until I deleted the fallback code from
remote-curl.c and ran git ls-remote in a while true loop for 6 hours.
Its obviously happening though.

> Is there a long pause while the user is typing their password?

No. The password comes off a credential helper that has access to it
from a credential store. There is very little lag here, under 100 ms.

>> * SSL re-using session ID
>> * Unknown SSL protocol error in connection to android.googlesource.com:443
>> * Expire cleared
>> * Closing connection #0
>> error: Unknown SSL protocol error in connection to
>> android.googlesource.com:443  while accessing
>> https://android.googlesource.com/a/platform/tools/build/info/refs?service=git-upload-pack
>> fatal: HTTP request failed
>
> You could try turning off CURLOPT_SSL_SESSIONID_CACHE and seeing if that
> improves it. Of course, it is probably hard to reproduce, so it would be
> tough to know if that helped or not. It would also be nice if you could
> dump more information on the error from the ssl library (I typically
> build curl against openssl; I wonder if it could be related to using
> gnutls or something).

This is OpenSSL, because I also always build against OpenSSL.  :-)

I'll try the CURLOPT_SSL_SESSIONID_CACHE today. It is hard to
reproduce, so not producing it doesn't necessarily mean it isn't still
there.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-20 21:30                         ` Jeff King
@ 2012-09-21 17:34                           ` Junio C Hamano
  2012-09-21 17:41                             ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2012-09-21 17:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, git

Jeff King <peff@peff.net> writes:

> On Thu, Sep 20, 2012 at 02:15:20PM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > I'm half-tempted to just drop the config entirely, leave
>> > GIT_SMART_HTTP=false as an escape hatch, and see if anybody even cares.
>> 
>> Sounds like a very attractive minimalistic way to go forward.  We
>> can always add per-remote configuration when we find it necessary,
>> but once we add support, we cannot easily yank it out.
>
> Like this?

Yeah.  Will queue this one instead.  The simpler, the better ;-)

>
> -- >8 --
> Subject: [PATCH] remote-curl: let users turn off smart http
>
> Usually there is no need for users to specify whether an
> http remote is smart or dumb; the protocol is designed so
> that a single initial request is made, and the client can
> determine the server's capability from the response.
>
> However, some misconfigured dumb-only servers may not like
> the initial request by a smart client, as it contains a
> query string. Until recently, commit 703e6e7 worked around
> this by making a second request. However, that commit was
> recently reverted due to its side effect of masking the
> initial request's error code.
>
> Since git has had that workaround for several years, we
> don't know exactly how many such misconfigured servers are
> out there. The reversion of 703e6e7 assumes they are rare
> enough not to worry about. Still, that reversion leaves
> somebody who does run into such a server with no escape
> hatch at all. Let's give them an environment variable they
> can tweak to perform the "dumb" request.
>
> This is intentionally not a documented interface. It's
> overly simple and is really there for debugging in case
> somebody does complain about git not working with their
> server. A real user-facing interface would entail a
> per-remote or per-URL config variable.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  remote-curl.c         |  3 ++-
>  t/t5551-http-fetch.sh | 12 ++++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/remote-curl.c b/remote-curl.c
> index c0b98cc..7b19ebb 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -102,7 +102,8 @@ static struct discovery* discover_refs(const char *service)
>  	free_discovery(last);
>  
>  	strbuf_addf(&buffer, "%sinfo/refs", url);
> -	if (!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) {
> +	if ((!prefixcmp(url, "http://") || !prefixcmp(url, "https://")) &&
> +	     git_env_bool("GIT_SMART_HTTP", 1)) {
>  		maybe_smart = 1;
>  		if (!strchr(url, '?'))
>  			strbuf_addch(&buffer, '?');
> diff --git a/t/t5551-http-fetch.sh b/t/t5551-http-fetch.sh
> index 2db5c35..8427943 100755
> --- a/t/t5551-http-fetch.sh
> +++ b/t/t5551-http-fetch.sh
> @@ -129,6 +129,18 @@ test_expect_success 'clone from auth-only-for-push repository' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'disable dumb http on server' '
> +	git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
> +		config http.getanyfile false
> +'
> +
> +test_expect_success 'GIT_SMART_HTTP can disable smart http' '
> +	(GIT_SMART_HTTP=0 &&
> +	 export GIT_SMART_HTTP &&
> +	 cd clone &&
> +	 test_must_fail git fetch)
> +'
> +
>  test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE
>  
>  test_expect_success EXPENSIVE 'create 50,000 tags in the repo' '

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] remote-curl: let users turn off smart http
  2012-09-21 17:34                           ` Junio C Hamano
@ 2012-09-21 17:41                             ` Jeff King
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff King @ 2012-09-21 17:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Fri, Sep 21, 2012 at 10:34:22AM -0700, Junio C Hamano wrote:

> >> > I'm half-tempted to just drop the config entirely, leave
> >> > GIT_SMART_HTTP=false as an escape hatch, and see if anybody even cares.
> >> 
> >> Sounds like a very attractive minimalistic way to go forward.  We
> >> can always add per-remote configuration when we find it necessary,
> >> but once we add support, we cannot easily yank it out.
> >
> > Like this?
> 
> Yeah.  Will queue this one instead.  The simpler, the better ;-)

Thanks. I almost followed up with a rebased version of my config patch,
should we want to apply it later separately. But I think I would really
rather gather data from even a single bug report before we move any
further (and with any luck, there will be zero bug reports :) ).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] Retry HTTP requests on SSL connect failures
  2012-09-21 14:19           ` Shawn Pearce
@ 2012-10-01 21:23             ` Shawn O. Pearce
  2012-10-01 21:47               ` Junio C Hamano
                                 ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Shawn O. Pearce @ 2012-10-01 21:23 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git, Shawn O. Pearce

From: "Shawn O. Pearce" <spearce@spearce.org>

When libcurl fails to connect to an SSL server always retry the
request once. Since the connection failed before the HTTP headers
can be sent, no data has exchanged hands, so the remote side has
not learned of the request and will not perform it twice.

In the wild we have seen git-remote-https fail to connect to
some load-balanced SSL servers sporadically, while modern popular
browsers (e.g. Firefox and Chromium) have no trouble with the same
server pool.

Lets assume the site operators (Hi Google!) have a clue and are
doing everything they already can to ensure secure, successful
SSL connections from a wide range of HTTP clients. Implementing a
single level of retry in the client can make it more robust against
transient failure modes.
---
 http.c        | 19 ++++++++++++-------
 remote-curl.c |  2 ++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/http.c b/http.c
index 345c171..953f2e6 100644
--- a/http.c
+++ b/http.c
@@ -784,7 +784,7 @@ static int http_request(const char *url, void *result, int target, int options)
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
-	int ret;
+	int ret, attempts;
 
 	slot = get_active_slot();
 	slot->results = &results;
@@ -820,12 +820,17 @@ static int http_request(const char *url, void *result, int target, int options)
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
 
-	if (start_active_slot(slot)) {
-		run_active_slot(slot);
-		ret = handle_curl_result(slot);
-	} else {
-		error("Unable to start HTTP request for %s", url);
-		ret = HTTP_START_FAILED;
+	for (attempts = 0; attempts < 2; attempts++) {
+		if (start_active_slot(slot)) {
+			run_active_slot(slot);
+			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
+				continue;
+			ret = handle_curl_result(slot);
+		} else {
+			error("Unable to start HTTP request for %s", url);
+			ret = HTTP_START_FAILED;
+		}
+		break;
 	}
 
 	curl_slist_free_all(headers);
diff --git a/remote-curl.c b/remote-curl.c
index a269608..04a379c 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
 
 	slot->results = &results;
 	slot->curl_result = curl_easy_perform(slot->curl);
+	if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
+		slot->curl_result = curl_easy_perform(slot->curl);
 	finish_active_slot(slot);
 
 	err = handle_curl_result(slot);
-- 
1.7.12.1.590.g4bb1bc4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
@ 2012-10-01 21:47               ` Junio C Hamano
  2012-10-01 21:53               ` Junio C Hamano
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-10-01 21:47 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Lets assume the site operators (Hi Google!) have a clue and are
> doing everything they already can to ensure secure, successful
> SSL connections from a wide range of HTTP clients. Implementing a
> single level of retry in the client can make it more robust against
> transient failure modes.
> ---

Sign off?

>  http.c        | 19 ++++++++++++-------
>  remote-curl.c |  2 ++
>  2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/http.c b/http.c
> index 345c171..953f2e6 100644
> --- a/http.c
> +++ b/http.c
> @@ -784,7 +784,7 @@ static int http_request(const char *url, void *result, int target, int options)
>  	struct slot_results results;
>  	struct curl_slist *headers = NULL;
>  	struct strbuf buf = STRBUF_INIT;
> -	int ret;
> +	int ret, attempts;
>  
>  	slot = get_active_slot();
>  	slot->results = &results;
> @@ -820,12 +820,17 @@ static int http_request(const char *url, void *result, int target, int options)
>  	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
>  	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
>  
> -	if (start_active_slot(slot)) {
> -		run_active_slot(slot);
> -		ret = handle_curl_result(slot);
> -	} else {
> -		error("Unable to start HTTP request for %s", url);
> -		ret = HTTP_START_FAILED;
> +	for (attempts = 0; attempts < 2; attempts++) {
> +		if (start_active_slot(slot)) {
> +			run_active_slot(slot);
> +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
> +				continue;
> +			ret = handle_curl_result(slot);
> +		} else {
> +			error("Unable to start HTTP request for %s", url);
> +			ret = HTTP_START_FAILED;
> +		}
> +		break;
>  	}

Two naïve questions, that applies to this and the one in remote-curl.c::run_slot().

 (1) why only twice?
 (2) no need for "wait a bit and then retry"?

> diff --git a/remote-curl.c b/remote-curl.c
> index a269608..04a379c 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
>  
>  	slot->results = &results;
>  	slot->curl_result = curl_easy_perform(slot->curl);
> +	if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
> +		slot->curl_result = curl_easy_perform(slot->curl);
>  	finish_active_slot(slot);
>  
>  	err = handle_curl_result(slot);

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
  2012-10-01 21:47               ` Junio C Hamano
@ 2012-10-01 21:53               ` Junio C Hamano
  2012-10-01 22:23                 ` Jeff King
  2012-10-01 22:18               ` Jeff King
  2012-10-02  0:14               ` Drew Northup
  3 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2012-10-01 21:53 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> +	for (attempts = 0; attempts < 2; attempts++) {
> +		if (start_active_slot(slot)) {
> +			run_active_slot(slot);
> +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
> +				continue;

Is it safe to continue and let start_active_slot() to add the same
curl handle again when USE_CURL_MULTI is in effect?

> +			ret = handle_curl_result(slot);
> +		} else {
> +			error("Unable to start HTTP request for %s", url);
> +			ret = HTTP_START_FAILED;
> +		}
> +		break;
>  	}
>  
>  	curl_slist_free_all(headers);
> diff --git a/remote-curl.c b/remote-curl.c
> index a269608..04a379c 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
>  
>  	slot->results = &results;
>  	slot->curl_result = curl_easy_perform(slot->curl);
> +	if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
> +		slot->curl_result = curl_easy_perform(slot->curl);
>  	finish_active_slot(slot);
>  
>  	err = handle_curl_result(slot);

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
  2012-10-01 21:47               ` Junio C Hamano
  2012-10-01 21:53               ` Junio C Hamano
@ 2012-10-01 22:18               ` Jeff King
  2012-10-02  2:38                 ` Shawn Pearce
  2012-10-02  0:14               ` Drew Northup
  3 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-10-01 22:18 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

On Mon, Oct 01, 2012 at 02:23:06PM -0700, Shawn O. Pearce wrote:

> From: "Shawn O. Pearce" <spearce@spearce.org>
> 
> When libcurl fails to connect to an SSL server always retry the
> request once. Since the connection failed before the HTTP headers
> can be sent, no data has exchanged hands, so the remote side has
> not learned of the request and will not perform it twice.
> 
> In the wild we have seen git-remote-https fail to connect to
> some load-balanced SSL servers sporadically, while modern popular
> browsers (e.g. Firefox and Chromium) have no trouble with the same
> server pool.
> 
> Lets assume the site operators (Hi Google!) have a clue and are
> doing everything they already can to ensure secure, successful
> SSL connections from a wide range of HTTP clients. Implementing a
> single level of retry in the client can make it more robust against
> transient failure modes.

I find this a little distasteful just because we haven't figured out the
actual _reason_ for the failure. That is, I'm not convinced this isn't
something that curl or the ssl library can't handle internally if we
would only configure them correctly. Did you ever follow up on tweaking
the session caching options for curl?

Have you tried running your fails-after-a-few-hours request with other
clients that don't have the problem and seeing what they do (I'm
thinking a small webkit harness or something would be the most
feasible)?

That being said, you did make it so that it only kicks in during ssl
connect errors:

> +	for (attempts = 0; attempts < 2; attempts++) {
> +		if (start_active_slot(slot)) {
> +			run_active_slot(slot);
> +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
> +				continue;
> +			ret = handle_curl_result(slot);
> +		} else {
> +			error("Unable to start HTTP request for %s", url);
> +			ret = HTTP_START_FAILED;
> +		}
> +		break;

which means it shouldn't really be affecting the general populace. So
even though it feels like a dirty hack, at least it is self-contained,
and it does fix a real-world problem. If your answer to the above
questions is "hunting this further is just not worth the effort", I can
live with that.

> diff --git a/remote-curl.c b/remote-curl.c
> index a269608..04a379c 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
>  
>  	slot->results = &results;
>  	slot->curl_result = curl_easy_perform(slot->curl);
> +	if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
> +		slot->curl_result = curl_easy_perform(slot->curl);
>  	finish_active_slot(slot);

How come the first hunk gets a nice for-loop and this one doesn't?

Also, are these hunks the only two spots where this error can come up?
The first one does http_request, which handles smart-http GET requests.
the second does run_slot, which handles smart-http POST requests.

Some of the dumb http fetches will go through http_request. But some
will not. And I think almost none of dumb http push will.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 21:53               ` Junio C Hamano
@ 2012-10-01 22:23                 ` Jeff King
  2012-10-01 23:20                   ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2012-10-01 22:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Mon, Oct 01, 2012 at 02:53:17PM -0700, Junio C Hamano wrote:

> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> > +	for (attempts = 0; attempts < 2; attempts++) {
> > +		if (start_active_slot(slot)) {
> > +			run_active_slot(slot);
> > +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
> > +				continue;
> 
> Is it safe to continue and let start_active_slot() to add the same
> curl handle again when USE_CURL_MULTI is in effect?

I _think_ so. We reuse the slots anyway. So the usual workflow would be
get_active_slot, then start_active_slot, then run_active_slot. This loop
omits get_active_slot, which is responsible for (re-)initializing a
bunch of aspects of the slot. But we wouldn't want that here, since it
would mean we'd have to set up our URL, callbacks, etc, again.

My only worry would be that the failed curl request actually ended up
writing some data or made some other state change. But since we are
explicitly catching only ssl connection failures, presumably that would
not have happened.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 22:23                 ` Jeff King
@ 2012-10-01 23:20                   ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2012-10-01 23:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, git

Jeff King <peff@peff.net> writes:

> On Mon, Oct 01, 2012 at 02:53:17PM -0700, Junio C Hamano wrote:
>
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>> 
>> > +	for (attempts = 0; attempts < 2; attempts++) {
>> > +		if (start_active_slot(slot)) {
>> > +			run_active_slot(slot);
>> > +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
>> > +				continue;
>> 
>> Is it safe to continue and let start_active_slot() to add the same
>> curl handle again when USE_CURL_MULTI is in effect?
>
> I _think_ so. 

It seems that at the beginning of curl_multi_add_handle() there is a
check to see if the incoming slot->curl has already been added to
some curl-multi-handle and the function would return an error code
CURLM_BAD_EASY_HANDLE without doing anything useful.  Doesn't the
second attempt to call start_active_slot() set the slot->in_use to
zero and return false, skipping the call to run_active_slot() in
that case?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
                                 ` (2 preceding siblings ...)
  2012-10-01 22:18               ` Jeff King
@ 2012-10-02  0:14               ` Drew Northup
  3 siblings, 0 replies; 36+ messages in thread
From: Drew Northup @ 2012-10-02  0:14 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Jeff King, git

On Mon, Oct 1, 2012 at 5:23 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> From: "Shawn O. Pearce" <spearce@spearce.org>
>
> When libcurl fails to connect to an SSL server always retry the
> request once. Since the connection failed before the HTTP headers
> can be sent, no data has exchanged hands, so the remote side has
> not learned of the request and will not perform it twice.
>
> In the wild we have seen git-remote-https fail to connect to
> some load-balanced SSL servers sporadically, while modern popular
> browsers (e.g. Firefox and Chromium) have no trouble with the same
> server pool.
>
> Lets assume the site operators (Hi Google!) have a clue and are
> doing everything they already can to ensure secure, successful
> SSL connections from a wide range of HTTP clients. Implementing a
> single level of retry in the client can make it more robust against
> transient failure modes.

Ok, this begs for some background info...
@Dayjob one of the many things I do is mange our load balancers
(redundant pair in our case). If the attempted SSL connections in one
"bin" (time-slot) exceeds the licensed size of that "bin" then the
excess attempts are just "dropped on the floor." Normal web browsers
detect this initial failure and try again. This may be implemented
internally—I haven't checked.

Google, as I am sure you are well aware, doesn't rely upon a
traditional L2/L3 network level load balancing architecture.
Therefore, I would not attempt to argue that the results that apply to
their systems would apply much of anywhere else. (They have done
presentations publicly, which are archived on the 'net, about how they
do things.)

-- 
-Drew Northup
--------------------------------------------------------------
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-01 22:18               ` Jeff King
@ 2012-10-02  2:38                 ` Shawn Pearce
  2012-10-02 13:57                   ` Drew Northup
  0 siblings, 1 reply; 36+ messages in thread
From: Shawn Pearce @ 2012-10-02  2:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Mon, Oct 1, 2012 at 3:18 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Oct 01, 2012 at 02:23:06PM -0700, Shawn O. Pearce wrote:
>>
>> When libcurl fails to connect to an SSL server always retry the
>> request once. Since the connection failed before the HTTP headers
>> can be sent, no data has exchanged hands, so the remote side has
>> not learned of the request and will not perform it twice.
>
> I find this a little distasteful just because we haven't figured out the
> actual _reason_ for the failure. That is, I'm not convinced this isn't
> something that curl or the ssl library can't handle internally if we
> would only configure them correctly. Did you ever follow up on tweaking
> the session caching options for curl?

No. I didn't try because I reproduced the issue on the initial "GET
/.../info/refs?service=git-upload-pack" request with no authentication
required. So the very first thing the remote-https process did was
fail on an SSL error. During this run I was using a patched Git that
had a different version of the retry logic, but it immediately retried
and the retry was successful. At that point I decided the SSL session
cache wasn't possibly relevant since the first request failed and the
immediate retry was OK.

> Have you tried running your fails-after-a-few-hours request with other
> clients that don't have the problem and seeing what they do

This is harder to reproduce than you think. It took me about 5 days of
continuous polling to reproduce the error. And I have thus far only
reproduced it against our production servers. This makes it very hard
to test anything. Or to prove that any given patch is better than a
different version.

> (I'm
> thinking a small webkit harness or something would be the most
> feasible)?

So I suspect the contrib/persistent-https proxy thing in Go actually
papers over this problem by having the Go SSL client handle the
connection. But this is only based on a test I ran for several days
through that proxy that did not reproduce the bug. This doesn't mean
it doesn't reproduce with the proxy, it just means _I_ didn't get
lucky with an error in a ~48 hour run.

> which means it shouldn't really be affecting the general populace. So
> even though it feels like a dirty hack, at least it is self-contained,
> and it does fix a real-world problem. If your answer to the above
> questions is "hunting this further is just not worth the effort", I can
> live with that.

I am sort of at that point, but the hack is so ugly... yea, we
shouldn't have to do this. Or pollute our code with it. I'm willing to
go back and iterate on this further, but its going to be a while
before I can provide any more information.

>> diff --git a/remote-curl.c b/remote-curl.c
>> index a269608..04a379c 100644
>> --- a/remote-curl.c
>> +++ b/remote-curl.c
>> @@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
>>
>>       slot->results = &results;
>>       slot->curl_result = curl_easy_perform(slot->curl);
>> +     if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
>> +             slot->curl_result = curl_easy_perform(slot->curl);
>>       finish_active_slot(slot);
>
> How come the first hunk gets a nice for-loop and this one doesn't?

Both hunks retry exactly once after an SSL connect error. I just tried
to pick something reasonably clean to implement. This hunk seemed
simple with the if, the other was uglier and a loop seemed the most
simple way to get a retry in there.

> Also, are these hunks the only two spots where this error can come up?
> The first one does http_request, which handles smart-http GET requests.
> the second does run_slot, which handles smart-http POST requests.

Grrr. I thought I caught all of the curl perform calls but I guess I
missed the dumb transport.

> Some of the dumb http fetches will go through http_request. But some
> will not. And I think almost none of dumb http push will.

Well, don't use those? :-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Retry HTTP requests on SSL connect failures
  2012-10-02  2:38                 ` Shawn Pearce
@ 2012-10-02 13:57                   ` Drew Northup
  0 siblings, 0 replies; 36+ messages in thread
From: Drew Northup @ 2012-10-02 13:57 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Jeff King, Junio C Hamano, git

On Mon, Oct 1, 2012 at 10:38 PM, Shawn Pearce <spearce@spearce.org> wrote:
> On Mon, Oct 1, 2012 at 3:18 PM, Jeff King <peff@peff.net> wrote:
>> On Mon, Oct 01, 2012 at 02:23:06PM -0700, Shawn O. Pearce wrote:
>>>
>>> When libcurl fails to connect to an SSL server always retry the
>>> request once. Since the connection failed before the HTTP headers
>>> can be sent, no data has exchanged hands, so the remote side has
>>> not learned of the request and will not perform it twice.
>>
>> I find this a little distasteful just because we haven't figured out the
>> actual _reason_ for the failure.
>
> No. I didn't try because I reproduced the issue on the initial "GET
> /.../info/refs?service=git-upload-pack" request with no authentication
> required. So the very first thing the remote-https process did was
> fail on an SSL error. During this run I was using a patched Git that
> had a different version of the retry logic, but it immediately retried
> and the retry was successful. At that point I decided the SSL session
> cache wasn't possibly relevant since the first request failed and the
> immediate retry was OK.
>
>> Have you tried running your fails-after-a-few-hours request with other
>> clients that don't have the problem and seeing what they do
>
> This is harder to reproduce than you think. It took me about 5 days of
> continuous polling to reproduce the error. And I have thus far only
> reproduced it against our production servers. This makes it very hard
> to test anything. Or to prove that any given patch is better than a
> different version.

The only sure way to make sure your patch works is to get your load
balancers Slashdotted first (reason noted in my previous mail on this
subject). For the sake of your relationship with your networking crew
I'd not advise doing that intentionally.


>> which means it shouldn't really be affecting the general populace. So
>> even though it feels like a dirty hack, at least it is self-contained,
>> and it does fix a real-world problem. If your answer to the above
>> questions is "hunting this further is just not worth the effort", I can
>> live with that.
>
> I am sort of at that point, but the hack is so ugly... yea, we
> shouldn't have to do this. Or pollute our code with it. I'm willing to
> go back and iterate on this further, but its going to be a while
> before I can provide any more information.

>> How come the first hunk gets a nice for-loop and this one doesn't?
>
> Both hunks retry exactly once after an SSL connect error. I just tried
> to pick something reasonably clean to implement. This hunk seemed
> simple with the if, the other was uglier and a loop seemed the most
> simple way to get a retry in there.

If indeed the problem you are having is with a load balanced setup
then applying TCP/IP like back-off semantics is the right way to go.
The only reason the network stack isn't doing it for you is because
the load balancers wait for the SSL/TLS start before dumping the
"excess" (exceeding of license) SSL connections.

-- 
-Drew Northup
--------------------------------------------------------------
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2012-10-02 13:57 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-20  2:55 [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn O. Pearce
2012-09-20  3:22 ` Shawn Pearce
2012-09-20  3:52   ` Jeff King
2012-09-20  3:48 ` Jeff King
2012-09-20  5:57   ` Shawn Pearce
2012-09-20  5:58     ` [PATCH] Revert "retry request without query when info/refs?query fails" Shawn O. Pearce
2012-09-20  6:29       ` Junio C Hamano
2012-09-20  6:31         ` Junio C Hamano
2012-09-20 16:24         ` Jeff King
2012-09-20 16:59           ` [PATCH 0/2] smart http toggle switch fails" Jeff King
2012-09-20 17:00             ` [PATCH 1/2] remote-curl: rename is_http variable Jeff King
2012-09-20 17:05             ` [PATCH 2/2] remote-curl: let users turn off smart http Jeff King
2012-09-20 17:53               ` Junio C Hamano
2012-09-20 18:12                 ` Jeff King
2012-09-20 18:36                   ` Junio C Hamano
2012-09-20 20:51                     ` Jeff King
2012-09-20 21:15                       ` Junio C Hamano
2012-09-20 21:30                         ` Jeff King
2012-09-21 17:34                           ` Junio C Hamano
2012-09-21 17:41                             ` Jeff King
2012-09-20 17:24     ` [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Jeff King
2012-09-20 23:05       ` Shawn Pearce
2012-09-21  5:26         ` Jeff King
2012-09-21 14:19           ` Shawn Pearce
2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
2012-10-01 21:47               ` Junio C Hamano
2012-10-01 21:53               ` Junio C Hamano
2012-10-01 22:23                 ` Jeff King
2012-10-01 23:20                   ` Junio C Hamano
2012-10-01 22:18               ` Jeff King
2012-10-02  2:38                 ` Shawn Pearce
2012-10-02 13:57                   ` Drew Northup
2012-10-02  0:14               ` Drew Northup
2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
2012-09-20  4:14   ` [PATCH 1/2] Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false Junio C Hamano
2012-09-20  4:14   ` [PATCH 2/2] remote-curl: make dumb-http fallback configurable per URL Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).