git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Shawn Pearce <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0
Date: Thu, 20 Sep 2012 13:24:08 -0400	[thread overview]
Message-ID: <20120920172408.GC18655@sigill.intra.peff.net> (raw)
In-Reply-To: <CAJo=hJvXtSBO3QEzhZCFfhk9OF_e0B10k8tjCUWMHZvGKt599Q@mail.gmail.com>

On Wed, Sep 19, 2012 at 10:57:35PM -0700, Shawn O. Pearce wrote:

> > I have been looking into this recently, as well. GitHub does not allow
> > dumb http at all these days,
> 
> Interesting that GitHub doesn't support dumb transfer either.

Our objects are still in regular repos, and is served by fairly stock
git. The main show-stopper is that we share objects via alternates, and
we aggressively repack the alternates repos. So a dumb client would end
up being quite inefficient.

We also toyed with having mixed public/private fork networks at one
point, which would obviously necessitate disabling dumb access. But we
gave it up as too risky, as you can do all sorts of weird tricks to
convince git to disclose information about what's in the private repos
(e.g., you can reliably find out which sha1s exist, and you can lie
about which sha1s you have to ask git to create deltas against them).

Doing a shared-object store right would require marking which repo
"knows" which objects (I assume that's what Google Code does, but I
don't remember if Dave talked about that aspect at last year's
GitTogether).

> > so transient errors on the initial smart
> > contact can cause us to fall back to dumb,
> 
> Transient errors is actually what is leading me down this path. We see
> about 0.0455% of our requests to the Android hosting service as these
> dumb fallbacks. This means the client never got a proper smart service
> reply. Server logs suggest we sent a valid response, so I am
> suspecting transient proxy errors, but its hard to debug because
> clients discard the error.

Yup, we see the same thing. I've tracked a few down manually to actual
things like gateway timeouts on our reverse proxy.

> > and end up reporting a
> > totally useless 403 Forbidden error.
> 
> Today I posted a change to JGit [1] to make this 406 Not Acceptable as
> I think it better matches the description of the failure. It also no
> longer reuses the common 403 Forbidden error that a server might
> choose to return if a request was given with invalid credentials.

That might be worth doing for git-http-backend, too. It might even make
sense for the git client to recognize the 406 (and possibly the 403) and
print a more useful message.

> >   1. If both smart and dumb requests fail, report the error for the
> >      smart request. Now that smart-http clients are common, I'd expect
> >      most http servers to be smart these days. Of course I don't have
> >      any sort of numbers to back this up (nor am I sure how to get them;
> >      obviously big sites like GitHub and Google Code do a lot of
> >      traffic, but who knows how many one-off repo-on-a-generic-web-host
> >      sites still exist?).
> 
> I suspect there are still a number of servers that rely on dumb
> protocol to host repositories because its very simple to setup.
> Breaking support for this wouldn't be a good idea.

I don't think it would break on most servers, though. Even for a dumb
server, the initial error will be a useful one. It's only the
weirdly-configured ones where you get wildly different results depending
on whether the query string is there.

In other words, it is really no worse than reverting 703e6e76, and it
might be better. In the common case, you get a better error message. In
the broken-server case, we still try the fallback. So it will keep
working on a broken server without any manual intervention, whereas
reverting and adding a manual escape hatch means the user has to do
something.

BUT.

I still think it's better to revert 703e6e76, because I really do think
the broken-server case is an extreme enough minority that it is not even
worth wasting the time of clients to make a second request (especially
because the first request may very well have failed because of a network
error that causes a long timeout, and the user then has to wait double
to find out what the error is).

Of course, I have no actual data aside from reading the original thread
that led to 703e6e76, and the fact that nobody else mentioned it (not
during the time when it was broken, and not even after, when people
often still complain because they haven't upgraded yet). But who knows?
Maybe I will eat my words, and we will end up getting that data in the
form of complaints. :)

We can always switch to fallback-but-prefer-the-initial-error then. And
we'll have more data on exactly how the misconfigured servers behave.

-Peff

  parent reply	other threads:[~2012-09-20 17:24 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-20  2:55 [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn O. Pearce
2012-09-20  3:22 ` Shawn Pearce
2012-09-20  3:52   ` Jeff King
2012-09-20  3:48 ` Jeff King
2012-09-20  5:57   ` Shawn Pearce
2012-09-20  5:58     ` [PATCH] Revert "retry request without query when info/refs?query fails" Shawn O. Pearce
2012-09-20  6:29       ` Junio C Hamano
2012-09-20  6:31         ` Junio C Hamano
2012-09-20 16:24         ` Jeff King
2012-09-20 16:59           ` [PATCH 0/2] smart http toggle switch fails" Jeff King
2012-09-20 17:00             ` [PATCH 1/2] remote-curl: rename is_http variable Jeff King
2012-09-20 17:05             ` [PATCH 2/2] remote-curl: let users turn off smart http Jeff King
2012-09-20 17:53               ` Junio C Hamano
2012-09-20 18:12                 ` Jeff King
2012-09-20 18:36                   ` Junio C Hamano
2012-09-20 20:51                     ` Jeff King
2012-09-20 21:15                       ` Junio C Hamano
2012-09-20 21:30                         ` Jeff King
2012-09-21 17:34                           ` Junio C Hamano
2012-09-21 17:41                             ` Jeff King
2012-09-20 17:24     ` Jeff King [this message]
2012-09-20 23:05       ` [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Shawn Pearce
2012-09-21  5:26         ` Jeff King
2012-09-21 14:19           ` Shawn Pearce
2012-10-01 21:23             ` [PATCH] Retry HTTP requests on SSL connect failures Shawn O. Pearce
2012-10-01 21:47               ` Junio C Hamano
2012-10-01 21:53               ` Junio C Hamano
2012-10-01 22:23                 ` Jeff King
2012-10-01 23:20                   ` Junio C Hamano
2012-10-01 22:18               ` Jeff King
2012-10-02  2:38                 ` Shawn Pearce
2012-10-02 13:57                   ` Drew Northup
2012-10-02  0:14               ` Drew Northup
2012-09-20  4:14 ` Re* [PATCH] Disable dumb HTTP fallback with GIT_CURL_FALLBACK=0 Junio C Hamano
2012-09-20  4:14   ` [PATCH 1/2] Disable dumb HTTP fallback with GIT_DUMB_HTTP_FALLBACK=false Junio C Hamano
2012-09-20  4:14   ` [PATCH 2/2] remote-curl: make dumb-http fallback configurable per URL Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120920172408.GC18655@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).