git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: Dennis Kaarsemaker <dennis@kaarsemaker.net>,
	gitster@pobox.com, git@vger.kernel.org
Subject: [PATCH v2 0/3] fix http deadlock on giant ref negotiations
Date: Wed, 20 May 2015 03:35:26 -0400	[thread overview]
Message-ID: <20150520073526.GA16784@peff.net> (raw)
In-Reply-To: <55563AD5.4090105@linuxfoundation.org>

On Fri, May 15, 2015 at 02:28:37PM -0400, Konstantin Ryabitsev wrote:

> On 15/05/15 02:22 PM, Junio C Hamano wrote:
> > Also, is it worth allocating small and then growing up to the maximum?
> > I think this only relays one request at a time anyway, and I suspect
> > that a single 1MB allocation at the first call kept getting reused
> > may be sufficient (and much simpler).
> 
> Does it make sense to make that configurable via an env variable at all?
> I suspect the vast majority of people would not hit this bug unless they
> are dealing with repositories polluted with hundreds of refs created by
> automation (like the codeaurora chromium repo).
> 
> E.g. can be set via an Apache directive like
> 
> SetEnv FOO_MAX_SIZE 2048
> 
> The backend can then be configured to emit an error message when the
> spool size is exhausted saying "foo exhausted, increase FOO_MAX_SIZE to
> allow for moar foo."

Yeah, that was the same conclusion I came to elsewhere in the thread.
Here's a re-roll:

  [1/3]: http-backend: fix die recursion with custom handler
  [2/3]: t5551: factor out tag creation
  [3/3]: http-backend: spool ref negotiation requests to buffer

It makes the size configurable (either through the environment, which is
convenient for setting via Apache; or through the config, which is
convenient if you have one absurdly-sized repo). It mentions the
variable name when it barfs into the Apache log. I also bumped the
default to 10MB, which I think should be enough to handle even
ridiculous cases.

I also adapted Dennis's test into the third patch. Beware that it's
quite slow to run (and is protected by the "EXPENSIVE" prerequisite).
Patch 2 is new, and just refactors the script to make adding the new
test easier.

I really wanted to add a test like:

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index e2a2fa1..1fff812 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -273,6 +273,16 @@ test_expect_success 'large fetch-pack requests can be split across POSTs' '
 	test_line_count = 2 posts
 '
 
+test_expect_success 'http-backend does not buffer arbitrarily large requests' '
+	test_when_finished "(
+		cd \"$HTTPD_DOCUMENT_ROOT_PATH/repo.git\" &&
+		test_unconfig http.maxrequestbuffer
+	)" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+		config http.maxrequestbuffer 100 &&
+	test_must_fail git clone $HTTPD_URL/smart/repo.git foo.git
+'
+
 test_expect_success EXPENSIVE 'http can handle enormous ref negotiation' '
 	(
 		cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&

to test that the maxRequestBuffer code does indeed work. Unfortunately,
even though the server behaves reasonably in this case, the client ends
up hanging forever. I'm not sure there is a simple solution to that; I
think it is a protocol issue where remote-http is waiting for fetch-pack
to speak, but fetch-pack is waiting for more data from the remote.

Personally, I think I'd be much more interested in pursuing a saner,
full duplex http solution like git-over-websockets than trying to iron
out all of the corner cases in the stateless-rpc protocol.

-Peff

  reply	other threads:[~2015-05-20  7:35 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-13 21:04 Clone hangs when done over http with --reference Konstantin Ryabitsev
2015-05-14  0:47 ` Jeff King
2015-05-14  1:02   ` [PATCH] http-backend: fix die recursion with custom handler Jeff King
2015-05-15  6:20     ` Jeff King
2015-05-14 17:08   ` Clone hangs when done over http with --reference Konstantin Ryabitsev
2015-05-14 19:52     ` Jeff King
2015-05-15  6:29   ` [PATCH 0/2] fix http deadlock on giant ref negotiations Jeff King
2015-05-15  6:29     ` [PATCH 1/2] http-backend: fix die recursion with custom handler Jeff King
2015-05-15  6:33     ` [PATCH 2/2] http-backend: spool ref negotiation requests to buffer Jeff King
2015-05-15 18:22       ` Junio C Hamano
2015-05-15 18:28         ` Konstantin Ryabitsev
2015-05-20  7:35           ` Jeff King [this message]
2015-05-20  7:36             ` [PATCH v2 1/3] http-backend: fix die recursion with custom handler Jeff King
2015-05-20  7:36             ` [PATCH v2 2/3] t5551: factor out tag creation Jeff King
2015-05-20  7:37             ` [PATCH v2 3/3] http-backend: spool ref negotiation requests to buffer Jeff King
2015-05-26  2:07               ` Konstantin Ryabitsev
2015-05-26  2:24                 ` Jeff King
2015-05-26  3:43                   ` Junio C Hamano
2015-05-15 19:16         ` [PATCH 2/2] " Jeff King
2015-05-15  7:41     ` [PATCH 0/2] fix http deadlock on giant ref negotiations Dennis Kaarsemaker
2015-05-15  8:38       ` Jeff King
2015-05-15  8:44         ` Dennis Kaarsemaker
2015-05-15  8:53           ` Jeff King
2015-05-15  9:11             ` Dennis Kaarsemaker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150520073526.GA16784@peff.net \
    --to=peff@peff.net \
    --cc=dennis@kaarsemaker.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=konstantin@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).