git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] http-backend: spool ref negotiation requests to buffer
Date: Fri, 15 May 2015 15:16:46 -0400
Message-ID: <20150515191646.GA29934@peff.net>
In-Reply-To: <xmqqegmhhf9p.fsf@gitster.dls.corp.google.com>

On Fri, May 15, 2015 at 11:22:42AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The solution is fairly straight-forward: we read the request
> > body into an in-memory buffer in http-backend, freeing up
> > Apache, and then feed the data ourselves to upload-pack. But
> > there are a few important things to note:
> >
> >   1. We limit in-memory buffer to no larger than 1 megabyte
> >      to prevent an obvious denial-of-service attack. This
> >      is a new hard limit on requests, but it's likely that
> >      requests of this size didn't work before at all (i.e.,
> >      they would have run into the pipe buffer thing and
> >      deadlocked).

So this 1MB limit is clearly a problem, and the reasoning above is not
right. The case we are helping is when a large amount of input creates a
large amount of output. But we're _hurting_ the case where there's just
a large amount of input (as shown by Dennis's test case).

What do we want to do about that? We can switch to streaming after
hitting our limit (so opening the opportunity for deadlock again in some
cases, but making sure we do no harm to cases that currently work). Or
we can just bump the input size and say "you'd be crazy to send more
than 10MB" (or 50, or whatever). We could make a configuration knob,
too, I guess.
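
To make the first option concrete, here is a minimal sketch of "buffer
up to a limit, then tell the caller to fall back to streaming". It is
not the code from the patch: the name spool_request(), its return
convention, and the 8192-byte starting size are all assumptions made up
for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): read from fd into a heap
 * buffer until EOF, or until `limit` bytes have been buffered.
 *
 * Returns  0 if EOF was seen; the whole request is in *out.
 *          1 if `limit` bytes were buffered without seeing EOF; the
 *            caller should send *out onward and stream whatever is left.
 *         -1 on read or allocation failure (*out is NULL, *len is 0).
 *
 * On success the caller owns *out and must free() it.
 */
int spool_request(int fd, size_t limit, char **out, size_t *len)
{
	size_t alloc = 0;
	char *buf = NULL;

	*out = NULL;
	*len = 0;
	while (*len < limit) {
		ssize_t n;

		if (*len == alloc) {
			/* grow geometrically, but never past the limit */
			size_t want = alloc ? alloc * 2 : 8192;
			char *tmp;

			if (want > limit)
				want = limit;
			tmp = realloc(buf, want);
			if (!tmp)
				goto fail;
			buf = tmp;
			alloc = want;
		}

		n = read(fd, buf + *len, alloc - *len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		if (n == 0) {
			*out = buf;
			return 0;	/* EOF: everything fit */
		}
		*len += (size_t)n;
	}

	*out = buf;
	return 1;	/* limit reached before EOF was seen */

fail:
	free(buf);
	*len = 0;
	return -1;
}
```

With something like this, a return of 1 is where the deadlock risk
comes back: the caller has to go back to relaying the rest of the input
while the child may already be producing output. A return of 0 is the
safe, fully spooled case.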

> One unrelated thing I noticed was that three codepaths independently
> have close(0) in run_service() now, and made me follow the two
> helper functions to see they both do the close at the end.  It might
> have made the flow easier to follow if run_service() were
> 
>     ...
>     close(1);
>     if (gzip)
>         inflate();
>     else if (buffer)
>         copy();
>     close(0);
>     ...
> 
> But that is minor.

I don't see the close(0) in the other (buffered) code paths. We close
the _output_ to the child, but of course we have to do that to tell it
we're done sending (I actually forgot it in an earlier version of
copy_request(), and things hung :) ).
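
As a rough illustration of why that close matters, here is a sketch of
feeding a spooled buffer to a child over a pipe. The helper name
feed_child() is hypothetical and this is not the copy_request() from
the patch. The point is only that the reader on the other end sees EOF
once the write end is closed; without the close it would keep waiting.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write all of buf to fd,
 * then close fd so the reader on the other end of the pipe sees EOF.
 *
 * Returns 0 on success, -1 if a write or the close failed. The
 * descriptor is closed in either case.
 */
int feed_child(int fd, const char *buf, size_t len)
{
	int ret = 0;

	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			ret = -1;
			break;
		}
		buf += n;
		len -= (size_t)n;
	}

	/* Closing is what tells the child that we are done sending. */
	if (close(fd) < 0)
		ret = -1;
	return ret;
}
```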

I don't think there's any need to close(0) in the buffered cases. We
read until EOF in the copy() case. For gzip, we read until the end of
the gzipped data. I guess it would be better to close if we're not
expecting more input, as otherwise Apache might block trying to write to
us if the client sends bogus input (i.e., a zlib stream with more cruft
at the end).

> Also, is it worth allocating small and then growing up to the maximum?
> I think this only relays one request at a time anyway, and I suspect
> that a single 1MB allocation at the first call kept getting reused
> may be sufficient (and much simpler).

My initial attempt did exactly that, but I had a much smaller buffer. I
started to get worried around 1MB. If we bump it to 10MB (or make it
configurable), I get more so. I dunno. It is not _that_ much memory, but
it is per-request we are serving, so it might add up on a busy server.
OTOH, pack-objects thinks nothing of allocating 800MB just for the
book-keeping to serve a clone of torvalds/linux.

-Peff
