rack-devel archive mirror (unofficial) https://groups.google.com/group/rack-devel
 help / color / mirror / Atom feed
* Adding #size to rack.input
@ 2010-02-22 22:46 Ryan Tomayko
  2010-02-22 22:52 ` Hongli Lai
  2010-02-22 23:01 ` Magnus Holm
  0 siblings, 2 replies; 21+ messages in thread
From: Ryan Tomayko @ 2010-02-22 22:46 UTC (permalink / raw)
  To: rack-devel

On Mon, Feb 22, 2010 at 3:14 AM, Hongli Lai wrote:
> Hi Ryan.
>
> Rack.input is already required to be rewindable, meaning the input has
> to be backed by a file or a StringIO at some point. It should be
> trivial for pretty much all Rack web servers to implement #size. Do
> you have any objections against including #size in the spec?
>
> With kind regards,
> Hongli Lai

That's a good point. Any objections to adding #size to the SPEC and
Rack::RewindableInput? I'd be happy to put the patches together.

Thanks,
Ryan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-22 22:46 Adding #size to rack.input Ryan Tomayko
@ 2010-02-22 22:52 ` Hongli Lai
  2010-02-22 22:59   ` Brian Lopez
  2010-02-22 23:01 ` Magnus Holm
  1 sibling, 1 reply; 21+ messages in thread
From: Hongli Lai @ 2010-02-22 22:52 UTC (permalink / raw)
  To: Rack Development

On Feb 22, 11:46 pm, Ryan Tomayko <r...@tomayko.com> wrote:
> On Mon, Feb 22, 2010 at 3:14 AM, Hongli Lai wrote:
> > Hi Ryan.
>
> > Rack.input is already required to be rewindable, meaning the input has
> > to be backed by a file or a StringIO at some point. It should be
> > trivial for pretty much all Rack web servers to implement #size. Do
> > you have any objections against including #size in the spec?
>
> > With kind regards,
> > Hongli Lai
>
> That's a good point. Any objections to adding #size to the SPEC and
> Rack::RewindableInput? I'd be happy to put the patches together.

It should be noted that this request came from Chad Fowler who
encountered the problem in the Gemcutter website. The existing code
already assumes that #size is available, which is the case for e.g.
Thin which exposes a StringIO or Tempfile as rack.input.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-22 22:52 ` Hongli Lai
@ 2010-02-22 22:59   ` Brian Lopez
  2010-02-23  2:30     ` James Tucker
  0 siblings, 1 reply; 21+ messages in thread
From: Brian Lopez @ 2010-02-22 22:59 UTC (permalink / raw)
  To: rack-devel

What does this mean for streaming requests?

-brianmario

On Feb 22, 2010, at 2:52 PM, Hongli Lai wrote:

> On Feb 22, 11:46 pm, Ryan Tomayko <r...@tomayko.com> wrote:
>> On Mon, Feb 22, 2010 at 3:14 AM, Hongli Lai wrote:
>>> Hi Ryan.
>> 
>>> Rack.input is already required to be rewindable, meaning the input has
>>> to be backed by a file or a StringIO at some point. It should be
>>> trivial for pretty much all Rack web servers to implement #size. Do
>>> you have any objections against including #size in the spec?
>> 
>>> With kind regards,
>>> Hongli Lai
>> 
>> That's a good point. Any objections to adding #size to the SPEC and
>> Rack::RewindableInput? I'd be happy to put the patches together.
> 
> It should be noted that this request came from Chad Fowler who
> encountered the problem in the Gemcutter website. The existing code
> already assumes that #size is available, which is the case for e.g.
> Thin which exposes a StringIO or Tempfile as rack.input.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-22 22:46 Adding #size to rack.input Ryan Tomayko
  2010-02-22 22:52 ` Hongli Lai
@ 2010-02-22 23:01 ` Magnus Holm
  2010-02-23 23:29   ` Hongli Lai
  1 sibling, 1 reply; 21+ messages in thread
From: Magnus Holm @ 2010-02-22 23:01 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

Isn't the Content-Length already sent as a HTTP header, or isn't it always
there?
// Magnus Holm


On Mon, Feb 22, 2010 at 23:46, Ryan Tomayko <r@tomayko.com> wrote:

> On Mon, Feb 22, 2010 at 3:14 AM, Hongli Lai wrote:
> > Hi Ryan.
> >
> > Rack.input is already required to be rewindable, meaning the input has
> > to be backed by a file or a StringIO at some point. It should be
> > trivial for pretty much all Rack web servers to implement #size. Do
> > you have any objections against including #size in the spec?
> >
> > With kind regards,
> > Hongli Lai
>
> That's a good point. Any objections to adding #size to the SPEC and
> Rack::RewindableInput? I'd be happy to put the patches together.
>
> Thanks,
> Ryan
>

[-- Attachment #2: Type: text/html, Size: 1089 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-22 22:59   ` Brian Lopez
@ 2010-02-23  2:30     ` James Tucker
  2010-02-23  3:33       ` Ryan Tomayko
  2010-02-23 23:30       ` Hongli Lai
  0 siblings, 2 replies; 21+ messages in thread
From: James Tucker @ 2010-02-23  2:30 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 188 bytes --]


On 22 Feb 2010, at 22:59, Brian Lopez wrote:

> What does this mean for streaming requests?

Block. This change just takes us further from async support, but it seems there's a need.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3679 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-23  2:30     ` James Tucker
@ 2010-02-23  3:33       ` Ryan Tomayko
  2010-02-23 10:29         ` James Tucker
  2010-02-23 23:30       ` Hongli Lai
  1 sibling, 1 reply; 21+ messages in thread
From: Ryan Tomayko @ 2010-02-23  3:33 UTC (permalink / raw)
  To: rack-devel

On Mon, Feb 22, 2010 at 6:30 PM, James Tucker <jftucker@gmail.com> wrote:
>
> On 22 Feb 2010, at 22:59, Brian Lopez wrote:
>
>> What does this mean for streaming requests?
>
> Block. This change just takes us further from async support, but it seems there's a need.

Yeah. I've always hated #rewind to be honest but, as long as we have
it, I can't think of a good argument to not also support #size. Let's
either move away from seekable input as part of the SPEC or accept it
and allow stuff like this in. Also, Rack 2.0 (?!) is probably the
earliest we can drop rewind if we want to.

Thanks,
Ryan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-23  3:33       ` Ryan Tomayko
@ 2010-02-23 10:29         ` James Tucker
  0 siblings, 0 replies; 21+ messages in thread
From: James Tucker @ 2010-02-23 10:29 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 876 bytes --]


On 23 Feb 2010, at 03:33, Ryan Tomayko wrote:

> On Mon, Feb 22, 2010 at 6:30 PM, James Tucker <jftucker@gmail.com> wrote:
>> 
>> On 22 Feb 2010, at 22:59, Brian Lopez wrote:
>> 
>>> What does this mean for streaming requests?
>> 
>> Block. This change just takes us further from async support, but it seems there's a need.
> 
> Yeah. I've always hated #rewind to be honest but, as long as we have
> it, I can't think of a good argument to not also support #size. Let's
> either move away from seekable input as part of the SPEC or accept it
> and allow stuff like this in. Also, Rack 2.0 (?!) is probably the
> earliest we can drop rewind if we want to.

We could also say that rack.input may be either:

1. a string-like, supporting to_s
2. a file-like, supporting IO methods

And allow server authors to make it optional.

> 
> Thanks,
> Ryan


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3679 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-22 23:01 ` Magnus Holm
@ 2010-02-23 23:29   ` Hongli Lai
  0 siblings, 0 replies; 21+ messages in thread
From: Hongli Lai @ 2010-02-23 23:29 UTC (permalink / raw)
  To: Rack Development

On Feb 23, 12:01 am, Magnus Holm <judo...@gmail.com> wrote:
> Isn't the Content-Length already sent as a HTTP header, or isn't it always
> there?

No, HTTP allows chunked file uploads.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-23  2:30     ` James Tucker
  2010-02-23  3:33       ` Ryan Tomayko
@ 2010-02-23 23:30       ` Hongli Lai
  2010-02-24  2:57         ` Yehuda Katz
  1 sibling, 1 reply; 21+ messages in thread
From: Hongli Lai @ 2010-02-23 23:30 UTC (permalink / raw)
  To: Rack Development

On Feb 23, 3:30 am, James Tucker <jftuc...@gmail.com> wrote:
> Block. This change just takes us further from async support, but it seems there's a need.

Does it really break async support? It should be possible to support
async even when #size is supported, async frameworks/apps just musn't
call #size.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-23 23:30       ` Hongli Lai
@ 2010-02-24  2:57         ` Yehuda Katz
  2010-02-24 14:24           ` James Tucker
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Yehuda Katz @ 2010-02-24  2:57 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 861 bytes --]

I've actually been really bugged about the rewindable requirement since it
was introduced.

I like a variant of James' proposal, which would work like this:

env["rack.input"] must either be an IO that, when #read, returns the full
input.

This would mean that if you #read from a non-rewindable input, you should
put a StringIO in place with the read data. If you read a rewindable input,
you should rewind it.

Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325


On Tue, Feb 23, 2010 at 3:30 PM, Hongli Lai <hongli@phusion.nl> wrote:

> On Feb 23, 3:30 am, James Tucker <jftuc...@gmail.com> wrote:
> > Block. This change just takes us further from async support, but it seems
> there's a need.
>
> Does it really break async support? It should be possible to support
> async even when #size is supported, async frameworks/apps just musn't
> call #size.
>

[-- Attachment #2: Type: text/html, Size: 1322 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-24  2:57         ` Yehuda Katz
@ 2010-02-24 14:24           ` James Tucker
  2010-02-26  8:05           ` Charles Oliver Nutter
  2010-02-26 19:53           ` Nick Sieger
  2 siblings, 0 replies; 21+ messages in thread
From: James Tucker @ 2010-02-24 14:24 UTC (permalink / raw)
  To: rack-devel


[-- Attachment #1.1: Type: text/plain, Size: 1384 bytes --]


On 24 Feb 2010, at 02:57, Yehuda Katz wrote:

> I've actually been really bugged about the rewindable requirement since it was introduced.
> 
> I like a variant of James' proposal, which would work like this:
> 
> env["rack.input"] must either be an IO that, when #read, returns the full input.
> 
> This would mean that if you #read from a non-rewindable input, you should put a StringIO in place with the read data. If you read a rewindable input, you should rewind it.

I think this boils down to two rules, with two notes:

1. env["rack.input"] must respond to #read and return the full input

Note: Middleware that consumes rack.input may use #rewind on rack.input if it is available, but:

2. Middleware must always ensure that it restores env['rack.input'] to contain an object that at minimum responds to #read and returns the full input.

Note: StringIO is recommended.



> 
> Yehuda Katz
> Developer | Engine Yard
> (ph) 718.877.1325
> 
> 
> On Tue, Feb 23, 2010 at 3:30 PM, Hongli Lai <hongli@phusion.nl> wrote:
> On Feb 23, 3:30 am, James Tucker <jftuc...@gmail.com> wrote:
> > Block. This change just takes us further from async support, but it seems there's a need.
> 
> Does it really break async support? It should be possible to support
> async even when #size is supported, async frameworks/apps just musn't
> call #size.
> 


[-- Attachment #1.2: Type: text/html, Size: 2170 bytes --]

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3679 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-24  2:57         ` Yehuda Katz
  2010-02-24 14:24           ` James Tucker
@ 2010-02-26  8:05           ` Charles Oliver Nutter
  2010-02-26 10:49             ` Christian Neukirchen
  2010-02-26 19:53           ` Nick Sieger
  2 siblings, 1 reply; 21+ messages in thread
From: Charles Oliver Nutter @ 2010-02-26  8:05 UTC (permalink / raw)
  To: rack-devel

Agreed. The ability to arbitrarily rewind read input at any time
*requires* adding extra buffering and indirection around the actual
stream, which makes Rack less suitable for handling small requests as
fast as possible. In jruby-rack, for example, we could directly funnel
the request's channel straight through the app with basically no extra
wrapping or buffering if not for this requirement. It doesn't seem
worth it to penalize small, fast requests just for the sake of
middleware that could do the right thing on it's own.

On Wednesday, February 24, 2010, Yehuda Katz <wycats@gmail.com> wrote:
> I've actually been really bugged about the rewindable requirement since it was introduced.
> I like a variant of James' proposal, which would work like this:
> env["rack.input"] must either be an IO that, when #read, returns the full input.
>
>
> This would mean that if you #read from a non-rewindable input, you should put a StringIO in place with the read data. If you read a rewindable input, you should rewind it.Yehuda Katz
> Developer | Engine Yard
> (ph) 718.877.1325
>
>
> On Tue, Feb 23, 2010 at 3:30 PM, Hongli Lai <hongli@phusion.nl> wrote:
>
>
> On Feb 23, 3:30 am, James Tucker <jftuc...@gmail.com> wrote:
>> Block. This change just takes us further from async support, but it seems there's a need.
>
> Does it really break async support? It should be possible to support
> async even when #size is supported, async frameworks/apps just musn't
> call #size.
>
>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-26  8:05           ` Charles Oliver Nutter
@ 2010-02-26 10:49             ` Christian Neukirchen
  2010-02-27  0:33               ` Eric Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Neukirchen @ 2010-02-26 10:49 UTC (permalink / raw)
  To: rack-devel

Charles Oliver Nutter <headius@headius.com> writes:

> Agreed. The ability to arbitrarily rewind read input at any time
> *requires* adding extra buffering and indirection around the actual
> stream, which makes Rack less suitable for handling small requests as
> fast as possible. In jruby-rack, for example, we could directly funnel
> the request's channel straight through the app with basically no extra
> wrapping or buffering if not for this requirement. It doesn't seem
> worth it to penalize small, fast requests just for the sake of
> middleware that could do the right thing on it's own.

What are the cases where we need #rewind, anyway?

Parsing the POST parameters twice, what else?

-- 
Christian Neukirchen  <chneukirchen@gmail.com>  http://chneukirchen.org

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-24  2:57         ` Yehuda Katz
  2010-02-24 14:24           ` James Tucker
  2010-02-26  8:05           ` Charles Oliver Nutter
@ 2010-02-26 19:53           ` Nick Sieger
  2010-02-26 21:20             ` Yehuda Katz
  2 siblings, 1 reply; 21+ messages in thread
From: Nick Sieger @ 2010-02-26 19:53 UTC (permalink / raw)
  To: rack-devel

On Tue, Feb 23, 2010 at 8:57 PM, Yehuda Katz <wycats@gmail.com> wrote:
> I've actually been really bugged about the rewindable requirement since it
> was introduced.

I agree, there are environments where it's not possible to provide a
rewindable input (e.g., AppEngine).

> I like a variant of James' proposal, which would work like this:
> env["rack.input"] must either be an IO that, when #read, returns the full
> input.
> This would mean that if you #read from a non-rewindable input, you should
> put a StringIO in place with the read data. If you read a rewindable input,
> you should rewind it.

What about a huge multipart upload? Do you really want to create a
StringIO that holds the whole thing in memory? Shouldn't there be a
size limit or a way to prevent that behavior?

/Nick

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-26 19:53           ` Nick Sieger
@ 2010-02-26 21:20             ` Yehuda Katz
  0 siblings, 0 replies; 21+ messages in thread
From: Yehuda Katz @ 2010-02-26 21:20 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]

I think in that case, from the perspective of those downstream from the
middleware processing the huge upload, there *is no* input.

Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325


On Fri, Feb 26, 2010 at 11:53 AM, Nick Sieger <nicksieger@gmail.com> wrote:

> On Tue, Feb 23, 2010 at 8:57 PM, Yehuda Katz <wycats@gmail.com> wrote:
> > I've actually been really bugged about the rewindable requirement since
> it
> > was introduced.
>
> I agree, there are environments where it's not possible to provide a
> rewindable input (e.g., AppEngine).
>
> > I like a variant of James' proposal, which would work like this:
> > env["rack.input"] must either be an IO that, when #read, returns the full
> > input.
> > This would mean that if you #read from a non-rewindable input, you should
> > put a StringIO in place with the read data. If you read a rewindable
> input,
> > you should rewind it.
>
> What about a huge multipart upload? Do you really want to create a
> StringIO that holds the whole thing in memory? Shouldn't there be a
> size limit or a way to prevent that behavior?
>
> /Nick
>

[-- Attachment #2: Type: text/html, Size: 1592 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-26 10:49             ` Christian Neukirchen
@ 2010-02-27  0:33               ` Eric Wong
  2010-02-27  2:01                 ` Yehuda Katz
  2010-03-08 18:40                 ` Kyle Drake
  0 siblings, 2 replies; 21+ messages in thread
From: Eric Wong @ 2010-02-27  0:33 UTC (permalink / raw)
  To: rack-devel

Christian Neukirchen <chneukirchen@gmail.com> wrote:
> Charles Oliver Nutter <headius@headius.com> writes:
> 
> > Agreed. The ability to arbitrarily rewind read input at any time
> > *requires* adding extra buffering and indirection around the actual
> > stream, which makes Rack less suitable for handling small requests as
> > fast as possible. In jruby-rack, for example, we could directly funnel
> > the request's channel straight through the app with basically no extra
> > wrapping or buffering if not for this requirement. It doesn't seem
> > worth it to penalize small, fast requests just for the sake of
> > middleware that could do the right thing on it's own.

Unicorn/Rainbows!/Zbatery cheat in body-less requests and reuse the same
StringIO object for rack.input.  The only extra overhead is the constant
lookups and hash assignments:

  env[RACK_INPUT] = NULL_IO

> What are the cases where we need #rewind, anyway?
> 
> Parsing the POST parameters twice, what else?

One app I've played with in the past is a retrying reverse proxy
with the ability to replay PUT requests if some backends fail.

Rewindability is a nice-to-have, but it can (and should imho) be
implemented via middleware instead of being a requirement.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-27  0:33               ` Eric Wong
@ 2010-02-27  2:01                 ` Yehuda Katz
  2010-02-28 15:19                   ` Hongli Lai
  2010-03-08 18:40                 ` Kyle Drake
  1 sibling, 1 reply; 21+ messages in thread
From: Yehuda Katz @ 2010-02-27  2:01 UTC (permalink / raw)
  To: rack-devel

[-- Attachment #1: Type: text/plain, Size: 3508 bytes --]

Here's how I think about it:

Given the following middleware chain:

Passenger => LargeFileProcessor => StatsCollector => ParamsParser =>
AppEndpoint

Passenger, the server, is responsible for serving a Rack-compliant request
to LargeFileProcessor, which it perceives to be the endpoint.
LargeFileProcessor notices that the rack.input contains a large file, and
pulls it out, replacing the input with StringIO.new(""). It sends the
request to StatsCollector with a new key ("large_file.location").
StatsCollector wraps its call in a benchmarking block, saving the results
somewhere.

From the perspective of StatsCollector, LargeFileProcessor is the server,
which serves it a valid Rack request. StatsCollector sends the request to
ParamsParser with a new key ("large_file.location"). Again, From the
perspective of ParamsParser, StatsCollector is the server, which serves *it*
a valid Rack request. It parses the params from the Query string, seeing
nothing in the body, and sends the request on.

From the perspective of AppEndpoint, ParamsParser is the server, but with a
slightly modified Rack contract ("params.parsed_params" is a required key
containing the params and "large_file.location" contains a large file, if it
exists). It then returns a response, and each middleware in chain perceives
the response to be coming from "the endpoint", before it eventually makes
its way back to Passenger.

The important thing is that contract-extending middlewares (like
LargeFileProcessor and ParamsParser) do not break transparent middlewares
(like StatsCollector). Contract-extending middlewares are responsible for
serving a request that is still a valid Rack request (but not necessarily
identical to the original HTTP request), but they may provide additional
information for downstreams that participate in the modified contract.

The beauty of Rack is that it leverages dynamism via an arbitrary Hash to
allow contract extensions to be made at will, while retaining a fairly
strict list of requirements for what the resulting Hash (and response tuple)
must look like.

Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325


On Fri, Feb 26, 2010 at 4:33 PM, Eric Wong <normalperson@yhbt.net> wrote:

> Christian Neukirchen <chneukirchen@gmail.com> wrote:
> > Charles Oliver Nutter <headius@headius.com> writes:
> >
> > > Agreed. The ability to arbitrarily rewind read input at any time
> > > *requires* adding extra buffering and indirection around the actual
> > > stream, which makes Rack less suitable for handling small requests as
> > > fast as possible. In jruby-rack, for example, we could directly funnel
> > > the request's channel straight through the app with basically no extra
> > > wrapping or buffering if not for this requirement. It doesn't seem
> > > worth it to penalize small, fast requests just for the sake of
> > > middleware that could do the right thing on it's own.
>
> Unicorn/Rainbows!/Zbatery cheat in body-less requests and reuse the same
> StringIO object for rack.input.  The only extra overhead is the constant
> lookups and hash assignments:
>
>  env[RACK_INPUT] = NULL_IO
>
> > What are the cases where we need #rewind, anyway?
> >
> > Parsing the POST parameters twice, what else?
>
> One app I've played with in the past is a retrying reverse proxy
> with the ability to replay PUT requests if some backends fail.
>
> Rewindability is a nice-to-have, but it can (and should imho) be
> implemented via middleware instead of being a requirement.
>
> --
> Eric Wong
>

[-- Attachment #2: Type: text/html, Size: 4394 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-27  2:01                 ` Yehuda Katz
@ 2010-02-28 15:19                   ` Hongli Lai
  2010-03-01  1:56                     ` Charles Oliver Nutter
  0 siblings, 1 reply; 21+ messages in thread
From: Hongli Lai @ 2010-02-28 15:19 UTC (permalink / raw)
  To: Rack Development

I'm not opposed to removing the rewindability requirement, but I think
we're forgetting here what the rewindability requirement was meant to
solve. Eric Wong mentioned that rewindability should be outside the
Rack spec and should be implemented in a middleware instead. However
this was exactly the situation in Rails before the rewindability
requirement was introduced. Rails had a middleware which consumes the
input into a StringIO and replaces rack.input with that StringIO
(ActionController::RewindableInput). This lead to the following
situation:

- Some developers write input-consuming middlewares and place it
*before* ActionController::RewindableInput. As a result the app layer
does not receive any POST data at all. For a good case study, read
http://code.google.com/p/phusion-passenger/issues/detail?id=220. The
problem did not occur in Mongrel or Thin because they always buffer
their input into a StringIO or Tempfile.
- ActionController::RewindableInput's implementation is quite naive
and will happily slurp 2 GB of rack.input data into memory if there's
that much data.

In theory the rewindability requirement can be removed by agreeing on
a protocol, e.g. by demanding that rack.input-consuming middlewares
wrap rack.input, or by implementing rewindability in middleware. I
agree that this is technically more elegant than placing requirements
on the web server. But I'm not arguing technical elegance here. The
problem with the technically elegant method is that it leaves too much
room for developer mistakes and too easily causes confusion among
developers, such as the Facebooker bug mentioned in the Phusion
Passenger bug report.

Let's count the number of Rack web servers. There are about 4 I think
(Mongrel, Thin, Unicorn, Phusion Passenger); in any case probably less
than 10. Now let's denote the number libraries or apps out there that
use middlewares as x; the value of x is unknown but I'm pretty sure
it's greater than 10. The chance that there's at least one rack.input
consuming bug in x apps/libraries is a lot greater than the chance
that there's at least one rack.input consuming bug in one of the Rack
web servers. It's just easier to fix all the web servers than to fix
every broken app or library. Furthermore streaming can be implemented
even with the rewindability requirement, see Unicorn's TeeInput. This
is why I think having the rewindability requirement in the spec is a
good idea: it makes implementing a Rack web server a bit more
complicated, but it prevents library/app developer mistakes while at
the same time not making streaming impossible.

If the rewindability requirement is to be removed then steps must be
taken to ensure that developers cannot make mistakes and that any
input-wrapping code is efficient. I'm thinking about this:
- rack.input MUST NOT respond to #size, #rewind, #seek, etc. If the
web server exposes those methods then developers will rely on them,
and their code will fail when they switch to a web server that does
provide #size, #rewind, etc. and they might think that the web server
is bugged instead of their own code.
- The Rack library must provide a utility class for wrapping
rack.input into something rewindable, so that people don't naively
slurp all of rack.input into a StringIO. This utility class must be
implemented in the most efficient manner possible and must not hog
memory. The spec must make mention of this class so that developers
don't try to write their own naive implementations.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-28 15:19                   ` Hongli Lai
@ 2010-03-01  1:56                     ` Charles Oliver Nutter
  2010-03-11 23:56                       ` Graham
  0 siblings, 1 reply; 21+ messages in thread
From: Charles Oliver Nutter @ 2010-03-01  1:56 UTC (permalink / raw)
  To: rack-devel

I don't think it should be a requrement for server vendors to
compensate for bugs in libraries. It was arguably a bug in facebooker
that it consumed a stream in a destructive way. I would not have
modified rack to accommodate destructive plugins, since down that path
lies madness. The common cases should not be penalized for the unusual
(or buggy) cases.

I would propose that Rack nor provided a rewindable input by default,
but potentially provided an easy API to flip the input if needed,
either through the "replace IO with something rewindable" approach or
by providing a call to be made before the first destructive read from
the stream. This will allow requests that don't have destructive
middleware to have fastest-possible request processing without making
such middleware substantially more difficult to support.

FYI, you would want the JRuby based servers in that list as well, like
GlassFish server, GlassFish gem, JBoss TorqueBox, jruby-rack (used by
all WAR file deployments), whatever Google App Engine uses now, and
possibly other one-off servers like jetty-rails.

On Sunday, February 28, 2010, Hongli Lai <hongli@phusion.nl> wrote:
> I'm not opposed to removing the rewindability requirement, but I think
> we're forgetting here what the rewindability requirement was meant to
> solve. Eric Wong mentioned that rewindability should be outside the
> Rack spec and should be implemented in a middleware instead. However
> this was exactly the situation in Rails before the rewindability
> requirement was introduced. Rails had a middleware which consumes the
> input into a StringIO and replaces rack.input with that StringIO
> (ActionController::RewindableInput). This lead to the following
> situation:
>
> - Some developers write input-consuming middlewares and place it
> *before* ActionController::RewindableInput. As a result the app layer
> does not receive any POST data at all. For a good case study, read
> http://code.google.com/p/phusion-passenger/issues/detail?id=220. The
> problem did not occur in Mongrel or Thin because they always buffer
> their input into a StringIO or Tempfile.
> - ActionController::RewindableInput's implementation is quite naive
> and will happily slurp 2 GB of rack.input data into memory if there's
> that much data.
>
> In theory the rewindability requirement can be removed by agreeing on
> a protocol, e.g. by demanding that rack.input-consuming middlewares
> wrap rack.input, or by implementing rewindability in middleware. I
> agree that this is technically more elegant than placing requirements
> on the web server. But I'm not arguing technical elegance here. The
> problem with the technically elegant method is that it leaves too much
> room for developer mistakes and too easily causes confusion among
> developers, such as the Facebooker bug mentioned in the Phusion
> Passenger bug report.
>
> Let's count the number of Rack web servers. There are about 4 I think
> (Mongrel, Thin, Unicorn, Phusion Passenger); in any case probably less
> than 10. Now let's denote the number libraries or apps out there that
> use middlewares as x; the value of x is unknown but I'm pretty sure
> it's greater than 10. The chance that there's at least one rack.input
> consuming bug in x apps/libraries is a lot greater than the chance
> that there's at least one rack.input consuming bug in one of the Rack
> web servers. It's just easier to fix all the web servers than to fix
> every broken app or library. Furthermore streaming can be implemented
> even with the rewindability requirement, see Unicorn's TeeInput. This
> is why I think having the rewindability requirement in the spec is a
> good idea: it makes implementing a Rack web server a bit more
> complicated, but it prevents library/app developer mistakes while at
> the same time not making streaming impossible.
>
> If the rewindability requirement is to be removed then steps must be
> taken to ensure that developers cannot make mistakes and that any
> input-wrapping code is efficient. I'm thinking about this:
> - rack.input MUST NOT respond to #size, #rewind, #seek, etc. If the
> web server exposes those methods then developers will rely on them,
> and their code will fail when they switch to a web server that does
> provide #size, #rewind, etc. and they might think that the web server
> is bugged instead of their own code.
> - The Rack library must provide a utility class for wrapping
> rack.input into something rewindable, so that people don't naively
> slurp all of rack.input into a StringIO. This utility class must be
> implemented in the most efficient manner possible and must not hog
> memory. The spec must make mention of this class so that developers
> don't try to write their own naive implementations.
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-02-27  0:33               ` Eric Wong
  2010-02-27  2:01                 ` Yehuda Katz
@ 2010-03-08 18:40                 ` Kyle Drake
  1 sibling, 0 replies; 21+ messages in thread
From: Kyle Drake @ 2010-03-08 18:40 UTC (permalink / raw)
  To: rack-devel

>
>> What are the cases where we need #rewind, anyway?
>>
>> Parsing the POST parameters twice, what else?
>
> One app I've played with in the past is a retrying reverse proxy
> with the ability to replay PUT requests if some backends fail.
>
> Rewindability is a nice-to-have, but it can (and should imho) be
> implemented via middleware instead of being a requirement.
>

Wow, now that you mention it, I can't really think of much either.
That's a neat use of #rewind though.

Even mp3/flv/ect. audio seeking AFAICT works through sending another
GET request with a byte-range:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.1

I'll know by the end of the week, planning on implementing a web app
with seekable content.

-Kyle

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Adding #size to rack.input
  2010-03-01  1:56                     ` Charles Oliver Nutter
@ 2010-03-11 23:56                       ` Graham
  0 siblings, 0 replies; 21+ messages in thread
From: Graham @ 2010-03-11 23:56 UTC (permalink / raw)
  To: Rack Development

I'd like to second Charles' sentiment that the rewindable requirement
is placing the onus on the wrong part of the stack. In the end, what
this means is that every request pays for a feature only a few
requests need to use. Using a hammer to kill a flea.

There will always, obviously, be more apps and middlewares out there
than handlers, but that doesn't mean that changes to the handler
interface should not be looked at critically. Especially when those
changes to the interface are basically bug fixes for middleware.

Graham.

On Feb 28, 6:56 pm, Charles Oliver Nutter <head...@headius.com> wrote:
> I don't think it should be a requrement for server vendors to
> compensate for bugs in libraries. It was arguably a bug in facebooker
> that it consumed a stream in a destructive way. I would not have
> modified rack to accommodate destructive plugins, since down that path
> lies madness. The common cases should not be penalized for the unusual
> (or buggy) cases.
>
> I would propose that Rack nor provided a rewindable input by default,
> but potentially provided an easy API to flip the input if needed,
> either through the "replace IO with something rewindable" approach or
> by providing a call to be made before the first destructive read from
> the stream. This will allow requests that don't have destructive
> middleware to have fastest-possible request processing without making
> such middleware substantially more difficult to support.
>
> FYI, you would want the JRuby based servers in that list as well, like
> GlassFish server, GlassFish gem, JBoss TorqueBox, jruby-rack (used by
> all WAR file deployments), whatever Google App Engine uses now, and
> possibly other one-off servers like jetty-rails.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-03-11 23:56 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-22 22:46 Adding #size to rack.input Ryan Tomayko
2010-02-22 22:52 ` Hongli Lai
2010-02-22 22:59   ` Brian Lopez
2010-02-23  2:30     ` James Tucker
2010-02-23  3:33       ` Ryan Tomayko
2010-02-23 10:29         ` James Tucker
2010-02-23 23:30       ` Hongli Lai
2010-02-24  2:57         ` Yehuda Katz
2010-02-24 14:24           ` James Tucker
2010-02-26  8:05           ` Charles Oliver Nutter
2010-02-26 10:49             ` Christian Neukirchen
2010-02-27  0:33               ` Eric Wong
2010-02-27  2:01                 ` Yehuda Katz
2010-02-28 15:19                   ` Hongli Lai
2010-03-01  1:56                     ` Charles Oliver Nutter
2010-03-11 23:56                       ` Graham
2010-03-08 18:40                 ` Kyle Drake
2010-02-26 19:53           ` Nick Sieger
2010-02-26 21:20             ` Yehuda Katz
2010-02-22 23:01 ` Magnus Holm
2010-02-23 23:29   ` Hongli Lai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).