rack-devel archive mirror (unofficial) https://groups.google.com/group/rack-devel
From: James Tucker <jftucker@gmail.com>
To: rack-devel@googlegroups.com
Subject: Re: [ANN/RFC] LMGTWTY - Web Sockets for Rack+Rainbows!
Date: Thu, 17 Dec 2009 03:23:59 +0000
Message-ID: <5571C7DA-A701-4775-9F56-84CE1E9EBD30@gmail.com>
In-Reply-To: <20091216221427.GA21033@dcvr.yhbt.net>


On 16 Dec 2009, at 22:14, Eric Wong wrote:

> James Tucker <jftucker@gmail.com> wrote:
>> On 15 Dec 2009, at 21:32, Eric Wong wrote:
>>>> On 15 Dec 2009, at 04:37, Eric Wong wrote:
>>>>> James Tucker <jftucker@gmail.com> wrote:
>>>>>> On 14 Dec 2009, at 18:42, Eric Wong wrote:
>>>>>>> James Tucker <jftucker@gmail.com> wrote:
>>>>>>>> I really want to work out an abstraction away from IO instances, #read
>>>>>>>> and #write for this stuff. It's highly coupled, getting in the way of
>>>>>>>> tests, and heavy lifting environments. I have big plans for Rack 2.0
>>>>>>>> to remove all IO that has not been properly abstracted and decoupled
>>>>>>>> from implementation details, but that's a long way off, mostly due to
>>>>>>>> lack of time and incentive. In the meantime, I can implore you all to
>>>>>>>> take steps in the right direction :-)
>>>>>>> 
>>>>>>> Huh?  I don't see what the problem with IO instances/semantics is,
>>>>>>> especially with the availability of StringIO for testing.  "rack.input"
>>>>>>> is clearly specified and works fine as-is IMHO, though the rewindability
>>>>>>> requirement does add some unnecessary overhead.
>>>>>> 
>>>>>> I disagree with "works fine". It does not work fine with Thin, in
>>>>>> fact, on the contrary, it forces some real ugliness into the system. I
>>>>>> also think that StringIO is a really unfortunate thing to have to
>>>>>> resort to, as it has so many shortcomings.
>>>>> 
>>>>> Mind elaborating?  I do not see what's ugly about it[1].
>>> 
>>>> Sure. It forces authors into several (bad) API options for handling
>>>> complex IO scenarios.
>>>> 
>>>> 1. Buffer the entire request and then republish to the application
>>>> 2. Pass forward the raw IO
>>> 
>>> 3. stream input with a rewindable backing store like TeeInput:
>>>  http://unicorn.bogomips.org/Unicorn/TeeInput.html
>>>  This is already supported by several (but not all) Rainbows!
>>>  concurrency models: http://rainbows.rubyforge.org/Summary.html
>>> 
>>> I'll need to use Fibers to get this working with Rev and EventMachine,
>>> (the FiberSpawn/FiberPool models are select-based), but exposing an
>>> outwardly synchronous model is far easier for application developers to
>>> grok.
>> 
>> That's API dependent. And exactly my point, you're forcing them to use
>> stack based concurrency models. Doing this means one can never escape
>> the weight of stack storage, and moreover, stack wrangling will never
>> be as fast as object level state containers. This is not a fact that
>> can be escaped.
> 
> Everything is API dependent.  TeeInput just conforms to what
> Rack::Lint::InputWrapper allows.

Quite right, and the point I'm trying to make is simply that there are APIs that can work with all concurrency models. Moreover, it is trivial to turn an asynchronous API into a synchronous one; by contrast, it is almost impossible to do the reverse. This is why my planned changes to Rack are so far from becoming a reality at this time: I have to rewrite the entire stack. By contrast, supporting Rack 1.0 protocol apps from the vapour 2.0 API, whatever it turns out to be, is absolutely trivial.
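
To illustrate the "async to sync is trivial" direction, here is a minimal sketch. AsyncReader and SyncReader are hypothetical names invented for this example and are not part of Rack, Thin or Rainbows!; the background thread merely stands in for whatever event source a real server would use.

```ruby
require "thread" # Queue (a no-op require on modern Rubies)

# Hypothetical asynchronous API: returns immediately and delivers
# chunks later through a callback, followed by nil at end of input.
class AsyncReader
  def initialize(chunks)
    @chunks = chunks
  end

  def each_chunk(&callback)
    Thread.new do
      @chunks.each { |chunk| callback.call(chunk) }
      callback.call(nil) # end-of-input marker
    end
  end
end

# Synchronous facade over the asynchronous API: the caller simply
# blocks on a queue until the end-of-input marker arrives.
class SyncReader
  def initialize(async)
    @async = async
  end

  def read
    queue = Queue.new
    @async.each_chunk { |chunk| queue << chunk }
    body = +""
    while (chunk = queue.pop) # pop blocks; nil ends the loop
      body << chunk
    end
    body
  end
end

puts SyncReader.new(AsyncReader.new(["hello ", "world"])).read
```

Going the other way, from a blocking #read to a callback API, would need a thread or fiber per request to hold the blocked stack, which is the cost being argued about in this thread.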

> 
> Of course stack-based concurrency models will always be slower on the
> machine than state machine + event-based models.
> 
> Ruby will also always be slower than a lot of other
> less-pleasant-to-write languages out there.  It's a tradeoff we make as
> Rubyists.  I believe most developers find things like
> fibers/actors/threads easier to work with/maintain than
> events+callbacks, so Sunshowers got support for linear programming
> models first.

Agreed. I'm not saying "must", but I would like to request that if you're going to try to support asynchronous models, please do it for real, rather than hacking it in the way we're having to in rack / thin / flow / etc. The more hacks we end up with, the more people get the impression that this is "the right way". As it is, the lack of separation between the protocol and app layers in, for example, EventMachine applications is appalling, and something from which the community may never recover. The same kind of history is about to repeat itself in the async + long-running cycle web areas, if it hasn't already gone past the point of no return.

Of course, if it seems like too much work, please do just ignore my request. Maybe I'm an ideologist, or maybe I need to formalise my thoughts on paper before they'll really have an impact, but I have experience that tells me that async APIs needn't be ugly, or hard, or slower to write. Indeed, JavaScript and C programmers use them all the time, unwittingly. One part of the failure in Ruby is the implicit temptation to "solve" these "problems" by just "slapping in a few blocks" or the like. Proper OO design does a much better job, and of late the favoured patterns are most definitely channels, queues, and closure-less callbacks, but most of this stuff is in real use in commercial code, as there is no high-scale, high-efficiency open source application server in Ruby at this time (that I know of). I do not believe this is due to language limitations, but simply to a lack of demand and developer time.

> With a env["rack.io"] that exposes an IO-ish object, then nothing
> prevents developers from using EM.attach or Rev::IO.new on it, either.

Depending, of course, on whether or not that's even a useful thing to do...

request -> app -> attach | release is required here, otherwise the attach was useless

Contrary to a sync approach, this now means we're back in hack-town, pushing our state somewhere other than the stack and then releasing the reactor (in a reactor pattern) so that any more IO can actually complete.

Of course, if we passed a channel, with a correctly restricted API, all these issues go away. It's down to the server to schedule IO (big tick here), and the app just consumes a callback based API in a standard manner.
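
As a rough sketch of what such a channel might look like: Channel, UploadCounter and the on_data / on_close method names below are all hypothetical, invented for illustration, and are not an existing Rack, Thin or Rainbows! API. The listener is a plain object rather than a block, in the spirit of closure-less callbacks.

```ruby
# Hypothetical channel with a deliberately restricted API.
# The app may only subscribe; only the server pushes data or closes.
class Channel
  def initialize
    @listener = nil
  end

  # App-facing: register an object responding to on_data and on_close.
  def subscribe(listener)
    @listener = listener
    self
  end

  # Server-facing: called whenever the server decides IO is ready.
  def push(chunk)
    @listener.on_data(chunk) if @listener
  end

  def close
    @listener.on_close if @listener
  end
end

# The app keeps its state in an object, not on a blocked stack.
class UploadCounter
  attr_reader :bytes, :done

  def initialize
    @bytes = 0
    @done = false
  end

  def call(channel)
    channel.subscribe(self) # returns immediately; nothing is read here
  end

  def on_data(chunk)
    @bytes += chunk.bytesize
  end

  def on_close
    @done = true
  end
end

channel = Channel.new
app = UploadCounter.new
app.call(channel)

# The server schedules IO however it likes (reactor tick, thread, fiber).
["abc", "de"].each { |chunk| channel.push(chunk) }
channel.close
puts "#{app.bytes} bytes, done=#{app.done}"
```

Because the app never touches the socket, the same app object could in principle be driven by any concurrency model the server chooses.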

Synchronous apps can simply drain the channel in a synchronous (busy wait) manner, or have their callbacks implicitly and immediately scheduled. Implicit and immediate scheduling of callbacks is the simplest common approach, leaving only a call that might be considered slightly excess in a "pure synchronous" environment, but really ends up being akin to #read and such.
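
A minimal sketch of the "implicit and immediate scheduling" case, assuming a purely synchronous server. ImmediateChannel and BodyCollector are hypothetical names made up for this example, and the StringIO stands in for a client socket.

```ruby
require "stringio"

# Hypothetical channel for a synchronous server: subscribing drains
# the underlying IO straight into the listener before returning.
class ImmediateChannel
  CHUNK_SIZE = 4 # tiny on purpose, so several callbacks fire

  def initialize(io)
    @io = io
  end

  def subscribe(listener)
    # IO#read(length) returns nil at end of input, ending the loop.
    while (chunk = @io.read(CHUNK_SIZE))
      listener.on_data(chunk)
    end
    listener.on_close
    self
  end
end

# Listener that collects the whole body, like a plain #read would.
class BodyCollector
  attr_reader :body

  def initialize
    @body = +""
    @closed = false
  end

  def on_data(chunk)
    @body << chunk
  end

  def on_close
    @closed = true
  end

  def closed?
    @closed
  end
end

collector = BodyCollector.new
ImmediateChannel.new(StringIO.new("hello world")).subscribe(collector)
puts collector.body
```

By the time subscribe returns, every callback has already run, so from the app's point of view the one extra call behaves much like a blocking #read.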

It is only through proper abstraction that you can make it easy to introduce or maintain concepts such as zero copy, wait-free blocking, demand-based scheduling, and so on and so forth. I do realise that examples of more complex scheduling and IO patterns are for the most part limited to kernels, academia and telecoms at present, but, funnily enough, I think this has a lot to do with the restrictive APIs that more common patterns expose. I've mentioned already that it's nearly impossible to retrofit these things.

> If people like that, I'd be happy to include it EM and Rev-based
> concurrency models for Rainbows! and add support for it in Sunshowers.

Like I say, I could just be too much of an efficiency ideologist. If you don't have the demand / desire for it, then you may not have a good reason to put the effort in. Certainly there's nothing else in the middle-ground space that you're filling presently, so I don't think you're "losing a significant market" by not doing so. As it is, I know of a couple of handfuls of apps either in production or close to production doing relatively high load async web work in Ruby, and those folks seem to have done alright with the nasty hack that is the current async.callback and deferrable body APIs. Bigger deterrents to this area actually exist outside of the web server domain itself, as the 1.2+ Thin API works just fine. By contrast, there's no ORM or process framework for doing async work well, and the closest thing there really is to a "nice" async API in Ruby open source at present is the Sequel monkey patches that Aman published, iirc, with em-mysql.

P.S. It's late, so I apologise if this is too wordy/vague.
