rack-devel archive mirror (unofficial) https://groups.google.com/group/rack-devel
 help / color / mirror / Atom feed
* [PATCH 0/3] support reuse of HeaderHash objects
@ 2009-09-04 21:41 Eric Wong
  2009-09-04 21:41 ` [PATCH 1/3] HeaderHash.new avoids unnecessary object creation Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong @ 2009-09-04 21:41 UTC (permalink / raw)
  To: rack-devel


HeaderHash objects are needlessly created and discarded by various
pieces of Rack middleware, leading to unnecessary overhead
from both object creation + GC and iteration overhead via
headers#each.

Enable and promote the reuse of them by making HeaderHash.new
idempotent iff a HeaderHash object is passed to the class method.

These changes resulted in a 20-25% increase in performance
with the following simple config.ru:

------------- config.ru ---------------
use Rack::ContentLength
use Rack::Deflater
use Rack::ContentType
use Rack::ShowExceptions
run Rack::Lobster.new
---------------- 8< -------------------

Responses with more headers (especially those with multiple Set-Cookie
headers) are expected to benefit even more.  This adds a small amount of
overhead in that it adds a conditional to HeaderHash.new, so
applications that only use HeaderHash.new once in the entire request
cycle will suffer slightly.  However, the majority of applications
should benefit if they use more than one middleware.

Independent benchmarks on real applications are very much encouraged,
thanks.

Lighthouse ticket:
  http://rack.lighthouseapp.com/projects/22435-rack/tickets/74

Pull URL:
  git://git.bogomips.org/rack.git

cgit browser:
  http://git.bogomips.org/cgit/rack.git

Eric Wong (3):
      HeaderHash.new avoids unnecessary object creation
      avoid HeaderHash#to_hash in middlewares
      CommonLogger uses HeaderHash to lookup Content-Length

 lib/rack/chunked.rb      |    4 ++--
 lib/rack/commonlogger.rb |    9 +++------
 lib/rack/content_type.rb |    2 +-
 lib/rack/response.rb     |    4 ++--
 lib/rack/utils.rb        |    4 ++++
 test/spec_rack_utils.rb  |    8 ++++++++
 6 files changed, 20 insertions(+), 11 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] HeaderHash.new avoids unnecessary object creation
  2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
@ 2009-09-04 21:41 ` Eric Wong
  2009-09-04 21:41 ` [PATCH 2/3] avoid HeaderHash#to_hash in middlewares Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2009-09-04 21:41 UTC (permalink / raw)
  To: rack-devel; +Cc: Eric Wong


Creating a new HeaderHash is an O(n) operation in addition to
the cost of allocating a new object.  When using multiple pieces
of middleware, this can lead to unnecessary memory allocation
and iteration overhead.   We now explicitly define the
HeaderHash.new class method to return its original argument if
it is already a HeaderHash to avoid repeating work.
---
 lib/rack/utils.rb       |    4 ++++
 test/spec_rack_utils.rb |    8 ++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/lib/rack/utils.rb b/lib/rack/utils.rb
index 74303ef..21b58c9 100644
--- a/lib/rack/utils.rb
+++ b/lib/rack/utils.rb
@@ -258,6 +258,10 @@ module Rack
     # A case-insensitive Hash that preserves the original case of a
     # header when set.
     class HeaderHash < Hash
+      def self.new(hash={})
+        HeaderHash === hash ? hash : super(hash)
+      end
+
       def initialize(hash={})
         super()
         @names = {}
diff --git a/test/spec_rack_utils.rb b/test/spec_rack_utils.rb
index c397429..3586f8c 100644
--- a/test/spec_rack_utils.rb
+++ b/test/spec_rack_utils.rb
@@ -268,6 +268,14 @@ context "Rack::Utils::HeaderHash" do
     h = Rack::Utils::HeaderHash.new("foo" => "bar")
     h.delete("Hello").should.be.nil
   end
+
+  specify "should avoid unnecessary object creation if possible" do
+    a = Rack::Utils::HeaderHash.new("foo" => "bar")
+    b = Rack::Utils::HeaderHash.new(a)
+    b.object_id.should.equal(a.object_id)
+    b.should.equal(a)
+  end
+
 end
 
 context "Rack::Utils::Context" do
-- 
1.6.4.2.236.gf324c

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] avoid HeaderHash#to_hash in middlewares
  2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
  2009-09-04 21:41 ` [PATCH 1/3] HeaderHash.new avoids unnecessary object creation Eric Wong
@ 2009-09-04 21:41 ` Eric Wong
  2009-09-04 21:41 ` [PATCH 3/3] CommonLogger uses HeaderHash to lookup Content-Length Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2009-09-04 21:41 UTC (permalink / raw)
  To: rack-devel; +Cc: Eric Wong


Since HeaderHash objects are valid header responses, avoid
converting the headers to Hash objects only to have it
reconverted back to HeaderHash in the next middleware.
---
 lib/rack/chunked.rb      |    4 ++--
 lib/rack/content_type.rb |    2 +-
 lib/rack/response.rb     |    4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/rack/chunked.rb b/lib/rack/chunked.rb
index 280d89d..dddf969 100644
--- a/lib/rack/chunked.rb
+++ b/lib/rack/chunked.rb
@@ -19,7 +19,7 @@ module Rack
          STATUS_WITH_NO_ENTITY_BODY.include?(status) ||
          headers['Content-Length'] ||
          headers['Transfer-Encoding']
-        [status, headers.to_hash, body]
+        [status, headers, body]
       else
         dup.chunk(status, headers, body)
       end
@@ -29,7 +29,7 @@ module Rack
       @body = body
       headers.delete('Content-Length')
       headers['Transfer-Encoding'] = 'chunked'
-      [status, headers.to_hash, self]
+      [status, headers, self]
     end
 
     def each
diff --git a/lib/rack/content_type.rb b/lib/rack/content_type.rb
index 0c1e1ca..874c28c 100644
--- a/lib/rack/content_type.rb
+++ b/lib/rack/content_type.rb
@@ -17,7 +17,7 @@ module Rack
       status, headers, body = @app.call(env)
       headers = Utils::HeaderHash.new(headers)
       headers['Content-Type'] ||= @content_type
-      [status, headers.to_hash, body]
+      [status, headers, body]
     end
   end
 end
diff --git a/lib/rack/response.rb b/lib/rack/response.rb
index d1f6a12..1932c82 100644
--- a/lib/rack/response.rb
+++ b/lib/rack/response.rb
@@ -71,9 +71,9 @@ module Rack
 
       if [204, 304].include?(status.to_i)
         header.delete "Content-Type"
-        [status.to_i, header.to_hash, []]
+        [status.to_i, header, []]
       else
-        [status.to_i, header.to_hash, self]
+        [status.to_i, header, self]
       end
     end
     alias to_a finish           # For *response
-- 
1.6.4.2.236.gf324c

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] CommonLogger uses HeaderHash to lookup Content-Length
  2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
  2009-09-04 21:41 ` [PATCH 1/3] HeaderHash.new avoids unnecessary object creation Eric Wong
  2009-09-04 21:41 ` [PATCH 2/3] avoid HeaderHash#to_hash in middlewares Eric Wong
@ 2009-09-04 21:41 ` Eric Wong
  2009-09-28 10:14 ` [PATCH 4/3] HeaderHash#each yields Lint-OK multivalue headers Eric Wong
  2010-01-05 23:58 ` [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2009-09-04 21:41 UTC (permalink / raw)
  To: rack-devel; +Cc: Eric Wong


Since HeaderHash is cheaper to use now, encourage its usage
instead of reinventing a way to lookup header values with
an enforced O(n) overhead.

Under best conditions, this can now be done in O(1) time if the
rest of our middleware stack already uses (and passes)
HeaderHash.  This does make things slower if CommonLogger is the
only middleware in the stack, however that's probably not too
common.
---
 lib/rack/commonlogger.rb |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/rack/commonlogger.rb b/lib/rack/commonlogger.rb
index 880f0fb..1edc9b8 100644
--- a/lib/rack/commonlogger.rb
+++ b/lib/rack/commonlogger.rb
@@ -16,6 +16,7 @@ module Rack
     def call(env)
       began_at = Time.now
       status, header, body = @app.call(env)
+      header = Utils::HeaderHash.new(header)
       log(env, status, header, began_at)
       [status, header, body]
     end
@@ -41,12 +42,8 @@ module Rack
     end
 
     def extract_content_length(headers)
-      headers.each do |key, value|
-        if key.downcase == 'content-length'
-          return value.to_s == '0' ? '-' : value
-        end
-      end
-      '-'
+      value = headers['Content-Length'] or return '-'
+      value.to_s == '0' ? '-' : value
     end
   end
 end
-- 
1.6.4.2.236.gf324c

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/3] HeaderHash#each yields Lint-OK multivalue headers
  2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
                   ` (2 preceding siblings ...)
  2009-09-04 21:41 ` [PATCH 3/3] CommonLogger uses HeaderHash to lookup Content-Length Eric Wong
@ 2009-09-28 10:14 ` Eric Wong
  2010-01-05 23:58 ` [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2009-09-28 10:14 UTC (permalink / raw)
  To: rack-devel


Rack::Lint does not allow header values yielded by #each to be
non-String objects, so we join them like we do in #to_hash.
This finally allows HeaderHash to be passed in the Rack response
as a header without needing #to_hash.
---
 Oops! :x

 lib/rack/utils.rb       |    6 ++++++
 test/spec_rack_utils.rb |    8 ++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/lib/rack/utils.rb b/lib/rack/utils.rb
index 21b58c9..ba36742 100644
--- a/lib/rack/utils.rb
+++ b/lib/rack/utils.rb
@@ -268,6 +268,12 @@ module Rack
         hash.each { |k, v| self[k] = v }
       end
 
+      def each
+        super do |k, v|
+          yield(k, v.respond_to?(:to_ary) ? v.to_ary.join("\n") : v)
+        end
+      end
+
       def to_hash
         inject({}) do |hash, (k,v)|
           if v.respond_to? :to_ary
diff --git a/test/spec_rack_utils.rb b/test/spec_rack_utils.rb
index 3586f8c..4a883ea 100644
--- a/test/spec_rack_utils.rb
+++ b/test/spec_rack_utils.rb
@@ -276,6 +276,14 @@ context "Rack::Utils::HeaderHash" do
     b.should.equal(a)
   end
 
+  specify "should convert Array values to Strings when responding to #each" do
+    h = Rack::Utils::HeaderHash.new("foo" => ["bar", "baz"])
+    h.each do |k,v|
+      k.should.equal("foo")
+      v.should.equal("bar\nbaz")
+    end
+  end
+
 end
 
 context "Rack::Utils::Context" do
-- 
Eric Wong

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3] support reuse of HeaderHash objects
  2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
                   ` (3 preceding siblings ...)
  2009-09-28 10:14 ` [PATCH 4/3] HeaderHash#each yields Lint-OK multivalue headers Eric Wong
@ 2010-01-05 23:58 ` Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2010-01-05 23:58 UTC (permalink / raw)
  To: rack-devel

Eric Wong <normalperson@yhbt.net> wrote:
> HeaderHash objects are needlessly created and discarded by various
> pieces of Rack middleware, leading to unnecessary overhead
> from both object creation + GC and iteration overhead via
> headers#each.

Ugh, this series interacts badly with applications and frameworks that
set cookies without using Rack::Utils.  Rails 2.3.5 is one (popular)
example of a framework that sets cookies without help from Rack::Utils.

Below is a patch (against Rails 2.3.5) that pinpoints and fixes the
problematic code:  (I don't expect this patch to be applied)

diff --git a/actionpack/lib/action_controller/session/abstract_store.rb b/actionpack/lib/action_controller/session/abstract_store.rb
index f6369ab..2df45d8 100644
--- a/actionpack/lib/action_controller/session/abstract_store.rb
+++ b/actionpack/lib/action_controller/session/abstract_store.rb
@@ -144,8 +144,9 @@ module ActionController
           cookie << "; HttpOnly" if options[:httponly]
 
           headers = response[1]
-          unless headers[SET_COOKIE].blank?
-            headers[SET_COOKIE] << "\n#{cookie}"
+          set_cookie = headers[SET_COOKIE]
+          unless set_cookie.blank?
+            set_cookie << (Array === set_cookie ? cookie : "\n#{cookie}")
           else
             headers[SET_COOKIE] = cookie
           end

Without the above patch, I end up with a blank cookie with nothing in it
when (using Unicorn):

  Set-Cookie: set_by_app=cookie-1715-; path=/
  Set-Cookie: 
  Set-Cookie: _session_cookie=98e3f12545e70a6b7a943628a3a7578b; path=/; HttpOnly

This was because Rails was blindly prepending "\n" to the cookie thinking
it's hitting a string, but the value of headers[SET_COOKIE] is now an
array...

Below is a potential fix to Rack to fix the problem, but I'm not sure
about the severity of this problem with real HTTP clients (other than 12
bytes).

diff --git a/lib/rack/utils.rb b/lib/rack/utils.rb
index 68fd6ac..10dc8fe 100644
--- a/lib/rack/utils.rb
+++ b/lib/rack/utils.rb
@@ -273,19 +273,23 @@ module Rack
         hash.each { |k, v| self[k] = v }
       end
 
+      def value_string(v)
+        if v.respond_to?(:to_ary)
+          v = v.to_ary.join("\n")
+          v.gsub!(/\n\n+/, "\n")
+        end
+        v
+      end
+
       def each
         super do |k, v|
-          yield(k, v.respond_to?(:to_ary) ? v.to_ary.join("\n") : v)
+          yield(k, value_string(v))
         end
       end
 
       def to_hash
         inject({}) do |hash, (k,v)|
-          if v.respond_to? :to_ary
-            hash[k] = v.to_ary.join("\n")
-          else
-            hash[k] = v
-          end
+          hash[k] = value_string(v)
           hash
         end
       end
---

There's also a one-line fix that I could make to Unicorn by replacing
split(/\n/) with split(/\n+/), which may be a good idea anyways in case
frameworks/apps attempt more "creative" ways to set other headers...

-- 
Eric Wong

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-01-05 23:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-04 21:41 [PATCH 0/3] support reuse of HeaderHash objects Eric Wong
2009-09-04 21:41 ` [PATCH 1/3] HeaderHash.new avoids unnecessary object creation Eric Wong
2009-09-04 21:41 ` [PATCH 2/3] avoid HeaderHash#to_hash in middlewares Eric Wong
2009-09-04 21:41 ` [PATCH 3/3] CommonLogger uses HeaderHash to lookup Content-Length Eric Wong
2009-09-28 10:14 ` [PATCH 4/3] HeaderHash#each yields Lint-OK multivalue headers Eric Wong
2010-01-05 23:58 ` [PATCH 0/3] support reuse of HeaderHash objects Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).