rack-devel archive mirror (unofficial) https://groups.google.com/group/rack-devel
* bug report and unit test for infinite loop parsing Content-Disposion header
@ 2012-05-04 21:37 Paul Rogers
  2012-05-04 23:34 ` Eric Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Rogers @ 2012-05-04 21:37 UTC (permalink / raw)
  To: Rack Development

Hi,

I created this repository:

git@github.com:paulrogers/rack.git

It contains a test that appears to hit an infinite loop when parsing
a multipart form.

You can run the test using

bacon -I./lib:./test -a -t 'Rack::Multipart'

What seems to happen is that when parsing a header like this

Content-Disposition: inline; name=xml_product_config;
filename=XML_PRODUCT_CONFIG.xml

the regexp in the get_filename method in parser.rb seems to get stuck
in an infinite loop on the line with

if head =~ RFC2183

This happens in the tests as well as in the unit test in the attached
git commit (is that the correct term?).

I'd be grateful if someone could take a look.

Thanks,
Paul

* Re: bug report and unit test for infinite loop parsing Content-Disposion header
  2012-05-04 21:37 bug report and unit test for infinite loop parsing Content-Disposion header Paul Rogers
@ 2012-05-04 23:34 ` Eric Wong
  2012-05-07  0:39   ` Lawrence Pit
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Wong @ 2012-05-04 23:34 UTC (permalink / raw)
  To: rack-devel

Paul Rogers <pmr16366@gmail.com> wrote:
> the regexp in the get_filename method in parser.rb seems to get stuck
> in an infinite loop on   the line with
> 
> if head =~ RFC2183

This is an unfortunate consequence of the type of regexp engine Ruby
uses (a backtracking one).

> This happens in the tests as well as in the unit test in the attached
> git commit ( is that the correct term?)

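I'm not a regexp/finite-automata expert, but having multiple '*' or
mixing '*'/'+' on the same part of a regexp can be problematic. Here
TOKEN already ends in '+' and the group around it is repeated with '*',
so a backtracking engine can split the same text in a very large number
of ways before giving up.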

I think the following should fix your issue (but I'm not sure it's
correct):

diff --git a/lib/rack/multipart.rb b/lib/rack/multipart.rb
index 3777106..6849248 100644
--- a/lib/rack/multipart.rb
+++ b/lib/rack/multipart.rb
@@ -12,7 +12,7 @@ module Rack
     MULTIPART = %r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|n
     TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
     CONDISP = /Content-Disposition:\s*#{TOKEN}\s*/i
-    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})*/
+    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/
     RFC2183 = /^#{CONDISP}(#{DISPPARM})+$/i
     BROKEN_QUOTED = /^#{CONDISP}.*;\sfilename="(.*?)"(?:\s*$|\s*;\s*#{TOKEN}=)/i
     BROKEN_UNQUOTED = /^#{CONDISP}.*;\sfilename=(#{TOKEN})/i
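
The effect of dropping the trailing '*' can be checked in isolation. What
follows is only a sketch: the constants are copied from the diff, but the
OLD_/NEW_ prefixes and the two sample strings (REPORTED_HEADER,
EMPTY_VALUE_HEADER) are names made up for this comparison. It deliberately
does not try to reproduce the hang, since whether that triggers depends on
the exact input and the Ruby version.

```ruby
# Constants copied from the diff above. The OLD_/NEW_ prefixes and the two
# sample strings are hypothetical names used only for this comparison.
TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
CONDISP = /Content-Disposition:\s*#{TOKEN}\s*/i

# Before the patch: the value group may repeat zero or more times.
OLD_DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})*/
# After the patch: exactly one value (quoted string or token) is required.
NEW_DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/

OLD_RFC2183 = /^#{CONDISP}(#{OLD_DISPPARM})+$/i
NEW_RFC2183 = /^#{CONDISP}(#{NEW_DISPPARM})+$/i

# Header from the original report, written on a single line.
REPORTED_HEADER = "Content-Disposition: inline; name=xml_product_config; " \
                  "filename=XML_PRODUCT_CONFIG.xml"
# A parameter with no value at all (assumed input, not from the report).
EMPTY_VALUE_HEADER = "Content-Disposition: inline; name="

# Both patterns accept the well-formed header.
puts "old matches reported header: #{!(REPORTED_HEADER =~ OLD_RFC2183).nil?}"
puts "new matches reported header: #{!(REPORTED_HEADER =~ NEW_RFC2183).nil?}"

# Only the old pattern accepts a parameter whose value is empty.
puts "old matches empty value: #{!(EMPTY_VALUE_HEADER =~ OLD_RFC2183).nil?}"
puts "new matches empty value: #{!(EMPTY_VALUE_HEADER =~ NEW_RFC2183).nil?}"
```

So for well-formed headers the two patterns agree, and the patched one
additionally rejects a parameter with an empty value.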


* Re: bug report and unit test for infinite loop parsing Content-Disposion header
  2012-05-04 23:34 ` Eric Wong
@ 2012-05-07  0:39   ` Lawrence Pit
  2012-05-07 15:04     ` Paul Rogers
  0 siblings, 1 reply; 5+ messages in thread
From: Lawrence Pit @ 2012-05-07  0:39 UTC (permalink / raw)
  To: rack-devel


Given that the value of a DISPPARM must always have at least one character (according to RFC 2183 and RFC 2045), that fix seems correct to me.

In addition I would make the TOKEN regexp non-greedy (for the BROKEN_UNQUOTED case):

    TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+?/
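
One thing worth checking before making TOKEN lazy: in BROKEN_UNQUOTED the token is the last element of the pattern, and a lazy quantifier in that position stops after the shortest possible match. A small sketch of the difference, assuming the header from the original report as input; GREEDY_TOKEN, LAZY_TOKEN, SAMPLE_HEADER and the broken_unquoted helper are names invented for this illustration and are not part of Rack:

```ruby
# Greedy TOKEN as currently defined, and the proposed lazy variant.
GREEDY_TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
LAZY_TOKEN   = /[^\s()<>,;:\\"\/\[\]?=]+?/

# Hypothetical helper: builds the BROKEN_UNQUOTED pattern from a given token
# pattern, mirroring the constant definitions shown in the diff.
def broken_unquoted(token)
  condisp = /Content-Disposition:\s*#{token}\s*/i
  /^#{condisp}.*;\sfilename=(#{token})/i
end

# Header from the original report, written on a single line.
SAMPLE_HEADER = "Content-Disposition: inline; name=xml_product_config; " \
                "filename=XML_PRODUCT_CONFIG.xml"

GREEDY_FILENAME = broken_unquoted(GREEDY_TOKEN).match(SAMPLE_HEADER)[1]
LAZY_FILENAME   = broken_unquoted(LAZY_TOKEN).match(SAMPLE_HEADER)[1]

puts "greedy capture: #{GREEDY_FILENAME}"  # XML_PRODUCT_CONFIG.xml
puts "lazy capture:   #{LAZY_FILENAME}"    # X
```

With nothing after the capture group to force the lazy quantifier to extend, the captured filename is cut down to its first character, so the lazy form would probably need a trailing anchor or delimiter in BROKEN_UNQUOTED to be usable there.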

Also, why is the "@" character accepted as part of a TOKEN? It is part of the tspecials (in RFC2045), so I think it should not be accepted as a valid token character.
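
To illustrate the "@" point: RFC 2045 lists the tspecials as ( ) < > @ , ; : \ " / [ ] ? = and the current character class excludes all of them except "@". A minimal sketch, where CURRENT_TOKEN is the TOKEN pattern as quoted below, and STRICT_TOKEN and whole_token? are hypothetical names for a variant that adds "@" to the excluded set:

```ruby
# TOKEN as it appears in the quoted diff.
CURRENT_TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
# Hypothetical stricter variant: the same class with "@" also excluded,
# following the tspecials list in RFC 2045.
STRICT_TOKEN = /[^\s()<>@,;:\\"\/\[\]?=]+/

# True when the whole string is a single token under the given pattern.
def whole_token?(pattern, str)
  !(str =~ /\A#{pattern}\z/).nil?
end

puts whole_token?(CURRENT_TOKEN, "user@host")  # true: "@" is accepted
puts whole_token?(STRICT_TOKEN, "user@host")   # false: "@" is rejected
puts whole_token?(STRICT_TOKEN, "filename")    # true: plain tokens still match
```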


Cheers,
Lawrence


> I think the following should fix your issue (but I'm not sure it's
> correct):
> 
> diff --git a/lib/rack/multipart.rb b/lib/rack/multipart.rb
> index 3777106..6849248 100644
> --- a/lib/rack/multipart.rb
> +++ b/lib/rack/multipart.rb
> @@ -12,7 +12,7 @@ module Rack
>     MULTIPART = %r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|n
>     TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
>     CONDISP = /Content-Disposition:\s*#{TOKEN}\s*/i
> -    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})*/
> +    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/
>     RFC2183 = /^#{CONDISP}(#{DISPPARM})+$/i
>     BROKEN_QUOTED = /^#{CONDISP}.*;\sfilename="(.*?)"(?:\s*$|\s*;\s*#{TOKEN}=)/i
>     BROKEN_UNQUOTED = /^#{CONDISP}.*;\sfilename=(#{TOKEN})/i


* Re: bug report and unit test for infinite loop parsing Content-Disposion header
  2012-05-07  0:39   ` Lawrence Pit
@ 2012-05-07 15:04     ` Paul Rogers
  0 siblings, 0 replies; 5+ messages in thread
From: Paul Rogers @ 2012-05-07 15:04 UTC (permalink / raw)
  To: Rack Development

Thanks for the responses, and sorry for the double posting; I'm not sure
what happened there.

I also found that quoting the filename makes the tests pass. The app
I'm using this in is a mock for another service, so I'll have to check
whether the real service accepts a quoted string.

I'll also try these fixes in case they work better for me.

Thanks

Paul

On May 6, 6:39 pm, Lawrence Pit <lawrence....@gmail.com> wrote:
> Given the value of DISPPARM must always have at least 1 character (according to RFC2183 and RFC2045) that fix seems correct to me.
>
> In addition I would make the TOKEN regexp non-greedy (for the BROKEN_UNQUOTED case):
>
>     TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+?/
>
> Also, why is the "@" character accepted as part of a TOKEN? It is part of the tspecials (in RFC2045), so I think it should not be accepted as a valid token character.
>
> Cheers,
> Lawrence
>
> > I think the following should fix your issue (but I'm not sure it's
> > correct):
>
> > diff --git a/lib/rack/multipart.rb b/lib/rack/multipart.rb
> > index 3777106..6849248 100644
> > --- a/lib/rack/multipart.rb
> > +++ b/lib/rack/multipart.rb
> > @@ -12,7 +12,7 @@ module Rack
> >     MULTIPART = %r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|n
> >     TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
> >     CONDISP = /Content-Disposition:\s*#{TOKEN}\s*/i
> > -    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})*/
> > +    DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/
> >     RFC2183 = /^#{CONDISP}(#{DISPPARM})+$/i
> >     BROKEN_QUOTED = /^#{CONDISP}.*;\sfilename="(.*?)"(?:\s*$|\s*;\s*#{TOKEN}=)/i
> >     BROKEN_UNQUOTED = /^#{CONDISP}.*;\sfilename=(#{TOKEN})/i

