rack-devel archive mirror (unofficial) https://groups.google.com/group/rack-devel
From: Hongli Lai <hongli@phusion.nl>
To: Rack Development <rack-devel@googlegroups.com>
Subject: Re: Rack environment encoding
Date: Mon, 13 Sep 2010 06:56:42 -0700 (PDT)
Message-ID: <019c87a8-c806-4ad9-8380-1cea72d0cea9@c13g2000vbr.googlegroups.com>
In-Reply-To: <729D0748-DCFF-4643-AB90-3250D40D2370@gmail.com>

On Sep 12, 8:00 pm, James Tucker <jftuc...@gmail.com> wrote:
> I'm not sure, but I think they don't expect them to be utf-8, they actually expect them to be compatible with literals.

Yes. I'm fine with US-ASCII for PATH_INFO and friends as long as
comment #16 in the bug report doesn't result in breakage anymore.


> > If the app does
> > something like
> >  some_utf8_string + env['PATH_INFO']
> > then Ruby 1.9 will complain with an incompatible encoding error.
>
> On your system.

No. Specifically, it breaks in Phusion Passenger because we set the
encoding of every string in the environment hash to binary, regardless
of the system encoding, precisely to prevent the data loss you
mentioned earlier. However, setting everything to binary causes the
breakage described in the bug report, which is why I proposed setting
some values to UTF-8/ASCII/whatever and others to binary.
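
A minimal sketch of the failure being discussed, assuming Ruby 1.9 or
later and a UTF-8 source file. The variable names and sample values are
illustrative only and are not taken from Rack or Phusion Passenger. It
shows that the error depends on the encoding label of the value, not on
the system encoding, and that it only appears once both strings contain
non-ASCII bytes.

```ruby
# A UTF-8 string from the application, containing a non-ASCII character.
prefix = "caf\u00e9"

# Stand-in for env['PATH_INFO'] as a server might supply it: the bytes are
# valid UTF-8, but the string is labeled binary (ASCII-8BIT).
path_info = "/r\xC3\xA9sum\xC3\xA9".b

# Concatenating the two raises, because Ruby cannot pick a common encoding.
error = begin
  prefix + path_info
  nil
rescue Encoding::CompatibilityError => e
  e
end

# An ASCII-only binary string is still compatible with UTF-8, which is why
# the problem stays hidden until a request carries non-ASCII bytes.
ascii_only = "/index.html".b
joined = prefix + ascii_only

puts error.class
puts joined.encoding
```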


> Rails does a lot of work on the /client side/ to try and ensure it receives UTF-8, and tries to enforce UTF-8 elsewhere. Rack can't enforce this as it doesn't operate client side (build forms). It's also worth noting that rails accepts a percentile use case hit here, whereby it makes no attempt to expect full support for encodings that can't round-trip through unicode. For them this is sensible, and maybe it might be for us, but this is why I need particularly CP932 users to actually pay attention here. Until I hear from someone who deals with these issues in the real world, I cannot defer to the advice "just use unicode". Alas, one of the larger issues here is that I don't speak the languages required to actually track down most of these users, so I need help from people who do. I hope there's someone on this list proactive enough to do this, or knows someone to call on.

Woah, I think we have a misunderstanding here. I started this thread
to discuss what env['something'].encoding should return. Whether
env['something'] actually contains UTF-8 data is a different
discussion.

To reiterate: the problem that we're running into is that
env['something'].encoding always returns #<Encoding: binary> in
Phusion Passenger, even if env['something'] contains valid UTF-8 data.
Should env['something'] - assuming it contains valid UTF-8 data or
ASCII data or whatever - have its #encoding return #<Encoding: UTF-8>?
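
To illustrate the distinction between what a string contains and what
its #encoding reports, here is a small sketch, assuming Ruby 1.9 or
later. The sample bytes are made up for the example. The encoding is
only a label on the bytes: relabeling with force_encoding leaves the
bytes untouched, and valid_encoding? reports whether they are well
formed under the new label.

```ruby
# Bytes that happen to be valid UTF-8, labeled binary as a server might do.
raw = "/caf\xC3\xA9".b

# Relabel a copy as UTF-8. No transcoding happens; only the label changes.
labeled = raw.dup.force_encoding(Encoding::UTF_8)

# Bytes that are not valid UTF-8 can be relabeled too, but the result is
# flagged as invalid.
bogus = "\xFF\xFE".b.force_encoding(Encoding::UTF_8)

puts raw.encoding == Encoding::BINARY
puts labeled.encoding == Encoding::UTF_8
puts labeled.bytes == raw.bytes
puts labeled.valid_encoding?
puts bogus.valid_encoding?
```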


Of course, the easiest way to solve this problem is to mandate that all
Rack web servers set the encoding to binary and have the frameworks
deal with conversions.
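
One way the "servers hand over binary, frameworks convert" split could
look is sketched below. Everything here is hypothetical: the class name
RelabelEnvEncoding, the list of keys, and the choice of UTF-8 are
assumptions for illustration, not behavior of Rack, Rails, or Phusion
Passenger. It follows the usual middleware shape (an object with
call(env)) and needs no Rack dependency to run.

```ruby
# Hypothetical framework-side middleware: relabels selected env values as
# UTF-8 when their bytes are valid UTF-8, and leaves everything else binary.
class RelabelEnvEncoding
  # Assumed set of keys; a real framework would choose its own.
  KEYS = %w[PATH_INFO QUERY_STRING SCRIPT_NAME].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    KEYS.each do |key|
      value = env[key]
      next unless value.is_a?(String)

      # Relabel a copy so the server's original string is not mutated.
      candidate = value.dup.force_encoding(Encoding::UTF_8)
      env[key] = candidate if candidate.valid_encoding?
    end
    @app.call(env)
  end
end

# Usage: the inner app concatenates a UTF-8 string with PATH_INFO.
app = lambda { |env| [200, {}, ["caf\u00e9" + env["PATH_INFO"]]] }
env = { "PATH_INFO" => "/caf\xC3\xA9".b, "QUERY_STRING" => "q=\xFF".b }
status, _headers, body = RelabelEnvEncoding.new(app).call(env)
puts status
puts body.first.encoding
```

Values whose bytes are not valid UTF-8 (QUERY_STRING in the usage above)
keep their binary label, so no data is lost or silently altered.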


Thread overview: 8+ messages
2010-09-12 16:48 Rack environment encoding Hongli Lai
2010-09-12 18:00 ` James Tucker
2010-09-12 18:23   ` Steve Klabnik
2010-09-13 13:56   ` Hongli Lai [this message]
2010-09-13  4:21 ` Yehuda Katz
2010-09-13  9:05   ` naruse
2010-09-13 14:08   ` Hongli Lai
2010-09-15  1:23     ` naruse
