* [ruby-core:82922] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: peterejhamilton @ 2017-09-21 18:01 UTC (permalink / raw
To: ruby-core
Issue #13926 has been reported by petehamilton (Pete Hamilton).
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926
* Author: petehamilton (Pete Hamilton)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
When setting headers using `Net::HTTPHeader#add_header` or `Net::HTTPHeader#[]=` in v2.4.2, an `ArgumentError (invalid byte sequence in UTF-8)` is raised.
In 2.4.1, this behaviour didn't exist and it looks like it was introduced in one of the revisions associated with https://bugs.ruby-lang.org/issues/13852, where the header value is matched against a regular expression to prevent newlines.
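The mechanism described above can be reproduced outside `Net::HTTP`. The following is a minimal sketch, not the library's actual code: `regexp_check` and `byte_check` are hypothetical helper names, and the byte-oriented variant is only one possible way to look for CR/LF without tripping over the string's encoding.

```ruby
# A UTF-8-tagged string carrying a lone ISO-8859-1 byte (0xE9): invalid UTF-8.
invalid = "caf\xE9".dup.force_encoding(Encoding::UTF_8)

# Hypothetical helper: a newline check done with a regular expression.
# Matching a regexp against a string with a broken encoding raises
# ArgumentError, so the error is captured and its message returned.
def regexp_check(value)
  value =~ /[\r\n]/ ? :has_newline : :clean
rescue ArgumentError => e
  e.message
end

# Hypothetical helper: the same check on a binary (ASCII-8BIT) copy.
# String#b never produces an invalid encoding, so no error is raised.
def byte_check(value)
  value.b.count("\r\n") > 0 ? :has_newline : :clean
end

puts invalid.valid_encoding?
puts regexp_check(invalid)
puts byte_check(invalid)
puts byte_check("a\r\nb")
```

The point of the sketch is that the exception comes from regexp matching on an invalid string, not from anything specific to HTTP headers.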
Previously, `Net::HTTP` would accept non-UTF8 header values and just return them as invalid UTF8 strings. It was then up to the user of `Net::HTTP` to handle this. With this change, the user can no longer handle non-UTF8 header values at all, because `Net::HTTP` raises an error before they are returned.
[RFC2616](https://tools.ietf.org/html/rfc2616#section-4.2) allowed HTTP header field content to be made up of any non-whitespace octets. Because of this, [RFC7230](https://tools.ietf.org/html/rfc7230#section-3.2.4) makes an allowance for all characters in the ISO-8859-1 charset (both lower and extended ASCII characters).
Specifically, this section of RFC7230 suggests that although ideally response header values would be compatible with UTF-8, we can't assume this to be the case.
> Historically, HTTP has allowed field content with text in the
> ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
> through use of [RFC2047] encoding. In practice, most HTTP header
> field values use only a subset of the US-ASCII charset [USASCII].
> Newly defined header fields SHOULD limit their field values to
> US-ASCII octets. A recipient SHOULD treat other octets in field
> content (obs-text) as opaque data.
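As an illustration of what "opaque data" could mean for a caller, here is a small sketch. It assumes the caller knows (or guesses) that the sender used ISO-8859-1; `decode_obs_text` is a hypothetical helper, not part of `Net::HTTP`.

```ruby
# Hypothetical helper: reinterpret raw header bytes as ISO-8859-1 and
# transcode them to UTF-8. Every byte is a valid ISO-8859-1 character,
# so this conversion cannot fail on encoding grounds.
def decode_obs_text(raw)
  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

raw = "caf\xE9".b                # opaque bytes as they might arrive on the wire
text = decode_obs_text(raw)

puts raw.bytesize                # the raw value is left untouched
puts text.valid_encoding?
```

Keeping the raw bytes around and decoding only on demand leaves the choice of charset with the caller, which is the situation the report says existed before the change.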
Not entirely sure where to go from here or what the fix is but given this is a behaviour change, it'd be great to hear your thoughts.
---Files--------------------------------
net_http_utf8_tests.patch (1.14 KB)
--
https://bugs.ruby-lang.org/
^ permalink raw reply [flat|nested] 6+ messages in thread
* [ruby-core:82928] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: shevegen @ 2017-09-22 4:03 UTC (permalink / raw
To: ruby-core
Issue #13926 has been updated by shevegen (Robert A. Heiler).
Can you add a link to Net::HTTPHeader#add_header?
I was trying to find it but I cannot find it at https://ruby-doc.org/stdlib/libdoc/net/http/rdoc/Net/HTTPHeader.html. I do find []= though, at https://ruby-doc.org/stdlib-2.4.2/libdoc/net/http/rdoc/Net/HTTPHeader.html#method-i-5B-5D-3D, and the documentation does not mention anything about Encoding.
Either way, I think the behaviour with regard to Encoding should be noted in the documentation as well. I assume that the change introduced a regression, but I have no way to confirm that; the documentation should mention this somewhere though, either at the method, or at the main page or Net::HTTPHeader imo.
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926#change-66829
* Author: petehamilton (Pete Hamilton)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
* [ruby-core:82932] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: peterejhamilton @ 2017-09-22 8:24 UTC (permalink / raw
To: ruby-core
Issue #13926 has been updated by petehamilton (Pete Hamilton).
When setting headers using Net::HTTPHeader#add_field or Net::HTTPHeader#[]= in v2.4.2, an ArgumentError (invalid byte sequence in UTF-8) is raised.
In 2.4.1, this behaviour didn't exist and it looks like it was introduced in one of the revisions associated with https://bugs.ruby-lang.org/issues/13852, where the header value is matched against a regular expression to prevent newlines.
Previously, these methods would accept non-UTF8 header values and just return them as invalid UTF8 strings. It was then on the user of Net::HTTP to handle this. With this change, there's now no way for the user to handle the case where they receive non-UTF8 header values as Net::HTTP raises an error.
RFC2616 allowed an HTTP header field content to be made up of any non-whitespace octets. Because of this RFC7230 makes an allowance for all characters in the ISO-8859-1 charset (both lower and extended ASCII characters).
Specifically, this section of RFC7230 suggests that although ideally response header values would be compatible with UTF-8, we can't assume this to be the case.
> Historically, HTTP has allowed field content with text in the
> ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
> through use of [RFC2047] encoding. In practice, most HTTP header
> field values use only a subset of the US-ASCII charset [USASCII].
> Newly defined header fields SHOULD limit their field values to
> US-ASCII octets. A recipient SHOULD treat other octets in field
> content (obs-text) as opaque data.
Not entirely sure where to go from here or what the fix is but given this is a behaviour change, it'd be great to hear your thoughts.
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926#change-66833
* Author: petehamilton (Pete Hamilton)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
* [ruby-core:82933] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: peterejhamilton @ 2017-09-22 8:29 UTC (permalink / raw
To: ruby-core
Issue #13926 has been updated by petehamilton (Pete Hamilton).
shevegen (Robert A. Heiler) wrote:
> Can you add a link to Net::HTTPHeader#add_header?
>
> I was trying to find it but I can not find it at https://ruby-doc.org/stdlib/libdoc/net/http/rdoc/Net/HTTPHeader.html
Apologies, I meant `Net::HTTPHeader#add_field` (https://ruby-doc.org/stdlib-2.4.2/libdoc/net/http/rdoc/Net/HTTPHeader.html#method-i-add_field). I would update the ticket description but I don't seem to be able to?
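For reference, the two methods in question can be exercised with a short script like the one below. This is only a sketch: `try_header` and the `X-Example` header name are made up for illustration, and whether a non-UTF8 value is accepted or rejected depends on the Ruby version it runs on, so no particular outcome is claimed for that case.

```ruby
require "net/http"

# Hypothetical helper: set a header through both #[]= and #add_field and
# report whether the value was accepted or rejected with ArgumentError.
def try_header(value)
  request = Net::HTTP::Get.new("/")
  request["X-Example"] = value             # Net::HTTPHeader#[]=
  request.add_field("X-Example", value)    # Net::HTTPHeader#add_field
  [:accepted, request.get_fields("X-Example")]
rescue ArgumentError => e
  [:rejected, e.message]
end

plain   = try_header("plain")
invalid = try_header("caf\xE9".dup.force_encoding(Encoding::UTF_8))
crlf    = try_header("a\r\nb")

p plain
p invalid
p crlf
```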
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926#change-66834
* Author: petehamilton (Pete Hamilton)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
* [ruby-core:84374] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: nagachika00 @ 2017-12-20 15:59 UTC (permalink / raw
To: ruby-core
Issue #13926 has been updated by nagachika (Tomoyuki Chikanaga).
Backport changed from 2.2: DONTNEED, 2.3: REQUIRED, 2.4: REQUIRED to 2.2: DONTNEED, 2.3: REQUIRED, 2.4: DONE
ruby_2_4 r61373 merged revision(s) 60021.
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926#change-68563
* Author: petehamilton (Pete Hamilton)
* Status: Closed
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: DONTNEED, 2.3: REQUIRED, 2.4: DONE
----------------------------------------
* [ruby-core:85295] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198
From: usa @ 2018-01-31 11:47 UTC (permalink / raw
To: ruby-core
Issue #13926 has been updated by usa (Usaku NAKAMURA).
Backport changed from 2.2: DONTNEED, 2.3: REQUIRED, 2.4: DONE to 2.2: DONTNEED, 2.3: DONE, 2.4: DONE
ruby_2_3 r62133 merged revision(s) 60021.
----------------------------------------
Bug #13926: Non UTF response headers raise an Argument error since 2.4.2p198
https://bugs.ruby-lang.org/issues/13926#change-70076
* Author: petehamilton (Pete Hamilton)
* Status: Closed
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-darwin16]
* Backport: 2.2: DONTNEED, 2.3: DONE, 2.4: DONE
----------------------------------------
end of thread, other threads:[~2018-01-31 11:48 UTC | newest]
Thread overview: 6+ messages
[not found] <redmine.issue-13926.20170921180145@ruby-lang.org>
2017-09-21 18:01 ` [ruby-core:82922] [Ruby trunk Bug#13926] Non UTF response headers raise an Argument error since 2.4.2p198 peterejhamilton
2017-09-22 4:03 ` [ruby-core:82928] " shevegen
2017-09-22 8:24 ` [ruby-core:82932] " peterejhamilton
2017-09-22 8:29 ` [ruby-core:82933] " peterejhamilton
2017-12-20 15:59 ` [ruby-core:84374] " nagachika00
2018-01-31 11:47 ` [ruby-core:85295] " usa