ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string
@ 2013-06-22  6:53 kstevens715 (Kyle Stevens)
  2013-06-22  6:59 ` [ruby-core:55591] [ruby-trunk - Bug #8560] " charliesome (Charlie Somerville)
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: kstevens715 (Kyle Stevens) @ 2013-06-22  6:53 UTC (permalink / raw
  To: ruby-core


Issue #8560 has been reported by kstevens715 (Kyle Stevens).

----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560

Author: kstevens715 (Kyle Stevens)
Status: Open
Priority: Low
Assignee: 
Category: 
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option.This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to match, so one would imagine a string can be used. However, String#match can take either a Regexp or another String. If the argument passed is a string, it will first be converted to a Regexp.

So if you pass a string to the skip_lines option, it will attempt to convert the row in the csv file to a Regexp. This doesn't make sense, and it can also lead to exceptions. For example, if the csv contains a row with the data, "# )", you will get an error, "RegexpError: unmatched close parenthesis: /# )/".

My particular use case when I found this problem was I was trying to ignore lines beginning with a "#". 

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable work-around. However, it would be a very easy mistake to make and it could be a difficult problem for someone to debug. It could lead to quite strange behavior if each row in the csv is converted to a Regexp.

I think the skip_lines option should be converted to a Regexp if it's a string, because the alternative is the CSV row is going to be converted to a Regexp.

My proposal is if a string is passed to skip_lines, it should be converted to a regular expression to match the beginning of a line, excluding whitespace:
"#" => /\A\s+#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ruby-core:55591] [ruby-trunk - Bug #8560] CSV, skip_lines option causes error when passing a string
  2013-06-22  6:53 [ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string kstevens715 (Kyle Stevens)
@ 2013-06-22  6:59 ` charliesome (Charlie Somerville)
  2013-07-10  4:00   ` [ruby-core:55900] " Zachary Scott
  2013-11-23 19:08 ` [ruby-core:58537] " kstevens715 (Kyle Stevens)
  2013-11-23 23:14 ` [ruby-core:58540] [ruby-trunk - Bug #8560][Closed] " JEG2 (James Gray)
  2 siblings, 1 reply; 5+ messages in thread
From: charliesome (Charlie Somerville) @ 2013-06-22  6:59 UTC (permalink / raw
  To: ruby-core


Issue #8560 has been updated by charliesome (Charlie Somerville).

Assignee set to JEG2 (James Gray)


----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560#change-40080

Author: kstevens715 (Kyle Stevens)
Status: Open
Priority: Low
Assignee: JEG2 (James Gray)
Category: 
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option.This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to match, so one would imagine a string can be used. However, String#match can take either a Regexp or another String. If the argument passed is a string, it will first be converted to a Regexp.

So if you pass a string to the skip_lines option, it will attempt to convert the row in the csv file to a Regexp. This doesn't make sense, and it can also lead to exceptions. For example, if the csv contains a row with the data, "# )", you will get an error, "RegexpError: unmatched close parenthesis: /# )/".

My particular use case when I found this problem was I was trying to ignore lines beginning with a "#". 

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable work-around. However, it would be a very easy mistake to make and it could be a difficult problem for someone to debug. It could lead to quite strange behavior if each row in the csv is converted to a Regexp.

I think the skip_lines option should be converted to a Regexp if it's a string, because the alternative is the CSV row is going to be converted to a Regexp.

My proposal is if a string is passed to skip_lines, it should be converted to a regular expression to match the beginning of a line, excluding whitespace:
"#" => /\A\s+#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ruby-core:55900] Re: [ruby-trunk - Bug #8560] CSV, skip_lines option causes error when passing a string
  2013-06-22  6:59 ` [ruby-core:55591] [ruby-trunk - Bug #8560] " charliesome (Charlie Somerville)
@ 2013-07-10  4:00   ` Zachary Scott
  0 siblings, 0 replies; 5+ messages in thread
From: Zachary Scott @ 2013-07-10  4:00 UTC (permalink / raw
  To: ruby-core@ruby-lang.org

@Kyle you might want to try to ping JEG2 on that ticket as well. He
may not see this.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ruby-core:58537] [ruby-trunk - Bug #8560] CSV, skip_lines option causes error when passing a string
  2013-06-22  6:53 [ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string kstevens715 (Kyle Stevens)
  2013-06-22  6:59 ` [ruby-core:55591] [ruby-trunk - Bug #8560] " charliesome (Charlie Somerville)
@ 2013-11-23 19:08 ` kstevens715 (Kyle Stevens)
  2013-11-23 23:14 ` [ruby-core:58540] [ruby-trunk - Bug #8560][Closed] " JEG2 (James Gray)
  2 siblings, 0 replies; 5+ messages in thread
From: kstevens715 (Kyle Stevens) @ 2013-11-23 19:08 UTC (permalink / raw
  To: ruby-core


Issue #8560 has been updated by kstevens715 (Kyle Stevens).

File csv-string-skip-lines.patch added

Attached is a patch that converts skip_lines to a Regexp if it's a string. Also see pull request on GitHub: https://github.com/ruby/ruby/pull/455

----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560#change-43119

Author: kstevens715 (Kyle Stevens)
Status: Open
Priority: Low
Assignee: JEG2 (James Gray)
Category: 
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option.This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to match, so one would imagine a string can be used. However, String#match can take either a Regexp or another String. If the argument passed is a string, it will first be converted to a Regexp.

So if you pass a string to the skip_lines option, it will attempt to convert the row in the csv file to a Regexp. This doesn't make sense, and it can also lead to exceptions. For example, if the csv contains a row with the data, "# )", you will get an error, "RegexpError: unmatched close parenthesis: /# )/".

My particular use case when I found this problem was I was trying to ignore lines beginning with a "#". 

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable work-around. However, it would be a very easy mistake to make and it could be a difficult problem for someone to debug. It could lead to quite strange behavior if each row in the csv is converted to a Regexp.

I think the skip_lines option should be converted to a Regexp if it's a string, because the alternative is the CSV row is going to be converted to a Regexp.

My proposal is if a string is passed to skip_lines, it should be converted to a regular expression to match the beginning of a line, excluding whitespace:
"#" => /\A\s+#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ruby-core:58540] [ruby-trunk - Bug #8560][Closed] CSV, skip_lines option causes error when passing a string
  2013-06-22  6:53 [ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string kstevens715 (Kyle Stevens)
  2013-06-22  6:59 ` [ruby-core:55591] [ruby-trunk - Bug #8560] " charliesome (Charlie Somerville)
  2013-11-23 19:08 ` [ruby-core:58537] " kstevens715 (Kyle Stevens)
@ 2013-11-23 23:14 ` JEG2 (James Gray)
  2 siblings, 0 replies; 5+ messages in thread
From: JEG2 (James Gray) @ 2013-11-23 23:14 UTC (permalink / raw
  To: ruby-core


Issue #8560 has been updated by JEG2 (James Gray).

Category set to lib
Status changed from Open to Closed
% Done changed from 0 to 100


----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560#change-43121

Author: kstevens715 (Kyle Stevens)
Status: Closed
Priority: Low
Assignee: JEG2 (James Gray)
Category: lib
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option.This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to match, so one would imagine a string can be used. However, String#match can take either a Regexp or another String. If the argument passed is a string, it will first be converted to a Regexp.

So if you pass a string to the skip_lines option, it will attempt to convert the row in the csv file to a Regexp. This doesn't make sense, and it can also lead to exceptions. For example, if the csv contains a row with the data, "# )", you will get an error, "RegexpError: unmatched close parenthesis: /# )/".

My particular use case when I found this problem was I was trying to ignore lines beginning with a "#". 

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable work-around. However, it would be a very easy mistake to make and it could be a difficult problem for someone to debug. It could lead to quite strange behavior if each row in the csv is converted to a Regexp.

I think the skip_lines option should be converted to a Regexp if it's a string, because the alternative is the CSV row is going to be converted to a Regexp.

My proposal is if a string is passed to skip_lines, it should be converted to a regular expression to match the beginning of a line, excluding whitespace:
"#" => /\A\s+#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-11-23 23:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-22  6:53 [ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string kstevens715 (Kyle Stevens)
2013-06-22  6:59 ` [ruby-core:55591] [ruby-trunk - Bug #8560] " charliesome (Charlie Somerville)
2013-07-10  4:00   ` [ruby-core:55900] " Zachary Scott
2013-11-23 19:08 ` [ruby-core:58537] " kstevens715 (Kyle Stevens)
2013-11-23 23:14 ` [ruby-core:58540] [ruby-trunk - Bug #8560][Closed] " JEG2 (James Gray)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).