add specs for CGI.escapeURIComponent #1080

Merged
merged 7 commits into from
Oct 30, 2023
57 changes: 57 additions & 0 deletions library/cgi/escapeURIComponent_spec.rb
@@ -0,0 +1,57 @@
require_relative '../../spec_helper'
require 'cgi'

ruby_version_is "3.2" do
  describe "CGI.escapeURIComponent" do
    it "escapes whitespace" do
      string = "&<>\" \xE3\x82\x86\xE3\x82\x93\xE3\x82\x86\xE3\x82\x93"
      CGI.escapeURIComponent(string).should == '%26%3C%3E%22%20%E3%82%86%E3%82%93%E3%82%86%E3%82%93'
    end

    it "does not escape with unreserved characters" do
      string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
      CGI.escapeURIComponent(string).should == "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
    end

    it "supports String with invalid encoding" do
      string = "\xC0\<\<".force_encoding("UTF-8")
      CGI.escapeURIComponent(string).should == "%C0%3C%3C"
    end

    it "processes String bytes one by one, not characters" do
      CGI.escapeURIComponent("β").should == "%CE%B2" # "β" bytes representation is CE B2
    end

    it "raises a TypeError with nil" do
      -> {
        CGI.escapeURIComponent(nil)
      }.should raise_error(TypeError, 'no implicit conversion of nil into String')
    end

    it "encodes empty string" do
      CGI.escapeURIComponent("").should == ""
    end

    it "encodes single whitespace" do
      CGI.escapeURIComponent(" ").should == "%20"
    end

    it "encodes double whitespace" do
      CGI.escapeURIComponent("  ").should == "%20%20"
    end

    it "preserves encoding" do
      string = "whatever".encode("ASCII-8BIT")
      CGI.escapeURIComponent(string).encoding.should == Encoding::ASCII_8BIT
    end

    it "uses implicit type conversion to String" do
      object = Object.new
      def object.to_str
        "a b"
      end

      CGI.escapeURIComponent(object).should == "a%20b"
    end
  end
end
Member

Suggestion: we could add some other cases:

  • for implicit type conversion of the argument (to String via #to_str)
  • for ASCII-compatible and non-ASCII-compatible Strings (the method treats a String as a byte sequence and converts bytes, not characters)

Contributor Author

I'm a bit uncertain here. I added a spec for the implicit type conversion, but I'm not sure I understood your suggestion correctly. Please let me know if I misunderstood.

I could use a pointer regarding the ascii-not-compatible string idea... how could I approach this? Thanks for your help, I appreciate it!

Member

Regarding implicit conversion: you are correct. You can look at similar existing test cases, e.g. here:

it "converts string argument with #to_str method" do
  source_code = Object.new
  def source_code.to_str() "1" end
  a = BasicObject.new
  a.instance_eval(source_code).should == 1
end
it "raises ArgumentError if returned value is not String" do
  source_code = Object.new
  def source_code.to_str() :symbol end
  a = BasicObject.new
  -> { a.instance_eval(source_code) }.should raise_error(TypeError, /can't convert Object to String/)
end
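
For readers unfamiliar with the protocol these specs rely on, here is a minimal sketch of implicit conversion via #to_str. It uses String#+ from core Ruby rather than CGI.escapeURIComponent, so it does not depend on Ruby 3.2. The class names StringLike and BrokenStringLike and the helper concat_with_prefix are made up for illustration and are not part of ruby/spec.

```ruby
# Hypothetical class: it responds to #to_str, so core methods that expect
# a String accept it and call #to_str implicitly.
class StringLike
  def to_str
    "a b"
  end
end

# Hypothetical class: #to_str exists but returns a non-String value.
class BrokenStringLike
  def to_str
    :symbol
  end
end

# String#+ performs the implicit conversion on its argument.
def concat_with_prefix(object)
  "x" + object
end

puts concat_with_prefix(StringLike.new)

begin
  concat_with_prefix(BrokenStringLike.new)
rescue TypeError => e
  # Raised because #to_str did not return a String.
  puts "TypeError: #{e.message}"
end

begin
  concat_with_prefix(nil)
rescue TypeError => e
  # nil has no #to_str, so there is no implicit conversion at all.
  puts "TypeError: #{e.message}"
end
```

The same two situations (a well-behaved #to_str and a misbehaving one) are what the quoted specs exercise for instance_eval.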

Member

@andrykonchin andrykonchin Sep 26, 2023

> for ascii-compatible String and ascii-not-compatible (it treats a String as a bytes sequence and converts bytes, not characters)

> I could use a pointer regarding the ascii-not-compatible string idea... how could I approach this?

I meant that most of the test cases use ASCII characters. But the method also handles multibyte encodings (e.g. UTF-8, UTF-16, UTF-32) and simply converts their bytes one by one, ignoring the fact that a "character" may span several bytes:

CGI.escapeURIComponent("β")
# => "%CE%B2"

"β".bytes.map {|c| c.to_s(16) }
# => ["ce", "b2"]
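
To make the byte-by-byte idea concrete for both an ASCII-compatible and a non-ASCII-compatible encoding, here is a minimal sketch. The helper percent_encode_bytes and the constant UNRESERVED_BYTES are hypothetical names. The helper only mirrors the behavior described above (unreserved bytes kept, every other byte percent-encoded) and is not the library's implementation.

```ruby
# Bytes left untouched: the unreserved set used in the spec
# (letters, digits, "-", ".", "_", "~").
UNRESERVED_BYTES = [*("A".."Z"), *("a".."z"), *("0".."9"), "-", ".", "_", "~"].map(&:ord).freeze

# Hypothetical helper: percent-encodes a String byte by byte,
# without looking at character boundaries or at the String's encoding.
def percent_encode_bytes(string)
  string.bytes.map { |byte|
    UNRESERVED_BYTES.include?(byte) ? byte.chr : format("%%%02X", byte)
  }.join
end

utf8  = "\u03B2"                 # "β" in UTF-8: bytes CE B2
utf16 = utf8.encode("UTF-16LE")  # the same character in UTF-16LE: bytes B2 03

puts percent_encode_bytes(utf8)   # %CE%B2
puts percent_encode_bytes(utf16)  # %B2%03
puts percent_encode_bytes("a b")  # a%20b
```

A spec for the non-ASCII-compatible case could follow the same shape: encode a short string to UTF-16 or UTF-32, then compare the escaped result against the percent-encoded bytes of that encoded string. Whether CGI.escapeURIComponent returns exactly these bytes for UTF-16 input is an assumption that should be checked against a real Ruby 3.2+ before being written into a spec.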