Trying to understand the Ruby .chr and .ord methods

Question

I recently worked with Ruby's chr and ord methods, and there are a few things I don't understand.

My current project involves converting individual characters to and from ordinal values. As I understand it, if I have a single-character string such as "A" and I call ord on it, I get its position in the ASCII table, which is 65. Calling the opposite, 65.chr, gives me the character "A". This tells me that Ruby has an ordered collection of characters somewhere, and that it can use this collection to give me either the position of a specific character or the character at a specific position. I may be wrong; please correct me if I am.
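The round trip described above can be sketched as follows. As far as I know there is no lookup table involved: ord returns the numeric code of the string's first character, and chr builds a one-character string from an integer. The variable names are only for illustration.

```ruby
letter = "A"
code = letter.ord      # numeric code of the first character
back = code.chr        # one-character string built from that integer

puts code              # 65
puts back              # A
puts back == letter    # true
```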

Now, I also understand that Ruby's default character encoding is UTF-8, so it can work with thousands of possible characters. So if I ask it something like this:

 '好'.ord

I get the position of this character, which is 22909. However, if I call chr on this value:

 22909.chr

I get "RangeError: 22909 out of char range". I can only get chr to work with values up to 255, which is the extended ASCII range. So my questions are:

Why does Ruby seem to get the values for chr from the extended ASCII character set, but ord from UTF-8?
Is there any way to tell Ruby to use a different encoding when calling these methods? For example, can I tell it to use ASCII-8BIT instead of whatever it defaults to?
If the default encoding can be changed, is there a way to get the total number of characters available in the chosen set?

ruby encoding

Jonathon nordquist Jun 14 '16 at 19:49

2 answers

After working with this for a while, I realized that I can find the maximum character value for each encoding by running a binary search for the largest value that does not raise a RangeError.

 def get_highest_value(set)
   max = 10000000000
   min = 0
   guess = 5000000000

   while true
     begin
       # Raises RangeError when guess is not valid in the given encoding
       guess.chr(set)
       if min > max
         return max
       else
         min = guess + 1
         guess = (max + min) / 2
       end
     rescue
       if min > max
         return max
       else
         max = guess - 1
         guess = (max + min) / 2
       end
     end
   end
 end

The argument passed to the method is the name of the encoding to check.
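For comparison, here is a more compact sketch of the same binary search with a usage line. The names valid_code? and highest_code_point and the default upper bound are my own, hypothetical choices, not anything built into Ruby. The search assumes that accepted values sit below the limit and rejected ones above it, so the result is the highest accepted value rather than a count of characters.

```ruby
# Hypothetical helper, not part of Ruby: true if chr accepts n for the encoding.
def valid_code?(n, encoding)
  n.chr(encoding)
  true
rescue RangeError
  false
end

# Binary search for the largest integer that chr accepts.
# Assumes 0 is accepted and that everything above the limit is rejected.
def highest_code_point(encoding, upper = 0x7FFF_FFFF)
  low = 0
  high = upper
  while low < high
    mid = (low + high + 1) / 2   # round up so the loop always makes progress
    if valid_code?(mid, encoding)
      low = mid
    else
      high = mid - 1
    end
  end
  low
end

puts highest_code_point("UTF-8")   # 1114111 (0x10FFFF)
```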

Jonathon nordquist Jun 15 '16 at 5:22

Nabeel · Accepted Answer · 2016-06-14T21:53:39+0000

According to the Integer#chr documentation, you can pass an encoding as an argument to force the result to be UTF-8:

 22909.chr(Encoding::UTF_8) #=> "好"
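To see why the bare call fails, it may help to look at which encoding chr picks when no argument is given. As I understand it, values 0..127 come back as US-ASCII, values 128..255 as ASCII-8BIT, and anything larger needs an encoding, either passed in or taken from Encoding.default_internal (usually unset). A small sketch, with variable names chosen only for illustration:

```ruby
small  = 65.chr     # 0..127
medium = 200.chr    # 128..255

p small.encoding == Encoding::US_ASCII      # true
p medium.encoding == Encoding::ASCII_8BIT   # true

# Above 255 there is no single-byte fallback, so chr needs an encoding.
begin
  22909.chr
  puts "no error (Encoding.default_internal is probably set)"
rescue RangeError => e
  puts "RangeError: #{e.message}"
end

wide = 22909.chr(Encoding::UTF_8)
p wide.encoding == Encoding::UTF_8   # true
p wide.ord                           # 22909
```

So ord is not switching character sets; it reads the encoding from the string it is called on, while an integer carries no encoding and chr has to be told which one to use.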

To list all available encoding names:

 Encoding.name_list #=> ["ASCII-8BIT", "UTF-8", "US-ASCII", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "UTF-16", "UTF-32", ...]
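Any of those names can be handed to chr as a plain string, or looked up as an Encoding object first. A short sketch (variable names are just for illustration):

```ruby
by_name     = 22909.chr("UTF-8")            # encoding given by name
by_constant = 22909.chr(Encoding::UTF_8)    # encoding given as a constant

p by_name == by_constant                      # true
p Encoding.find("UTF-8") == Encoding::UTF_8   # true
p Encoding.name_list.include?("UTF-8")        # true
```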

A hacky way to count the valid characters in an encoding:

 2000000.times.reduce(0) do |x, i|
   begin
     i.chr(Encoding::UTF_8)
     x += 1
   rescue RangeError
     # i is not a valid character in this encoding; skip it
   end
   x
 end
 #=> 1112064
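The number 1112064 can be cross-checked with a bit of arithmetic: UTF-8 goes up to 0x10FFFF, but the 2048 surrogate values 0xD800..0xDFFF are not valid characters. The sketch below assumes that this gap is the only one, and accepted? is a hypothetical helper name.

```ruby
highest    = 0x10FFFF           # largest Unicode code point
surrogates = (0xD800..0xDFFF)   # reserved for UTF-16, rejected by chr

expected = highest + 1 - surrogates.size
puts expected                   # 1112064

# Hypothetical helper: true if chr accepts n in UTF-8.
def accepted?(n)
  n.chr(Encoding::UTF_8)
  true
rescue RangeError
  false
end

# Spot-check the edges of the gap instead of looping over every value.
p accepted?(0xD7FF)     # true
p accepted?(0xD800)     # false
p accepted?(0xDFFF)     # false
p accepted?(0xE000)     # true
p accepted?(0x110000)   # false
```

This gap is also why the highest valid value (1114111) is not the same thing as the number of valid characters.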
