National (non-Arabic) numbers in Unicode?

I know that Unicode contains the characters of most of the world's alphabets ... but what about numbers? Are they part of Unicode or not? I could not find a direct answer. Thanks

+6
unicode
6 answers

As already mentioned, the Indo-Arabic numerals (0, 1, ..., 9) are included in Unicode, inherited from ASCII. If you are asking about the digits used in other scripts, the answer is still yes: they are also part of Unicode.

// numbers (0-9) in Malayalam (language spoken in Kerala, India)
൦ ൧ ൨ ൩ ൪ ൫ ൬ ൭ ൮ ൯
// numbers (0-9) in Hindi (India national language)
० १ २ ३ ४ ५ ६ ७ ८ ९

You can use \p{N} or \p{Number} in a regular expression to match any kind of numeric character in any script, provided your regex engine supports Unicode properties.
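As a rough illustration in Python: the standard `re` module does not support `\p{N}`, so this sketch approximates it by checking the general category with `unicodedata`. The helper name `is_number_char` and the sample text are made up for the example. Note that `\d` in a Python `str` pattern only covers decimal digits, not Roman numerals or fractions.

```python
import re
import unicodedata


def is_number_char(ch: str) -> bool:
    # Approximates \p{N}: general category is Nd, Nl or No.
    return unicodedata.category(ch).startswith("N")


# Sample text (hypothetical): Malayalam 42, Devanagari 3,
# ROMAN NUMERAL EIGHT (U+2167), VULGAR FRACTION ONE HALF (U+00BD), ASCII 7.
text = "room \u0d6a\u0d68, floor \u0969, part \u2167, \u00bd cup, 7 days"

# Every character whose category is some kind of Number.
numeric_chars = [ch for ch in text if is_number_char(ch)]

# \d on a str pattern matches decimal digits (category Nd) from any script.
decimal_runs = re.findall(r"\d+", text)

print(numeric_chars)
print(decimal_runs)
```

Engines such as Java, .NET, Perl, or JavaScript with the `u` flag accept `\p{N}` directly, so no helper is needed there.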

This document (Page-3) describes Unicode codes for Malayalam digits.
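If the document is not at hand, the code points can also be listed programmatically. The following is a minimal Python sketch, assuming the Malayalam digits start at U+0D66 and the Devanagari digits at U+0966; the variable names are illustrative only.

```python
import unicodedata

# Build the ten digits of each script from their starting code points.
malayalam = "".join(chr(0x0D66 + i) for i in range(10))
devanagari = "".join(chr(0x0966 + i) for i in range(10))

for ch in malayalam + devanagari:
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")

# Python's int() accepts decimal digits from any script.
malayalam_value = int(malayalam[1] + malayalam[2] + malayalam[3])
devanagari_value = int(devanagari[4] + devanagari[5] + devanagari[6])
```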

+10

In short: yes, of course. Unicode has three general categories that contain the various representations of digits and numbers:

  • Number, decimal digit (Nd) - for example, Arabic, Thai and Devanagari digits;
  • Number, letter (Nl) - for example, Roman numerals;
  • Number, other (No) - for example, fractions.
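
The three categories can be checked with Python's `unicodedata` module. This is only a sketch; the sample characters and their labels are chosen for illustration.

```python
import unicodedata

# One sample per script or kind (labels are just for printing).
samples = {
    "5": "ASCII digit five",
    "\u0665": "Arabic-Indic digit five",
    "\u0e55": "Thai digit five",
    "\u096b": "Devanagari digit five",
    "\u2164": "Roman numeral five",
    "\u00be": "fraction three quarters",
}

# General category (Nd, Nl or No) and numeric value of each sample.
categories = [unicodedata.category(ch) for ch in samples]
values = [unicodedata.numeric(ch) for ch in samples]

for (ch, label), cat, value in zip(samples.items(), categories, values):
    print(f"U+{ord(ch):04X}  {cat}  {value}  {label}")
```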
+3

Yes, they are code points 0030 to 0039, as you can see, for example, on decodeunicode.org

btw, code points 0000-007F are the same as ASCII (0-127; 128 and above is no longer ASCII), so everything you can find in ASCII can also be found in Unicode.
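
A quick way to convince yourself, sketched in Python (variable names are illustrative): the ten digits sit at code points 0x30 to 0x39, and any text made only of the first 128 code points encodes to the same bytes in ASCII and UTF-8.

```python
digits = "0123456789"

# Code points of the ASCII digits.
code_points = [ord(ch) for ch in digits]

# All 128 ASCII characters, encoded both ways.
all_ascii = "".join(chr(cp) for cp in range(128))
same_bytes = all_ascii.encode("ascii") == all_ascii.encode("utf-8")

print([f"U+{cp:04X}" for cp in code_points])
print(same_bytes)
```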

+1

Unicode code points below 128 are exactly the same as ASCII, so yes, the digits are at U+0030 through U+0039 inclusive.

+1

Yes, I think so. Information taken from here:

 U+0030  0  30  DIGIT ZERO
 U+0031  1  31  DIGIT ONE
 U+0032  2  32  DIGIT TWO
 U+0033  3  33  DIGIT THREE
 U+0034  4  34  DIGIT FOUR
 U+0035  5  35  DIGIT FIVE
 U+0036  6  36  DIGIT SIX
 U+0037  7  37  DIGIT SEVEN
 U+0038  8  38  DIGIT EIGHT
 U+0039  9  39  DIGIT NINE
+1

You can answer this question yourself: if they were not part of Unicode, that would significantly reduce the usefulness of Unicode, right?

Basically, any text that uses numbers could then not be represented using Unicode code points alone. (This assumes that you do not switch between different character encodings within the same text: I do not know of any software or programming language that supports this, and for good reason.)

If questions like this come up, you need to read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" by Joel Spolsky. No kidding. Go read it.

+1