Looking for a gem with normalizers (NFD, NFKD, NFC, NFKC) for JRuby 1.8.2 (native implementation)

Is there a native gem (so that it can be used with JRuby 1.8.2) that implements the UTF-8 normalizers (NFD, NFKD, NFC, NFKC)?

1 answer

Ruby v1.8 is really shaky on Unicode. I consider v1.9 the minimum version of Ruby for sane processing. Even then, the unicode_utils gem, for v1.9.1 or better, is absolutely necessary. It has things like full case mapping and normalization functions. You really need it.
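
To make the four forms concrete, here is a minimal sketch. It assumes a much newer Ruby than the ones discussed here (2.2 or later), where `String#unicode_normalize` is built in; on v1.9 the unicode_utils gem provides the equivalent functions. The helper names `codepoints_of` and `all_forms` are made up for this illustration.

```ruby
# Sketch only: needs Ruby 2.2 or later for String#unicode_normalize.

# Hypothetical helper: render a string as a list of U+XXXX code points.
def codepoints_of(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
end

# Hypothetical helper: apply all four normalization forms to one string.
def all_forms(str)
  [:nfd, :nfc, :nfkd, :nfkc].each_with_object({}) do |form, result|
    result[form] = str.unicode_normalize(form)
  end
end

composed = "\u00E9" # e with acute accent as one precomposed code point
ligature = "\uFB01" # "fi" ligature, a compatibility character

# NFD/NFKD split the accented letter into base letter plus combining mark;
# NFC/NFKC put it back together.
all_forms(composed).each { |form, s| puts "#{form}: #{codepoints_of(s)}" }

# Only the compatibility forms (NFKD/NFKC) replace the ligature with "f" + "i".
all_forms(ligature).each { |form, s| puts "#{form}: #{codepoints_of(s)}" }
```

The canonical forms (NFD, NFC) only change how the same character is encoded, while the compatibility forms (NFKD, NFKC) also fold away distinctions such as ligatures, so they are not interchangeable.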

Unfortunately, it does not include collation, so you cannot do alphabetic sorts in Ruby the way you can in Perl or in languages with access to the ICU libraries. Collation is the hardest part to get right, so it is not surprising that it is missing. But it matters, because it underlies almost everything we ever do with text. It is not just about sorting; it is about simple string comparisons. Most people are not aware of this.
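
A small sketch of why this bites even for plain comparisons. Again this assumes Ruby 2.2 or later for `String#unicode_normalize`, and `canonically_equal?` is a hypothetical helper, not part of any library. It only shows the problem; it is not a substitute for real collation.

```ruby
# Sketch only: needs Ruby 2.2 or later for String#unicode_normalize.

# Hypothetical helper: compare two strings after canonical normalization.
def canonically_equal?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

composed   = "\u00E9"  # e with acute accent, one code point
decomposed = "e\u0301" # "e" followed by a combining acute accent

# The two strings look identical on screen, but == compares the raw bytes.
puts composed == decomposed                   # false
puts canonically_equal?(composed, decomposed) # true

# The default sort is also byte order, not alphabetical order:
# uppercase sorts before lowercase, and accented letters sort after "z".
words = ["zebra", "Zulu", "apple", "\u00E9clair"]
puts words.sort.inspect
```

Normalizing first fixes the equality test, but the sort order still is not what a dictionary would use (apple, éclair, zebra, Zulu); getting that right is the job of a collation library.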

I talk about Ruby's Unicode support, and what you can do to make your life easier, in my third OSCON talk from a couple of weeks ago. I admit I gave up on Ruby v1.8; it was too complicated.

This is not a knock against Ruby, because the same can be said of most modern languages that are not at their latest versions.

  • If you are not using v1.9, you will be unhappy with Ruby and Unicode.
  • If you are not using Python v3 (and preferably v3.2 or probably v3.3) with a wide build, you will be unhappy with Python and Unicode.
  • If you are not using Java v1.7, you will be unhappy with Java and Unicode, and perhaps even then. :(
  • If you are not using Perl v5.14 or higher, you may be unhappy with Perl and Unicode.

So the situation with those four is quite different from the situation with PHP, JavaScript, and Go. With these last three languages it does not matter which version you use, because

  • With the first two, you will always be unhappy with their Unicode support. This is really awful, because people who use them can almost never switch to a real language with real Unicode support. The niche is too specialized.
  • Whereas with Go, you will never be unhappy with its Unicode support, provided you are not in a hurry: the normalization module is very close to being ready and working, while the collation module is still being worked on, since it is actually much more complicated.

Is there any possible way you can use Ruby v1.9?
