How to use Camomile for UTF-8 strings in OCaml?

I downloaded Camomile and installed it, and I am ready to use it.

The question is: how do I use it?

In OCaml, for a default string, I just write let s = "a string";;

But what about Camomile?

For example, if I want to build the UTF-8 string こんにちは (a Japanese greeting copied from Google Translate), how can I do this with Camomile?


Edit:

It's funny that people say OCaml cannot support UTF-8, but I tried this code:

 let s = "你好";;
 let _ = print_string s; print_string "\n";;

It worked in OCaml. But why? 你好 is Chinese, so how can OCaml print and process it if everyone says OCaml 4.00.1 cannot handle UTF-8?

3 answers

Here is a short introduction to the different players involved:

  • ASCII is a set of characters (there are 128 of them) and a code for representing them (7 bits).

  • Unicode is a set of characters (far more than ASCII has).

  • UTF-8 is an encoding that represents Unicode characters as sequences of bytes.

  • Your terminal. It interprets the bytes output by your program as UTF-8 encoded characters and displays the corresponding Unicode characters.

  • OCaml handles sequences of bytes (OCaml calls the element type char, but that name is misleading; byte would be more appropriate).

So, if OCaml prints a sequence of bytes corresponding to the UTF-8 code for "你好" , your terminal will interpret it as utf-8 and print 你好 . But for OCaml, "你好" is just a sequence of 6 bytes.
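The point about six bytes can be checked directly. The following is a minimal sketch using only the standard library, assuming the source file itself is saved as UTF-8; the names byte_length and byte_values are just illustrative.

```ocaml
(* "你好" is two Unicode characters, but OCaml sees six bytes. *)
let s = "你好"

(* String.length counts bytes, not characters. *)
let byte_length = String.length s

(* The numeric value of each byte in the string. *)
let byte_values = List.init (String.length s) (fun i -> Char.code s.[i])

let () =
  Printf.printf "bytes: %d\n" byte_length;
  List.iter (Printf.printf "%d ") byte_values;
  print_newline ()
```

Each of the two characters takes three bytes in UTF-8, which is where the six comes from.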


TörökEdwin told you everything you need to know, I think. UTF-8 was specifically designed as a way to store Unicode values (code points) in a series of 8-bit bytes, so that it works with code written to handle ASCII C strings. Since OCaml strings are a series of 8-bit bytes, there is no problem storing a UTF-8 value. If the program you use to create your OCaml source handles UTF-8, then it is not difficult to create a string containing a UTF-8 value. You do not need to do anything special to make this happen. (As I said, I have done this many times myself.)

If you do not need to process the value, then the OCaml I/O functions can also write such a value (or read it), and if the encoding of your display is UTF-8 (which is what I use), it will be displayed correctly. But most often you will need to process your values. If you change your code to (for example) print the length of the string, you can begin to see why you need a special library to handle UTF-8.

If you're wondering why a particular Unicode string is represented as a specific series of bytes in UTF-8, you just need to read up on UTF-8. The Wikipedia article (UTF-8) may be a reasonable place to start.
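To illustrate why processing needs more than the built-in string functions, here is a rough sketch that counts code points by hand. It assumes the input is valid UTF-8 and does no validation; utf8_length is a hypothetical helper, not part of the standard library or of Camomile.

```ocaml
(* Count Unicode code points in a string assumed to hold valid UTF-8.
   Continuation bytes have the bit pattern 10xxxxxx; every other byte
   starts a new code point. No validation is performed. *)
let utf8_length (s : string) : int =
  let n = ref 0 in
  String.iter
    (fun c -> if (Char.code c) land 0xC0 <> 0x80 then incr n)
    s;
  !n

let () =
  let s = "你好" in
  (* String.length reports bytes; utf8_length reports code points. *)
  Printf.printf "bytes: %d, code points: %d\n"
    (String.length s) (utf8_length s)
```

A real library also has to deal with malformed input, normalization and combining characters, which this sketch ignores.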


You need a UTF-8 library only if you want to convert between different encodings, normalize Unicode, or access individual code points.

OCaml treats strings as sequences of 8-bit binary values of a specified length, so you can use any encoding directly, i.e. you can just assign a UTF-8 value to a variable:

 # let foo = "こんにちは";;
 val foo : string = "\227\129\147\227\130\147\227\129\171\227\129\161\227\129\175"
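
As for accessing individual code points, the following is only a hand-written sketch of what a UTF-8 library does internally. It assumes well-formed UTF-8 input and does not reject invalid sequences; utf8_code_points is a hypothetical name, and a library such as Camomile would normally be used instead.

```ocaml
(* Decode a string assumed to be valid UTF-8 into its code points. *)
let utf8_code_points (s : string) : int list =
  let len = String.length s in
  let byte i = Char.code s.[i] in
  let rec go i acc =
    if i >= len then List.rev acc
    else
      let b = byte i in
      (* The lead byte gives the sequence length and its payload bits. *)
      let size, init =
        if b < 0x80 then (1, b)
        else if b < 0xE0 then (2, b land 0x1F)
        else if b < 0xF0 then (3, b land 0x0F)
        else (4, b land 0x07)
      in
      if i + size > len then invalid_arg "utf8_code_points: truncated input";
      (* Each continuation byte contributes six more bits. *)
      let cp = ref init in
      for k = 1 to size - 1 do
        cp := (!cp lsl 6) lor ((byte (i + k)) land 0x3F)
      done;
      go (i + size) (!cp :: acc)
  in
  go 0 []

let () =
  (* Print each code point of the greeting in U+XXXX notation. *)
  List.iter (Printf.printf "U+%04X ") (utf8_code_points "こんにちは");
  print_newline ()
```

The 15 bytes shown in the toplevel output decode to five code points, one per kana.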