Is there a way to get the libc6 regex functions regcomp and regexec to work correctly with multibyte characters?
For example, if my pattern is the UTF-8 string 猫机+猫, a search in the UTF-8 encoded string 猫机机机猫 fails, where it should succeed.
I think this is because the byte representation of the character 机 is \xe6\x9c\xba, and the + applies only to the last byte, so it matches one or more \xba bytes rather than one or more 机 characters. I can make this case work by putting parentheses around each multibyte character in the pattern, but since this is for an application, I cannot require that of users.
Is there a way to have the pattern and the string treated as UTF-8 characters rather than bytes? Perhaps by telling libc to store the pattern as wchar_t instead of char?
regex glibc utf-8 libc
bill_e