Does Vim have an equivalent of \X to match Unicode "grapheme clusters"?

Unicode specifies that \X should match an “extended grapheme cluster”: for example, a base character followed by zero or more combining characters. (I realize this is a simplification, but it may be enough for my needs.)
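To make the simplified definition concrete, here is a minimal Python sketch (Python is used only for illustration, it is not part of the question). It groups each base character with the combining marks that follow it, using the standard library's `unicodedata.combining()`. The function name `simple_clusters` is hypothetical, and this is not the full extended grapheme cluster algorithm: it ignores Hangul syllables, emoji sequences, and marks whose combining class is 0.

```python
import unicodedata

def simple_clusters(text):
    """Split text into a base character plus any combining marks after it."""
    clusters = []
    for ch in text:
        # combining() returns a nonzero canonical combining class for
        # combining marks and 0 for everything else.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to the previous base
        else:
            clusters.append(ch)     # start a new cluster
    return clusters

# "e" + acute, "a" + diaeresis + dot below, then a plain "z"
print(simple_clusters("e\u0301a\u0308\u0323z"))
```

A combining mark at the very start of the text has no base to attach to, so in this sketch it becomes a cluster of its own.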

I am sure that at least Perl supports \X in its regular expressions.

But Vim defines \X to match any character that is not a hexadecimal digit.

Does Vim have an equivalent of \X, or any other way to match an extended Unicode grapheme cluster?

Vim has the concept of combining or "composing" characters, but its documentation does not say whether or how they are supported in regular expressions.

Vim doesn't seem to support this directly yet, but I'm still interested in a workaround: a search that highlights every character that carries a combining character, at least those in the main range from U+0300 to U+0364.
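To pin down what such a workaround search should match, here is a small Python sketch of the intended behaviour; it is an illustration outside Vim, not a Vim pattern. It uses the range from the question (the Combining Diacritical Marks block itself runs through U+036F), and the names `COMBINING`, `MARKED`, and `marked_spans` are hypothetical.

```python
import re

# Range as stated in the question; widen to \u036F to cover the whole block.
COMBINING = "\u0300-\u0364"

# One character outside the range, followed by one or more marks inside it.
MARKED = re.compile(f"[^{COMBINING}][{COMBINING}]+")

def marked_spans(text):
    """Return (start, end) code point offsets of each marked character."""
    return [m.span() for m in MARKED.finditer(text)]

# Decomposed "café naïve": the e and the i each carry one combining mark.
print(marked_spans("cafe\u0301 nai\u0308ve"))
```

Each span covers the base character together with its marks, which is the unit a highlight would need to cover.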

+8
vim regex unicode
2 answers

If your Vim installation is compiled with Perl support, you can run:

 :perldo s/\X/replacement/g 

I installed vim-nox on Debian (which includes Perl support), and matching \X with perldo really does work. I'm not sure it will do what you want, though, since \X also matches all normal characters, and it doesn't look like perldo helps you highlight anything in Vim.

Although this is not ideal, if you can get Perl support you can use Unicode blocks and categories. That means you can use \p{Block: Combining_Diacritical_Marks} or \p{Category: Nonspacing_Mark} to at least detect certain characters, although you still won't get highlighting.
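For readers without Perl at hand, the category test can be approximated in Python, swapped in here only as an illustration. The sketch below assumes that the Nonspacing_Mark property corresponds to general category `Mn`, which `unicodedata.category()` reports; the function name `nonspacing_marks` is hypothetical. Like the Perl approach, it only detects the marks and does nothing for highlighting in Vim.

```python
import unicodedata

def nonspacing_marks(text):
    """List (offset, code point, name) for each character in category Mn."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Mn"   # Mn = Mark, nonspacing
    ]

# "e" + combining acute, a space, "o" + combining circumflex
for hit in nonspacing_marks("e\u0301 o\u0302"):
    print(hit)
```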

+3

You can search for any character while ignoring composing characters with \Z, or you can search for a range of Unicode characters. See :help /\Z and :help /[] for more information on each.

The last post here may offer additional help:

http://vim.1045645.n5.nabble.com/using-regexp-to-search-for-Unicode-code-points-and-properties-td1190333.html

But Vim's regex syntax does not have Unicode property character classes like Perl's.

+3
