Emoji flags are represented by a pair of regional indicator symbols. I would like to write a Python regex that inserts spaces between the flags in a string of emoji flags.
For example, this line contains two Brazilian flags:
u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7"
It renders like this: 🇧🇷🇧🇷
I would like to insert a space after each pair of regional indicator symbols. Something like this:
re.sub(re.compile(u"([\U0001F1E6-\U0001F1FF][\U0001F1E6-\U0001F1FF])"), r"\1 ", u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7")
I would expect this to produce:
u"\U0001F1E7\U0001F1F7 \U0001F1E7\U0001F1F7 "
But this code gives me an error:
sre_constants.error: bad character range
A hint (I think) about what is going wrong is the following, which shows that \U0001F1E7 is being treated as two "characters" in the regular expression:
re.search(re.compile(u"([\U0001F1E7])"), u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7").group(0)
This leads to:
u'\ud83c'
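The u'\ud83c' result is consistent with a "narrow" Python 2 build, where each code point above U+FFFF is stored as a UTF-16 surrogate pair. Under that assumption, the character class in the first attempt is read code unit by code unit, so it contains the range \udde6-\ud83c, which runs backwards and would explain the "bad character range" error. The sketch below only illustrates the surrogate arithmetic; the helper name to_surrogate_pair is made up for this example.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

# REGIONAL INDICATOR SYMBOL LETTER B, the first half of the Brazilian flag
pair_b = to_surrogate_pair(0x1F1E7)
print([hex(unit) for unit in pair_b])  # ['0xd83c', '0xdde7']
```

The whole regional indicator range U+1F1E6 to U+1F1FF shares the high surrogate 0xD83C, with low surrogates running from 0xDDE6 to 0xDDFF.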
Unfortunately, my understanding of Unicode is too weak for me to make further progress.
EDIT: I am using Python 2.7.10 on a Mac.
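One possible workaround, assuming this interpreter is a narrow build (sys.maxunicode == 0xFFFF), is to spell out each regional indicator as its surrogate pair instead of using an astral range inside a character class. The following is a minimal sketch rather than a definitive fix; the names REGIONAL_INDICATOR, FLAG_PAIR and space_out_flags are invented for illustration.

```python
import re
import sys

if sys.maxunicode > 0xFFFF:
    # Wide build or Python 3.3+: one regional indicator is one character.
    REGIONAL_INDICATOR = u"[\U0001F1E6-\U0001F1FF]"
else:
    # Narrow build (assumed here): high surrogate \ud83c followed by a
    # low surrogate in the range \udde6-\uddff.
    REGIONAL_INDICATOR = u"\ud83c[\udde6-\uddff]"

# A flag is two consecutive regional indicator symbols.
FLAG_PAIR = re.compile(u"(" + REGIONAL_INDICATOR + REGIONAL_INDICATOR + u")")

def space_out_flags(text):
    """Append a space after every pair of regional indicator symbols."""
    return FLAG_PAIR.sub(u"\\1 ", text)

spaced = space_out_flags(u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7")
```

Checking sys.maxunicode keeps the same script usable on wide builds and on Python 3, where the original character class compiles without error.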