Regex for the ;) smilie

A small annoyance my users have discovered is that if they use a smiley such as >_> right before a closing parenthesis (like this: >_>), the text is run through htmlspecialchars() during processing, which turns it into &gt;_&gt;) - you can see the problem, I think. The ;) at the end is then replaced by the "Wink" emoticon.

Can someone give me a regex that replaces ;) with the smilie, but only if the ; is not the end of an HTML entity? (I'm sure this involves lookbehind, but I can't figure out how to use it >_>)

Thanks!

+4
6 answers

Processing emoticons like ;) is always a bit tricky. The way I would do it is to convert them to a "canonical" form such as :wink: before encoding HTML entities, and then replace only the canonical :{smileyname}: forms with emoticons.
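A minimal sketch of that idea in PHP, assuming a small hypothetical smiley table (the function name, the :shifty: token and the image file names are made up for illustration):

```php
<?php
// Hypothetical smiley table; token names and image paths are assumptions.
function renderSmilies(string $raw): string
{
    $toCanonical = [';)' => ':wink:', '>_>' => ':shifty:'];
    $toHtml = [
        ':wink:'   => '<img src="wink.png" alt="wink">',
        ':shifty:' => '<img src="shifty.png" alt="shifty">',
    ];

    // 1. Replace raw smilies with canonical tokens before any escaping.
    $text = strtr($raw, $toCanonical);

    // 2. Escape HTML. The tokens contain no special characters, so they survive.
    $text = htmlspecialchars($text);

    // 3. Replace only the canonical tokens with markup.
    return strtr($text, $toHtml);
}
```

Because >_> has already become :shifty: by the time htmlspecialchars() runs, no &gt; followed by ) is ever produced, so a stray ;) cannot appear.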

+6

Example: (?<!&[a-zA-Z0-9]+);\)

(?<!...) is a zero-width negative lookbehind assertion: it allows the construct that follows to match only where the text is not preceded by ...
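One caveat if this is used from PHP: PCRE, which the preg_* functions use, rejects an unbounded quantifier such as + inside a lookbehind, so the pattern above may fail to compile there. A possible workaround, which is a different technique from the lookbehind, is to match whole entities first and discard them with the (*SKIP)(*FAIL) backtracking verbs. The function name and the '[wink]' placeholder below are assumptions for illustration:

```php
<?php
// Hypothetical helper; '[wink]' stands in for the real smiley markup.
function replaceWinkSkippingEntities(string $escaped): string
{
    // First alternative: consume a whole entity, then (*SKIP)(*FAIL) throws
    // the match away and resumes scanning after the entity.
    // Second alternative: a ";)" that is not the tail of an entity.
    $pattern = '/&#?[a-zA-Z0-9]+;(*SKIP)(*FAIL)|;\)/';

    return preg_replace($pattern, '[wink]', $escaped);
}
```

The input is expected to be text that has already gone through htmlspecialchars().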

+1

You should probably handle this along these lines, which completely eliminates the possibility of replacing replacements:

  • Split the string wherever there is an emoticon, converting the emoticons to tokens
  • HTML-escape all text nodes
  • Convert all smilie tokens to their HTML tag equivalents
  • Glue everything together.

This is a bit nontrivial. :)
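The steps above could be sketched in PHP roughly as follows. The smiley table, function name and image file names are assumptions for illustration, not part of the answer:

```php
<?php
// Hypothetical smiley table: text form => image name (names are assumptions).
function renderWithSmilies(string $text): string
{
    $smilies = [';)' => 'wink', ':)' => 'smile', '>_>' => 'shifty'];

    // Build an alternation of the regex-quoted smiley texts.
    $quoted = array_map(fn($s) => preg_quote($s, '/'), array_keys($smilies));
    $pattern = '/(' . implode('|', $quoted) . ')/';

    // Split on smilies but keep them: even indexes are text, odd are smilies.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Smilie token: emit its HTML tag equivalent.
            $out .= '<img src="' . $smilies[$part] . '.png" alt="' . htmlspecialchars($part) . '">';
        } else {
            // Text node: escape it only now, after the smilies were cut out.
            $out .= htmlspecialchars($part);
        }
    }

    return $out;
}
```

Since smilies are recognized on the raw input and text nodes are escaped afterwards, the output of htmlspecialchars() is never scanned for smilies.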

+1

Find: (&#?[a-z0-9]+;)\)
Replace: $1&#41;

We are looking for:

 Match the regular expression below and capture its match into backreference number 1 «(&#?[a-z0-9]+;)»
    Match the character "&" literally «&»
    Match the character "#" literally «#?»
       Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
    Match a single character present in the list below «[a-z0-9]+»
       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
       A character in the range between "a" and "z" «a-z»
       A character in the range between "0" and "9" «0-9»
    Match the character ";" literally «;»
 Match the character ")" literally «\)»

 Created with RegexBuddy
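For illustration, the same find-and-replace could be applied in PHP like this. It is a sketch with a made-up function name; it uses $1 (the captured entity) in the replacement so the original parenthesis is not kept, and the /i flag is an addition here so that entities with uppercase letters are also covered:

```php
<?php
// Hypothetical helper; run it on text that htmlspecialchars() already escaped,
// and before the smiley replacement step.
function escapeParenAfterEntity(string $escaped): string
{
    // $1 is the entity itself; the ")" after it becomes &#41;,
    // so the sequence ";)" no longer appears at that spot.
    return preg_replace('/(&#?[a-z0-9]+;)\)/i', '$1&#41;', $escaped);
}
```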
0

Well, if you are interested in a regex solution, maybe try this:

(! t) ([A-Za-z0-9] |);)

0

If you are in PHP (preg_replace, did you say?), you can use preg_replace_callback:

 preg_replace_callback('#(&[a-z0-9]+)?;\)#i', 'myFunction', 'myText'); 

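In the "myFunction" callback, you just need to check whether the capture group matched an HTML entity.

 function myFunction($matches) {
     // If the optional group matched, this ";)" is the tail of an HTML entity: leave it alone.
     if (!empty($matches[1])) {
         return $matches[0];
     }
     // Otherwise it is a real wink.
     return '[Smilie]';
 }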
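As a usage sketch, here is the same pattern run end to end on escaped text. The wrapper name replaceWink is hypothetical, and a closure is used in place of the named callback:

```php
<?php
// Hypothetical wrapper; the pattern is the one from the answer above.
function replaceWink(string $escaped): string
{
    return preg_replace_callback(
        '#(&[a-z0-9]+)?;\)#i',
        function (array $m): string {
            // Group 1 is set only when ";)" directly follows "&name".
            return !empty($m[1]) ? $m[0] : '[Smilie]';
        },
        $escaped
    );
}

echo replaceWink(htmlspecialchars('(like this: >_>) and a real wink ;)')), "\n";
// prints: (like this: &gt;_&gt;) and a real wink [Smilie]
```

Note that this pattern only recognizes named entities; numeric ones such as &#41; would need an extra optional \# in the group, which is left out here to stay close to the answer.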
0
