Insert CDATA into XML

I'm in a hurry right now, so I'm asking the regex masters for help! I receive XML through an HTTP request and cannot parse it, because it contains special characters that are not wrapped in CDATA sections.

XML example:

<root> <node>good node</node> <node>bad node containing &</node> </root> 

When I try to load this XML with simplexml_load_string($xml), I get:

 Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 3: parser error : xmlParseEntityRef: no name in /..../file.php on line ## 

Assuming that the bad nodes will not contain > or <, I need a regex that wraps the text of these nodes in CDATA sections. I suppose it will involve some sort of lookaround; I just can't work it out quickly.

Thanks!

+1
xml php regex
Nov 17 '11 at 3:24 a.m.
1 answer

If you can really assume that the nodes you want to CDATA-ize will not contain < or > characters, then this should work for your situation:

 >(?=[^<&]*&)([^<]*)< 

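Replace each match with the following (the leading > and trailing < put back the delimiters that the pattern consumes):

 ><![CDATA[\1]]>< 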

This expression matches only nodes whose text contains an & character (regardless of whether the & is part of an entity such as &amp;) and wraps the contents of those nodes in a CDATA section. If you need to ignore & characters that belong to entities, that is considerably more complicated, but I would be willing to take a look at it.
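
For illustration, here is a minimal PHP sketch of applying the pattern with preg_replace before parsing. The variable names are made up for this example, and the sample input assumes the XML from the question with a proper closing </root> tag. It also assumes the replacement puts the > and < back around the CDATA section.

```php
<?php
// Sample input from the question (assumed to end with </root>).
$xml = '<root> <node>good node</node> <node>bad node containing &</node> </root>';

// Match the text between '>' and the next '<', but only if it contains an '&'.
$pattern = '/>(?=[^<&]*&)([^<]*)</';

// Restore the consumed '>' and '<' around the new CDATA section.
$replacement = '><![CDATA[$1]]><';

$fixed = preg_replace($pattern, $replacement, $xml);

// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc = simplexml_load_string($fixed);

if ($doc === false) {
    echo "The XML is still not well-formed\n";
} else {
    // Print the text of the second <node>, now read from the CDATA section.
    echo (string) $doc->node[1], "\n";
}
```

Note that a node which already contains a valid entity such as &amp; is wrapped as well, so the entity would then be read as literal text rather than being decoded.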

+2
Nov 17 '11 at 15:38


