Which characters should not be set as values in an XML file?

I noticed that when I set a value in the XML file that contains the "&" character, the XML file does not open correctly.

I assume this is because XML values cannot contain just any character, and "&" is one that must not be set as a value in an XML file.

Please advise: are there more characters that should not be set as a value in XML, or is "&" the only one?

An example of a bad value in XML (it contains "&"):

<FolderPath>\EEA\E1\C & W 100\AWQ</FolderPath> 

An example of a valid string in XML:

  <FolderPath>\EEA\E1\C and W 100\AWQ</FolderPath> 
dom xml perl vbscript
2 answers

I would refer you to: http://www.w3schools.com/xml/xml_syntax.asp

Some characters have special meanings in XML.

If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the beginning of a new element.

There are 5 predefined entity references in XML:

 &lt;    <   less than
 &gt;    >   greater than
 &amp;   &   ampersand
 &apos;  '   apostrophe
 &quot;  "   quotation mark
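To illustrate with the FolderPath value from the question, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The helper name try_parse is made up for this example, and Python is only an assumption here, since the question does not say which parser is used.

```python
import xml.etree.ElementTree as ET

# The two values from the question: a raw "&" versus the escaped "&amp;".
bad = r"<FolderPath>\EEA\E1\C & W 100\AWQ</FolderPath>"
good = r"<FolderPath>\EEA\E1\C &amp; W 100\AWQ</FolderPath>"

def try_parse(xml_text):
    """Return the element text, or None if the XML is not well-formed.

    try_parse is a hypothetical helper for this illustration only.
    """
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None

bad_result = try_parse(bad)    # the raw "&" makes the parser reject the document
good_result = try_parse(good)  # "&amp;" is decoded back to "&" by the parser
```

The parser hands back the original "&" in the text, so escaping changes only how the value is written in the file, not the value the application reads.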

You can validate your XML with: http://www.w3schools.com/xml/xml_validator.asp


The safe thing to do is to always escape as follows:

 & ⇒ &amp;
 < ⇒ &lt;
 > ⇒ &gt;
 " ⇒ &quot;
 ' ⇒ &apos;
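The mapping above can be sketched in Python with xml.sax.saxutils.escape from the standard library. By default that function handles only &, < and >, so the two quote characters are passed in as extra entities. The wrapper name escape_xml is hypothetical, and Python itself is an assumption, not something the question specifies.

```python
from xml.sax.saxutils import escape

def escape_xml(value):
    """Escape all five special characters in a string destined for XML.

    escape() replaces "&" first, then ">" and "<", and only afterwards
    applies the extra entities, so the "&" inside "&quot;" and "&apos;"
    is not escaped a second time.
    """
    return escape(value, {'"': "&quot;", "'": "&apos;"})

escaped_path = escape_xml(r"\EEA\E1\C & W 100\AWQ")
```

The result can be placed in element text or in an attribute value delimited by either kind of quote.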

Detailed answer:

  • & is special in element text and attribute values. Use &amp; instead.
  • < is special in element text and attribute values. Use &lt; instead.
  • > is not special by itself.
  • " is special in attribute values delimited by " . Use &quot; instead.
  • ' is special in attribute values delimited by ' . Use &apos; instead [1].
  • ]]> is special in element text. Use ]]&gt; instead.
  • ]]> is special in CDATA sections. Use ]]>]]&gt;<![CDATA[ instead.
  • -- cannot be used in comments.
  • Only the following characters can appear in XML: 0x0009, 0x000A, 0x000D, 0x0020..0xD7FF, 0xE000..0xFFFD, 0x10000..0x10FFFF. No other characters can be included.
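Two of the points above, the allowed character ranges and the ]]> rule for CDATA sections, can be sketched in Python as follows. All function names here are hypothetical. Silently dropping disallowed characters is only one possible policy, chosen for the illustration; rejecting the input may suit you better.

```python
import re

# Matches any character outside the ranges listed above.
_INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def has_invalid_xml_chars(text):
    """Return True if text contains a character that XML cannot represent."""
    return _INVALID_XML_CHARS.search(text) is not None

def strip_invalid_xml_chars(text):
    """Drop disallowed characters (an assumed policy, not a requirement)."""
    return _INVALID_XML_CHARS.sub("", text)

def to_cdata(text):
    """Wrap text in a CDATA section, handling any "]]>" it contains.

    Each "]]>" closes the current section, is written as "]]&gt;" in
    plain element text, and a new CDATA section is opened after it.
    """
    return "<![CDATA[" + text.replace("]]>", "]]>]]&gt;<![CDATA[") + "]]>"
```

Note that escaping does not help with disallowed characters such as 0x0000: in XML 1.0 they cannot be written as character references either, so they have to be removed or encoded some other way before the value is stored.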

&amp; , &lt; , &gt; , &quot; and &apos; can always be used in element text and attribute values, even where escaping is not required, so the safe thing is to always escape & , < , > , " and ' . (Alternatively, always escape all of them except single quotes, and never use single quotes as the delimiter for attribute values.)


  [1] In HTML you should use &#39; instead. Old versions of Internet Explorer did not support &apos; for XHTML (XML-based HTML), so some people use &#39; for XML as well.
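As an alternative to escaping by hand, most XML libraries escape values for you when you build the document through their API instead of concatenating strings. A minimal sketch with Python's xml.etree.ElementTree, reusing the FolderPath element from the question (the variable names are made up for this example):

```python
import xml.etree.ElementTree as ET

# Build the element through the API and assign the raw, unescaped value.
folder = ET.Element("FolderPath")
folder.text = r"\EEA\E1\C & W 100\AWQ"

# The serializer escapes "&" (and "<", ">") in element text on output.
serialized = ET.tostring(folder, encoding="unicode")

# Parsing the serialized form returns the original value.
round_tripped = ET.fromstring(serialized).text
```

With this approach the code never handles &amp; directly, which avoids both forgetting to escape and escaping twice.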
