It looks like your original string contained the HTML entity for a double quote (&quot;), so when you sanitize it you only strip the & and the ;, leaving the "quot" behind in the string.
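Here is a minimal sketch of what I think is happening. The input string and the regex are assumptions on my part, since I can't see your exact sanitizer:

```php
<?php
// Hypothetical input: double quotes arrived HTML-encoded as &quot;
$encoded = 'Say &quot;hello&quot;';

// Stripping everything non-alphanumeric removes "&" and ";" (and the space),
// but the letters q-u-o-t are alphanumeric, so they survive.
$naive = preg_replace('/[^a-zA-Z0-9]+/', '', $encoded);

echo $naive, "\n"; // prints "Sayquothelloquot"
```

The stray "quot" is just the middle of the entity name that the character filter had no reason to remove.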
--- EDIT ---
Probably the easiest way to remove non-alphanumeric characters is to decode the HTML entities with html_entity_decode and then run the result through a regular expression. Since nothing that needs re-encoding will be left afterwards, you don't need to call htmlentities on the result, but it is worth remembering that you started with HTML-encoded data and now have raw, unencoded data.
For example:
    function string_sanitize($s) {
        // Decode entities first so that &quot; becomes " before the strip.
        $result = preg_replace("/[^a-zA-Z0-9]+/", "", html_entity_decode($s, ENT_QUOTES));
        return $result;
    }
Note that the ENT_QUOTES flag tells the function to "... convert both double and single quotes."
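To illustrate why the flag matters, here is a small sketch comparing ENT_COMPAT (which leaves single quotes alone) with ENT_QUOTES. The input string is a made-up example, not taken from your data:

```php
<?php
// Hypothetical input: an apostrophe encoded as the numeric entity &#039;
$encoded = 'it&#039;s';
$pattern = '/[^a-zA-Z0-9]+/';

// ENT_COMPAT does not decode single quotes, so the entity stays in place
// and the strip leaves its digits behind.
$withCompat = preg_replace($pattern, '', html_entity_decode($encoded, ENT_COMPAT));

// ENT_QUOTES decodes &#039; to ', which the strip then removes cleanly.
$withQuotes = preg_replace($pattern, '', html_entity_decode($encoded, ENT_QUOTES));

echo $withCompat, "\n"; // prints "it039s"
echo $withQuotes, "\n"; // prints "its"
```

Passing the flag explicitly also avoids depending on the default, which differs between PHP versions.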
Hamish