This should be simple, but I cannot figure it out.
The site is encoded in UTF-8.
A client had problems filling out the form on our website. Here is an example of the data they entered:
SPICER-SMITHS LOST
It looks like a regular string, but when you paste it into an application such as Notepad++, a "?" appears inside the word "SMITHS" ("SMITH?S").
The script sanitizes the field and performs an additional step to remove the following characters: "\r\n", "\n", "\r", "\t", "\0", "\x0B".
It did not catch this hidden character.
Does anyone know what is going on here?
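One way to check what the field really contains is to dump every character that is not printable ASCII. The following is a minimal sketch, not part of the site's code: `reveal_hidden_chars` is a hypothetical helper name, and the zero-width space (U+200B) in the usage example is only an assumed stand-in for whatever the hidden character actually is.

```php
<?php
// Hypothetical helper: report every character that is not printable ASCII,
// together with its raw UTF-8 bytes in hex.
function reveal_hidden_chars(string $str): array
{
    // Split the string into characters; the /u flag treats it as UTF-8.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split fails when the input is not valid UTF-8.
        return ['input is not valid UTF-8'];
    }

    $found = [];
    foreach ($chars as $i => $ch) {
        $isPrintableAscii = strlen($ch) === 1
            && ord($ch) >= 0x20
            && ord($ch) <= 0x7E;
        if (!$isPrintableAscii) {
            $found[] = sprintf('char %d: bytes %s', $i, bin2hex($ch));
        }
    }
    return $found;
}
```

For example, `reveal_hidden_chars("SMITH\u{200B}S")` reports one entry at character index 5 with bytes `e2808b`, which is the UTF-8 encoding of U+200B. Running it on the raw `$_POST` value, before any sanitizing, shows whether the character is in the submitted data at all.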
EDIT: I am using PHP. Here is the function I use to sanitize the field:
function strip_hidden_chars($str) {
    $chars = array("\r\n", "\n", "\r", "\t", "\0", "\x0B");
    $str = str_replace($chars, " ", $str);
    return preg_replace('/\s+/', ' ', $str);
}
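The list in `strip_hidden_chars` only covers a handful of ASCII control characters, so an invisible Unicode character such as a zero-width space would pass through untouched. Below is a rough sketch of a broader variant that matches by Unicode category instead of by a fixed list. The name `strip_invisible_chars` is made up for illustration, and the choice to delete format characters outright while turning control characters into spaces is an assumption about the desired behavior.

```php
<?php
// Hypothetical variant of strip_hidden_chars that matches Unicode categories.
function strip_invisible_chars(string $str): string
{
    // \p{Cf} = "format" characters, which are invisible (U+200B zero-width
    // space, U+FEFF byte order mark, U+00AD soft hyphen, ...). Delete them.
    $str = preg_replace('/\p{Cf}+/u', '', $str);
    if ($str === null) {
        // preg_replace returns null when the input is not valid UTF-8.
        return '';
    }

    // \p{Cc} = control characters (\r, \n, \t, \0, \x0B, ...).
    // Replace them with a space, as the original function does.
    $str = preg_replace('/\p{Cc}+/u', ' ', $str);

    // Collapse runs of whitespace into a single space.
    return preg_replace('/\s+/u', ' ', $str);
}
```

With this sketch, `strip_invisible_chars("SPICER-SMITH\u{200B}S\tLOST")` gives `SPICER-SMITHS LOST`. Note that the `Cf` category also contains the zero-width joiner and non-joiner, which are meaningful in some scripts and in emoji sequences, so removing the whole category may be too aggressive for fields that accept such text.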
EDIT 2: @thaJeztah led me to the answer. The string I tested was the output from our support ticket, after the client had copied and pasted it from whatever application they were using. The actual input was
SPICER-SMITHS