What is filtering invalid UTF-8 from my PHP site?
My site is fully converted to UTF-8 (MySQL, HTTP headers, PHP mbstring, etc.).
I'm doing some penetration testing and trying to POST invalid UTF-8 to one of the scripts (using Burp Suite).
But when I submit invalid UTF-8 and simply hex dump the $_POST variable, I see that the invalid sequence has already been sanitized before I even get to check it with mb_detect_encoding.
This sounds like good news to me, but I want to know which layer converts the POST data?
Is this a side effect of the Content-Type HTTP header? Is my web server (lighttpd) doing it? Or is PHP itself doing this when it populates $_POST?
I expected to see the invalid UTF-8 in the hex dump, leaving it to me to sanitize it myself.
PHP itself does not filter POST data. It simply treats it as binary data that is always "valid": it is just bytes, and nothing is checked.
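To see where the bytes change, it may help to dump them yourself and compare the raw request body with what ends up in $_POST. The following is only a rough sketch: the helper names (hex_dump_bytes, describe_input, compare_raw_and_parsed) are made up for illustration, the mbstring extension is assumed to be loaded, and mb_check_encoding is used here instead of mb_detect_encoding because it answers the yes/no validity question directly.

```php
<?php
// Hypothetical helper: render a byte string as space-separated hex pairs.
function hex_dump_bytes(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

// Hypothetical helper: report the raw bytes and whether they are valid UTF-8.
// Assumes the mbstring extension is available.
function describe_input(string $bytes): array
{
    return [
        'hex'        => hex_dump_bytes($bytes),
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
    ];
}

// Hypothetical helper for use inside a request handler: compare the raw body
// with one parsed field. php://input is not available for
// multipart/form-data requests, and for urlencoded forms the raw body is
// still percent-encoded (an invalid byte pair would show up as e.g. %C3%28).
function compare_raw_and_parsed(string $field): array
{
    $raw    = (string) file_get_contents('php://input');
    $parsed = $_POST[$field] ?? '';

    return [
        'raw_body' => $raw,
        'parsed'   => is_string($parsed) ? describe_input($parsed) : null,
    ];
}

// Stand-alone demonstration: 0xC3 must be followed by a continuation byte,
// and 0x28 is not one, so this string is not valid UTF-8.
$sample = "abc\xC3\x28";
print_r(describe_input($sample));
```

If the raw body still contains the invalid bytes but $_POST does not, the change is happening inside PHP (most likely an extension or an ini setting). If the raw body is already clean, something in front of PHP changed it, or the testing tool did not send what you think it sent.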
Therefore, I suspect that either a module in your web server is altering the data, or a PHP extension is filtering it.
Check whether a web application firewall is installed on your web server, and go through the list of extensions that PHP loads to see whether any of them is related to input filtering.
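A quick way to do the PHP side of that check is to print the loaded extensions together with a few ini settings that can alter incoming data. This is a minimal sketch: input_filtering_report is a made-up name, and the choice of settings to inspect (filter.default, filter.default_flags, mbstring.encoding_translation, mbstring.http_input) is my assumption about the likely suspects, not an exhaustive list.

```php
<?php
// Hypothetical helper: collect loaded extensions and a few ini settings
// that may influence how request data is filtered or converted.
function input_filtering_report(): array
{
    // Assumed list of settings worth checking; extend as needed.
    $names = [
        'filter.default',
        'filter.default_flags',
        'mbstring.encoding_translation',
        'mbstring.http_input',
    ];

    $settings = [];
    foreach ($names as $name) {
        $value = ini_get($name);
        // ini_get() returns false when the option does not exist,
        // for example because the owning extension is not loaded.
        $settings[$name] = ($value === false) ? '(not available)' : $value;
    }

    $extensions = get_loaded_extensions();
    sort($extensions);

    return ['extensions' => $extensions, 'settings' => $settings];
}

print_r(input_filtering_report());
```

Run it through the same SAPI that serves the site (not just the command line), since the web and CLI configurations can load different ini files. A filter.default other than unsafe_raw, or mbstring.encoding_translation switched on, would be worth a closer look.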