I just don't know what to think anymore. It seems the people who made JavaScript went out of their way to allow it to be written in a million different ways, so that hackers can have a field day.
I finally got my whitelist working using the HTML Agility Pack. It should remove
<script></script>
since it's not on my whitelist, plus any onclick, onmouse, etc.
However, it now seems that you can also write JavaScript inside attribute values.
<IMG SRC="javascript:alert('hi');">
and since I allow SRC attributes, my whitelist cannot help me here. So I came up with the idea of going through all the valid attributes at the end and looking inside them.
That pass would find all my allowed attributes for each HTML tag (so src, href, etc.).
Then I would take the attribute's value and convert it to lowercase, and do an index check on that string for "javascript".
If an index was found, I would start at that index and remove every character from there on. So in the case above, the attribute would be left as src="".
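To make the idea concrete, here is a minimal sketch of the check I described, using only the standard library. The class and method names are made up for illustration, and the case-insensitive IndexOf stands in for lowercasing the value first.

```csharp
using System;

static class NaiveAttributeFilter
{
    // Hypothetical helper: cuts the attribute value off at the first
    // occurrence of "javascript", ignoring case.
    public static string StripFromJavascript(string attributeValue)
    {
        // OrdinalIgnoreCase gives the same result as lowercasing and then searching.
        int index = attributeValue.IndexOf("javascript", StringComparison.OrdinalIgnoreCase);

        // Nothing found: leave the value alone.
        if (index < 0)
        {
            return attributeValue;
        }

        // Keep only what came before the match.
        return attributeValue.Substring(0, index);
    }
}
```

With this, StripFromJavascript("javascript:alert('hi');") gives an empty string, but a value such as "jav ascript:alert('XSS');" comes back untouched, which is exactly the problem.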
Now it seems that this is not good enough, since you can do something like
java script jav ascript
and you can probably even put a space between each letter.
So I don't know how to stop this. If it were just a space between "java" and "script", I could write a simple regular expression that didn't care how many spaces were between them. But if it is true that you can put a space, a tab, or something else after every letter, then I have no idea.
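For reference, this is roughly the simple regular expression I had in mind. It is only a sketch of the "whitespace between java and script" case, with hypothetical names, and it is not a real fix.

```csharp
using System;
using System.Text.RegularExpressions;

static class SpacedJavascriptCheck
{
    // Matches "java" followed by any amount of whitespace
    // (spaces, tabs, line breaks) and then "script", ignoring case.
    private static readonly Regex JavaThenScript =
        new Regex(@"java\s*script", RegexOptions.IgnoreCase);

    // Hypothetical helper: true if the attribute value contains that pattern.
    public static bool LooksLikeJavascript(string attributeValue)
    {
        return JavaThenScript.IsMatch(attributeValue);
    }
}
```

This catches "java script:alert(1)" and "JAVA   SCRIPT:alert(1)", but it returns false for "jav ascript:alert('XSS');", because the whitespace there is in the middle of "java".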
Then, to top it off, there are all these other wonderful ways to do it.
<IMG SRC=javascript:alert('XSS')> // will work apparently
<IMG SRC=javascript:alert('XSS')> // will work apparently
<IMG SRC="jav ascript:alert('XSS');"> // will work apparently
<IMG SRC="jav&#x09;ascript:alert('XSS');"> // will work apparently
<IMG SRC="jav&#x0A;ascript:alert('XSS');"> // will work apparently
<IMG SRC="jav&#x0D;ascript:alert('XSS');"> // will work apparently
http://ha.ckers.org/xss.html
I know that list is about cross-site scripting attacks (I'm not dealing with XSS myself, since ASP.NET MVC already does a good job of that), but I don't see why the same tricks couldn't be used for other things. All of those examples just pop up alerts, so they could be made to do something else.
So, I have no idea how to check and remove any of them.
I am using C#, but I don't know how to stop any of these, and I don't know of anything in C# that could help me.