How to avoid double URL encoding when rendering URLs on my website?

Users provide both properly escaped URLs and raw URLs to my website in text format; for example, I consider these two URLs equivalent:

https://www.cool.com/cool%20beans
https://www.cool.com/cool beans

Now I want to display them as <a> tags later, when viewing this data. I am stuck between encoding this text and getting these links:

 <a href="https://www.cool.com/cool%2520beans"> <!-- This one is broken! -->
 <a href="https://www.cool.com/cool%20beans">

Or not encoding it and getting the following:

 <a href="https://www.cool.com/cool%20beans">
 <a href="https://www.cool.com/cool beans"> <!-- This one is broken! -->

What is the best approach from the user's point of view with modern browsers? I am torn between running a decoding pass over their input first and the second option mentioned above, where we do not encode the href attribute.

+4
1 answer

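If you want to avoid double-encoding links, you can run urldecode() on both links and then urlencode() the result. Decoding a URL like https://www.cool.com/cool beans returns the same value, whereas decoding https://www.cool.com/cool%20beans returns the version with a space. Either way you end up with an unencoded string that can then be encoded exactly once. Note that urlencode() applied to a whole URL also escapes the : and / characters and writes a space as +, so in practice the encoding step should be applied to each path segment, for example with rawurlencode(), which writes a space as %20.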
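A minimal sketch of the decode-then-encode idea, assuming only the path of an absolute URL needs normalizing. The function name normalizeUrlPath() and the splitting regex are hypothetical and not part of the question. It uses rawurldecode()/rawurlencode() rather than urldecode()/urlencode() so that a space becomes %20 and a literal + in the path is not turned into a space.

```php
<?php
// Hypothetical helper: decode each path segment once, then encode it once.
// The scheme, host, query string and fragment are passed through untouched.
function normalizeUrlPath(string $url): string
{
    // Split into "scheme://authority", path, and the rest (query/fragment).
    if (!preg_match('~^([a-z][a-z0-9+.-]*://[^/?#]*)([^?#]*)(.*)$~i', $url, $m)) {
        return $url; // Not an absolute URL: leave it as-is in this sketch.
    }

    // Encode per segment so the "/" separators themselves are never escaped.
    $segments = array_map(
        function (string $segment): string {
            return rawurlencode(rawurldecode($segment));
        },
        explode('/', $m[2])
    );

    return $m[1] . implode('/', $segments) . $m[3];
}

// Both inputs from the question end up with the same href.
echo normalizeUrlPath('https://www.cool.com/cool%20beans'), "\n"; // https://www.cool.com/cool%20beans
echo normalizeUrlPath('https://www.cool.com/cool beans'), "\n";   // https://www.cool.com/cool%20beans
```

One limitation of this approach: a raw URL whose path really contains a literal sequence such as %20 cannot be told apart from an already-encoded one, so it will be treated as encoded. The result should still go through htmlspecialchars() before being written into the href attribute.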

Alternatively, you can scan the URL for encoded characters, for example with the strpos() function:

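 // strpos() returns false when there is no match and may return 0 for a
 // match at the start, so compare strictly against false.
 if (strpos($url, "%20") !== false) {
     // Encoded character found
 }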

Ideally, you would scan for an array of common encoded characters rather than just "%20".
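As a rough sketch of that broader scan, the check below swaps the array of common sequences for a regular expression that matches any percent sign followed by two hexadecimal digits. The function name containsPercentEncoding() is hypothetical, and treating any such match as "already encoded" is an assumption that may not hold for every input.

```php
<?php
// Hypothetical helper: true if the string contains at least one
// percent-encoded byte such as %20 or %2F.
function containsPercentEncoding(string $url): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/%[0-9A-Fa-f]{2}/', $url) === 1;
}

var_dump(containsPercentEncoding('https://www.cool.com/cool%20beans')); // bool(true)
var_dump(containsPercentEncoding('https://www.cool.com/cool beans'));   // bool(false)
```

A URL that passes this check could be rendered as-is, and one that fails could be encoded first. Input that mixes both forms, such as an encoded space next to a raw one, would still need the decode-then-encode treatment.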

+10