Why is there a URL encoding for the ASCII character set?

W3Schools states:

URLs can only be sent over the Internet using the ASCII character set.

Why is there a URL encoding for ASCII characters such as a, b, c when they can be sent over the Internet without any URL encoding?

For example: why encode 'a' as '%61' when it can be sent as 'a'?

What are the possible reasons for encoding ASCII characters? The only reason I can think of is hackers trying to make their URLs as unreadable as possible in order to carry out XSS attacks.

+3
security url encoding ascii
Dec 31 '14 at 8:40
4 answers

STD 66, Percent-Encoding:

A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.

So percent-encoding is a kind of escape mechanism: some characters have a special meaning in URI components (→ they are reserved). If you want to use such a character without its special meaning, you percent-encode it.
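To illustrate the escape mechanism, here is a minimal Python sketch using the standard library's urllib.parse. The host example.com and the parameter name q are made up for the illustration; the point is only that a value containing "=" and "&" survives a round trip once it is percent-encoded, and does not when it is pasted in raw.

```python
from urllib.parse import quote, parse_qs, urlsplit

# Hypothetical data: a single query value that contains reserved characters.
value = "a=b&c"

# safe="" asks quote() to percent-encode every reserved character.
encoded = quote(value, safe="")

# example.com and the parameter name "q" are placeholders for this sketch.
url = "http://example.com/?q=" + encoded
raw_url = "http://example.com/?q=" + value

# With encoding, the parser recovers the original value.
decoded = parse_qs(urlsplit(url).query)["q"]

# Without encoding, "&" is treated as a parameter separator and the value is cut.
broken = parse_qs(urlsplit(raw_url).query)["q"]

print(encoded)
print(decoded)
print(broken)
```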

Unreserved characters (e.g. a, b, c, ...) can always be used directly, but it is also allowed to percent-encode them. Such URIs are equivalent:

URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.

Why is it allowed to percent-encode unreserved characters at all? The obsolete RFC 2396 says (emphasis mine):

Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.

I cannot think of an example of such a "context", but this sentence suggests that there may be some.
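A rough sketch of what this equivalence means in practice: decoding only the percent-encoded unreserved characters turns both spellings into the same string. The helper name decode_unreserved and the example URIs are assumptions made for this illustration, not something defined by the RFC.

```python
import re
import string

# Unreserved characters as listed in RFC 3986: letters, digits, "-", ".", "_", "~".
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def decode_unreserved(uri: str) -> str:
    """Decode percent-encoded octets only when they stand for unreserved characters."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        # Reserved or non-ASCII octets stay encoded (hex digits uppercased).
        return ch if ch in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)

plain = "http://example.com/abc?x=%26"
escaped = "http://example.com/%61%62%63?x=%26"

# Both spellings normalize to the same string; "%26" (an encoded "&") is left alone.
print(decode_unreserved(escaped) == decode_unreserved(plain))
```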

In addition, maybe some people / implementations simply encode everything (except for delimiters, etc.), so they don't need to check whether / which characters need percent-encoding in the corresponding component.
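Such an "encode everything" approach could look roughly like the following sketch. The function name encode_all is hypothetical, and UTF-8 is assumed as the byte encoding. It is applied to a single component value, so the delimiters around that value are left out of it.

```python
from urllib.parse import unquote

def encode_all(component: str) -> str:
    """Percent-encode every byte of a component, reserved or not."""
    return "".join(f"%{byte:02X}" for byte in component.encode("utf-8"))

encoded = encode_all("a=b")
print(encoded)

# A standard decoder still recovers the original text.
print(unquote(encoded))
```

No lookup table of reserved characters is needed, at the cost of longer and less readable URLs.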

+2
Jan 01 '14 at 4:47

URL encoding exists for the entire ASCII range because it was easier to define an encoding that works for all characters than one that works only for the set of characters with special meanings.

+3
Dec 31 '14 at 8:46

URL encoding allows characters that have a special meaning in a URL to be included in a segment without their special meaning. There are many examples, but the most common ones to encode are "?", "=" and "&".

+1
Dec 31 '14 at 8:43

URL encoding was designed so that it can encode any ASCII character.

Since = is encoded as %3d, ? as %3f, and & as %26, it makes sense that a is encoded as %61 and b as %62, because the hexadecimal number after % is the ASCII code of the character.
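The mapping can be checked with a few lines of Python. The helper name percent_code is made up for this sketch, which only handles single ASCII characters.

```python
def percent_code(ch: str) -> str:
    """Return '%' followed by the two-digit hex ASCII code of a single character."""
    return f"%{ord(ch):02x}"

for ch in "=?&ab":
    print(ch, percent_code(ch))
```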

+1
Dec 31 '14 at 17:31


