Should "@" and "%40" be treated as equivalent in URLs?

Follow-up to: Can I use the at symbol (@) inside a URL?

Based on the top-voted answer there, @ is not a reserved character in a URL's path (although it is in the host).

However, given an @ in the path, is the URL-encoded form interchangeable with it? In other words, is twitter.com/@user strictly equivalent to twitter.com/%40user ?

In practice they seem to be used interchangeably, but I am curious whether this is strictly the case (for example, AbC@gmail.com is technically different from abc@gmail.com , but almost everyone treats them the same way).

More broadly, when should a character and its URL-encoded version be treated the same, and when are they different (for example, example.com/path%2Fasdf is NOT the same as example.com/path/asdf )?

+2
url uri rfc1738
Apr 28 '16 at 17:59
1 answer

The URIs http://twitter.com/@user and http://twitter.com/%40user are not equivalent.




The URI standard is STD 66, which currently maps to RFC 3986 (which updates RFC 1738).

Section 6.2.2.2, Percent-Encoding Normalization, defines how to normalize percent-encoded URIs so that they can be compared for equivalence (after uppercasing the hexadecimal digits A - F , as defined in 6.2.2.1, Case Normalization).

It says:

[...] some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3.

The referenced Section 2.3 lists the unreserved characters:

  • ALPHA ( A - Z , a - z )
  • DIGIT ( 0 - 9 )
  • - . _ ~
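As a rough illustration of the normalization described above, here is a minimal Python sketch. The function name `normalize_percent_encoding` and the `UNRESERVED` constant are my own, not something defined by the RFC or a library; the sketch only handles the percent-encoding and hex-case steps, not the other normalizations in Section 6.2.2.

```python
import re
import string

# Unreserved characters per RFC 3986, Section 2.3 (the name is illustrative).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Matches one percent-encoded octet such as %40 or %7e.
_PCT_OCTET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(uri):
    """Decode percent-encoded unreserved characters and uppercase
    the hex digits of every octet that stays encoded."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char  # e.g. %7E -> ~
        return "%" + match.group(1).upper()  # e.g. %2f -> %2F
    return _PCT_OCTET.sub(fix, uri)


# "~" is unreserved, so both spellings normalize to the same string.
print(normalize_percent_encoding("http://example.com/%7euser"))
# http://example.com/~user

# "@" is not unreserved, so %40 is left encoded and the URIs stay distinct.
print(normalize_percent_encoding("http://twitter.com/%40user"))
# http://twitter.com/%40user
```

Comparing the normalized strings gives the equivalence the RFC describes for unreserved characters, while leaving encoded reserved characters such as %40 and %2F untouched.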

The specification also states that, even when no normalization is performed:

URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.

@ is not part of the unreserved set. It is part of the reserved set (Section 2.2), which says:

URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent.
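To see why reserved characters cannot simply be swapped with their encoded forms, here is a small sketch using Python's standard `urllib.parse`. The helper `path_segments` is a name I made up for illustration; it splits the path on literal "/" before decoding, which is one common way to process a path, not the only one.

```python
from urllib.parse import urlsplit, unquote


def path_segments(url):
    """Split the path on literal "/" first, then decode each segment."""
    raw_path = urlsplit(url).path
    return [unquote(segment) for segment in raw_path.split("/")]


# An encoded slash stays inside one segment; a literal slash is a delimiter.
print(path_segments("http://example.com/path%2Fasdf"))  # ['', 'path/asdf']
print(path_segments("http://example.com/path/asdf"))    # ['', 'path', 'asdf']

# In the authority, a literal "@" separates userinfo from the host,
# while "%40" does not act as a delimiter.
print(urlsplit("http://user@example.com/").hostname)    # example.com
print(urlsplit("http://user%40example.com/").hostname)  # user%40example.com
```

Note that `path_segments` returns the same result for /@user and /%40user, which is probably why the two look interchangeable in practice: many servers decode the path before routing. That is server behavior, though, not an equivalence guaranteed by the specification.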

+2
Apr 29 '16 at 13:35