URL-encoded non-ASCII string in a URL

When I enter a URL in the Firefox address bar, for example http://www.example.com/?query=Траливали , it is automatically encoded as http://www.example.com/?query=%D2%F0%E0%EB%E8%E2%E0%EB%E8 .

But a URL like http://www.example.com/#ajax_call?query=Траливали is not encoded.

Other browsers, such as IE8, do not encode the query at all.

Question: how can I determine (in PHP) whether the query is encoded, and how do I decode it?

I tried:

Test:

 // d() is a debug-output helper (not shown)
 // (%D2%F0%E0%EB%E8%E2%E0%EB%E8)
 $str = $_GET['str'];

 d('%D2%F0%E0%EB%E8%E2%E0%EB%E8' == $str);
 d('Траливали' == $str);
 d(urldecode($str));
 d(utf8_decode(urldecode($str)));
 d('%D2%F0%E0%EB%E8%E2%E0%EB%E8' == urlencode($str)); // !!!

Return:

 [false] [false] [false] ???? [true]

One workaround: http://www.example.com/Траливали/ , i.e. send the value as part of the URL path and handle it with mod_rewrite.

+4
7 answers

It is not converted because placing the query part of a URL after the fragment is invalid.

RFC 3986 defines a URI that consists of the following parts:

   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment

This order cannot be changed. Consequently,

 URL1: http://www.example.com/?query=#ajax_call 

will be processed correctly, and

 URL2: http://www.example.com/#ajax_call?query= 

will not. Looking at URL2 , IE actually handles the URL correctly: it detects the fragment as #ajax_call?query= and finds no query. The fragment always comes last and is never sent to the server.

IE will correctly encode the query component of URL1 because it recognizes it as a query.
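The difference between the two URL shapes can be checked with PHP's own parse_url(). This is a minimal sketch: the value abc is a placeholder added for illustration, and the ?? operator assumes PHP 7 or later.

```php
<?php
// Compare how the two URL shapes split into components.
// The value "abc" is a placeholder added for illustration.
$url1 = 'http://www.example.com/?query=abc#ajax_call';
$url2 = 'http://www.example.com/#ajax_call?query=abc';

$parts1 = parse_url($url1);
$parts2 = parse_url($url2);

// URL1: the query comes before the fragment, so both are recognized.
$query1    = $parts1['query'] ?? null;
$fragment1 = $parts1['fragment'] ?? null;

// URL2: everything after "#" belongs to the fragment, so no query exists.
$query2    = $parts2['query'] ?? null;
$fragment2 = $parts2['fragment'] ?? null;

var_dump($query1, $fragment1, $query2, $fragment2);
```

For URL2 the whole tail, including ?query=abc, ends up in the fragment, which the browser keeps to itself; nothing of it reaches $_GET.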

As for decoding in PHP, %D2 and similar sequences are automatically decoded in the variable $_GET['query'] . The $_GET variable was not populated properly because the query in URL2 did not conform to the standard.

One last thing: when executing 'Траливали' == $_GET['query'] , this will be true only if your PHP script itself is encoded in UTF-8 (and the browser sent the query as UTF-8). Your text editor should be able to tell you the encoding of your file.
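Why the comparison depends on encoding can be seen at the byte level. The sketch below assumes the mbstring extension is available and that the browser encoded the query as Windows-1251, which is what the %D2%F0... bytes from the question suggest. The word is written as explicit UTF-8 escape bytes so the result does not depend on how the script file is saved.

```php
<?php
// The percent-encoded value from the question, decoded once.
// (PHP performs this step itself when it fills $_GET.)
$received = rawurldecode('%D2%F0%E0%EB%E8%E2%E0%EB%E8');

// The same word spelled out as UTF-8 bytes.
$utf8Word = "\xD0\xA2\xD1\x80\xD0\xB0\xD0\xBB\xD0\xB8\xD0\xB2\xD0\xB0\xD0\xBB\xD0\xB8";

// The byte sequences differ, so a plain comparison fails.
$sameBytes = ($received === $utf8Word);

// Assumption: the received bytes are Windows-1251.
// Converting them to UTF-8 first makes the comparison succeed.
$converted = mb_convert_encoding($received, 'UTF-8', 'Windows-1251');
$sameText  = ($converted === $utf8Word);

var_dump($sameBytes, $sameText);
```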

+6
 rawurldecode($_GET['query']); 

but PHP should already have done this ;-)

Edit: you say that "nothing works". What exactly are you trying? If the text does not appear on screen as expected, for example with echo $_GET['query']; , the problem may be the encoding you declare for the page sent back to the browser.

Add the line

 header("Content-Type: text/html; charset=utf-8"); 

and see if it helps.

+2

How the fragment is encoded, unfortunately, depends on the browser:

Are fragment identifiers (hashes) encoded following the RFC-mandated URL escaping rules?
MSIE: NO
Firefox: PARTLY
Safari: YES
Opera: NO
Chrome: NO
Android: YES

As for which encoding the browser uses for international (read: non-ASCII) characters before converting them to %nn escape sequences, "most browsers deal with this by sending UTF-8 data by default for any text entered manually in the URL bar, and using the page encoding for all followed links" (same source).

+2

You can use UTF8::autoconvert_request() for this.

See http://code.google.com/p/php5-utf8/ for more information.

+1

URLs are limited to a specific set of ASCII characters. Characters that are not URL-safe must be percent-encoded (the %hh encoding you see). Some browsers automatically encode URLs typed into the address bar.
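As a small illustration of the %hh encoding, PHP's two built-in encoders can be compared on a value containing URL-special characters. The sample value is made up for this sketch.

```php
<?php
// A value containing characters that have special meaning in a URL.
$value = 'a b&c=d/e';

// rawurlencode() follows RFC 3986: a space becomes %20.
$raw = rawurlencode($value);

// urlencode() uses the form-encoding convention: a space becomes "+".
$form = urlencode($value);

// Each result decodes back to the original with its matching function.
$roundTrip = rawurldecode($raw) === $value && urldecode($form) === $value;

var_dump($raw, $form, $roundTrip);
```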

0

The answer is simple: the string is always encoded, as required by the HTTP standard.
What Firefox displays does not matter.

In addition, since PHP decodes the query string automatically, decoding is not required.

Note that '%D2%F0%E0%EB%E8%E2%E0%EB%E8' is in a single-byte encoding, so your page is probably served as Windows-1251; at least, that is what the HTTP header tells the browser.
AJAX, however, always uses UTF-8.

So you need to either use a single encoding (UTF-8) for all your pages, or distinguish AJAX calls from regular ones.
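If a script has to accept both kinds of request, one possible approach is to normalize every incoming value to UTF-8. The helper below is a hypothetical sketch, not a PHP built-in: the name to_utf8 and the Windows-1251 fallback are assumptions, and it requires the mbstring extension. The validity check is a heuristic, since some single-byte text can happen to be valid UTF-8. The rawurldecode() calls only simulate what PHP already does for $_GET.

```php
<?php
/**
 * Return $value as UTF-8 (hypothetical helper, not part of PHP).
 * Bytes that already form valid UTF-8 are returned unchanged;
 * anything else is assumed to be in $fallback and converted.
 */
function to_utf8(string $value, string $fallback = 'Windows-1251'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// Simulated address-bar request: single-byte (Windows-1251) bytes.
$fromAddressBar = to_utf8(rawurldecode('%D2%F0%E0%EB%E8%E2%E0%EB%E8'));

// Simulated AJAX request: the same word sent as UTF-8.
$fromAjax = to_utf8(rawurldecode('%D0%A2%D1%80%D0%B0%D0%BB%D0%B8%D0%B2%D0%B0%D0%BB%D0%B8'));

// Both inputs now hold identical UTF-8 bytes.
$same = ($fromAddressBar === $fromAjax);

var_dump($same);
```

Serving every page as UTF-8 removes the need for this guesswork, which is why a single encoding is the simpler of the two options.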

As for the fragment: do not rely on the fragment value to reach the server. Keep the value in a JS variable and use it twice: once to set the fragment and once to send it to the server using JSON.

0

RFC 1738 states that only alphanumerics, the special characters $-_.+!*'(), and the reserved characters ;/?:@=& (used for their reserved purposes) may appear unencoded in a URL. Everything else is encoded by the HTTP client, i.e. the web browser. You can call rawurldecode() regardless of whether PHP has already decoded the query string. Double decoding is harmless as long as the decoded value contains no literal % followed by two hex digits.
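What a second decoding pass does can be checked directly. In this sketch, with made-up sample values, ordinary text is unaffected by the extra pass, while a value that itself contains an encoded percent sign (%25) comes out different.

```php
<?php
// Ordinary value: the second pass finds nothing left to decode.
$plainOnce  = rawurldecode('Hello%20world');
$plainTwice = rawurldecode($plainOnce);

// Value whose own text contains an encoded percent sign (%25).
$trickyOnce  = rawurldecode('%2541'); // the literal text "%41"
$trickyTwice = rawurldecode($trickyOnce); // decoded again, now a different value

var_dump($plainOnce === $plainTwice, $trickyOnce === $trickyTwice);
```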

0
