Special characters and URLs

I am currently working on an application that pulls JSON data from the Blizzard Community API and parses it with PHP. Everything works fine until I come to a character with a special character in their name.

To pull out character data, I need to know the character's name and the realm they are on.

I pass the name and realm through the URL to the character page, and from there use this information to pull out the character data.

At this point, my URLs look like this:

http://localhost/guildtree/characters.php?realm=argent-dawn&name=Ankzu 

If I try to pull out the data for a character with an accent in the name, I get redirected to my error page, because the name is not treated as a valid character.

Only after I started rewriting the URLs did I find my problem. I am redirected to the error page because somewhere along the line the special characters are replaced with some really strange characters.

The following works with my new rewritten URLs:

  http://localhost/guildtree/argent-dawn/ankzu 

However, a character with a special character in the name causes an error message.

  http://localhost/guildtree/argent-dawn/notúk 

Results in the following error message:

"Not found

The requested URL /guildtree/argent-dawn/notÃºk was not found on this server."

As you can see, ú is replaced with Ãº, but when I copy and paste the URL, ú is displayed as %C3%BA.

I understand that ú appears as Ãº because the two bytes that encode ú in UTF-8 (C3 BA) are each being read as a separate single-byte character, which displays as Ãº.
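
The byte-level explanation above can be reproduced in a few lines of PHP. This is only a sketch: it assumes the mbstring extension is available and that the error page is being read as ISO-8859-1, and `misreadAsLatin1` is a hypothetical helper name, not something from the application.

```php
<?php
// "ú" (U+00FA) is stored as two bytes in UTF-8: 0xC3 0xBA.
$name = "not\u{00FA}k";

// Hypothetical helper: treat every byte of a UTF-8 string as one
// ISO-8859-1 character, then convert that reading back to UTF-8.
function misreadAsLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

echo bin2hex("\u{00FA}"), "\n";     // c3ba
echo misreadAsLatin1($name), "\n";  // notÃºk
echo rawurlencode($name), "\n";     // not%C3%BAk
```

So Ãº and %C3%BA are two views of the same two bytes: one is those bytes read in the wrong charset, the other is those bytes percent-encoded.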

I made sure all of my pages have this in their head section:

 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 

For my application to work properly, I need these special characters to come through correctly: ú should not turn into Ãº, it should actually be ú or %C3%BA.

The character name is retrieved from the URL just like this:

 $charName = $_GET['name']; 

Is it possible to encode $charName so that special characters are handled correctly?

I tried everything I could and searched on Google, but nothing worked.

Also, since I am using URL rewriting, what would be the rewriting rule for these special characters?

Here is my current rewrite rule:

  RewriteRule ^([a-zA-Z0-9_'-]+)/([a-zA-Z]+)$ characters.php?realm=$1&name=$2 [NC] 

I know that ([a-zA-Z]+) does not allow special characters at all; for now I am just trying to get the special characters to come through correctly. If I use ([a-zA-Z\ú]+), it works and the page displays as it should. But adding \ú to the rule seems like a very bad way to do this, and it does not always work for other accented characters.

Any help would be greatly appreciated. If you need more information, please ask.

Edit:

Changing my rewrite rule to the following allows me to get the information in order, but it creates a redirect loop for my CSS.

  RewriteRule ^([a-zA-Z0-9_'-]+)/([^/]+)$ characters.php?realm=$1&name=$2 [NC] 

For example, my CSS redirects to

 http://localhost/guildtree/css/error 

instead of

 http://localhost/guildtree/css/style2.css 

Update:

Through a few simple tests:

 $charName = $_GET['name'];
 $charNameTEST = utf8_encode($charName);

This does change the value, but when I apply it on my page, it still says:

"Not found

The requested URL /guildtree/argent-dawn/notÃºk was not found on this server."

I think the main problem now is with the URL rewriting, because the JSON data parses perfectly when names have accented characters. I just don't understand why the browser's address bar shows /guildtree/argent-dawn/notúk, but the server keeps trying to pull up /guildtree/argent-dawn/notÃºk.

3 answers

ú is not a valid character for a URL.

When you link to a character name, you must URL-encode it.

So the correct URL is:

 http://localhost/guildtree/argent-dawn/not%C3%BAk 

You should print it in PHP as:

 echo '<a href="http://localhost/guildtree/argent-dawn/'. urlencode($name) .'">Link</a>';
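
As a slightly fuller sketch of the same idea: `rawurlencode` may be a safer choice than `urlencode` for a path segment, since `urlencode` turns spaces into `+`, which only means "space" inside a query string. The helper name `characterUrl` and the hard-coded base URL are assumptions for illustration.

```php
<?php
// Hypothetical helper: build a character page URL, percent-encoding
// each path segment separately so "/" between segments stays literal.
function characterUrl(string $realm, string $name): string
{
    $base = 'http://localhost/guildtree/'; // assumed base URL
    return $base . rawurlencode($realm) . '/' . rawurlencode($name);
}

$url = characterUrl('argent-dawn', "not\u{00FA}k");

// Escape for the HTML attribute context before printing.
echo '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">Link</a>', "\n";
// <a href="http://localhost/guildtree/argent-dawn/not%C3%BAk">Link</a>
```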

I think this question may have your answer. I have not tried this myself, but from what I see, you need to rewrite your RewriteRule as:

 RewriteRule ^([a-zA-Z0-9_'-]+)/([a-zA-Z]+)$ characters.php?realm=$1&name=$2 [NC,B] 

The B flag makes mod_rewrite escape special characters in backreferences, so the value passed as name from $2 will be percent-encoded. Since you are not doing a redirect, the original Unicode character should still display in the URL.

You will also need to change the regular expression so that it matches Unicode characters. I'm not sure what that change would be.

There are also several descriptions of how Unicode characters work in URLs here.


For this to work correctly, you need to do two things.

First, add this to your .htaccess:

 AddDefaultCharset On
 AddDefaultCharset UTF-8
 AddCharset UTF-8 .tpl
 AddCharset UTF-8 .js
 AddCharset UTF-8 .css
 AddCharset UTF-8 .php

Secondly, change the part of the rewrite rule that should allow special characters to (.*), as follows:

  RewriteRule ^([a-zA-Z0-9_'-]+)/(.*)$ characters.php?realm=$1&name=$2 [NC] 

This causes some other pages to be redirected as well, but I am fixing that now.
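
Because (.*) accepts anything, it may be worth validating the name inside characters.php before using it. The following is a minimal sketch under assumptions: `isValidCharacterName` is a hypothetical helper, and the rule that a name consists only of letters (including accented ones) is an assumption, not something confirmed by the API.

```php
<?php
// Hypothetical helper: accept only well-formed UTF-8 made of letters.
function isValidCharacterName(string $name): bool
{
    if (!mb_check_encoding($name, 'UTF-8')) {
        return false; // not valid UTF-8 at all
    }
    // \p{L} = any letter, \p{M} = combining marks;
    // the /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/^[\p{L}\p{M}]+\z/u', $name) === 1;
}

// PHP has already percent-decoded the values in $_GET.
$charName = $_GET['name'] ?? '';

if (!isValidCharacterName($charName)) {
    // Reject here, e.g. show the error page instead of calling the API.
}
```

With a check like this, a request such as style2.css that happens to match the broad rewrite pattern is rejected as a name, while notúk is accepted.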
