Apache: escaped umlauts in the query string (URL) result in 403

I have a problem that I have never encountered before, and I think it has something to do with the Apache configuration, which I am not very good at.

First, there is a PHP script with a search form. The form is submitted via POST.

The result is a list of search results. Here the original search query is passed on as part of the URL, e.g. search.php?id=1234&query=foo. This also works, until umlaut characters are transmitted (äöüÄÖÜß ...).

As soon as I include umlauts in the search query, the first part, which passes the query string via POST, still works, but passing it (urlencoded) in the URL leads to a 403.

So:

  • search.php?id=1234&query=bar works
  • search.php?id=1234&query=b%E4r results in 403 (%E4 = "ä" ISO-8859-1 urlencoded)
  • search.php?id=1234&query=b%C3%A4r results in 403 (%C3%A4 = "ä" UTF-8 urlencoded)
  • sending umlauts via POST works
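For reference, the two encoded forms in the list above can be reproduced with PHP's urlencode(). This is only a minimal sketch: the variable names are made up for illustration, and the strings are written as byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// "bär" as raw bytes in each encoding.
$latin1 = "b\xE4r";      // ISO-8859-1: "ä" is the single byte 0xE4
$utf8   = "b\xC3\xA4r";  // UTF-8: "ä" is the two bytes 0xC3 0xA4

// urlencode() works on bytes, so the output depends on the input encoding.
echo urlencode($latin1), "\n"; // b%E4r
echo urlencode($utf8), "\n";   // b%C3%A4r
```

Both forms are validly percent-encoded, so the encoding step itself does not explain the 403.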

I converted the application from ISO-8859-1 to UTF-8, but that made no difference.

I also tested it on my local machine, where it works flawlessly, as expected.

Remote server setup (where it does not work):

Apache/2.2.12 (Ubuntu)
PHP Version 5.2.10-2ubuntu6.7, Suhosin Patch 0.9.7, via CGI/FastCGI

Local setup (where it works):

Apache/2.2.8 (Win32) PHP/5.3.5
PHP Version 5.3.5 via mod_php

Does anyone have an idea why the remote Apache/php-cgi does not accept urlencoded umlauts in the URL?

Additional info: I also tried creating a static file with an umlaut in its name, and both /t%C3%A4st.php and /täst.php are served without problems. täst.php?foo=täst does not work.

Note: ?foo=%28 (where %28 is "(") also works.

1 answer

Apache does not do the escaping; the browser does.

You need to use urlencode and urldecode to avoid problems with such characters.
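As a rough illustration of that advice, the sketch below builds the result URL with http_build_query(), which applies urlencode() to each value, and then decodes it again. The variable names and values are assumptions taken from the question's example, not from the real application, and this only shows the encoding round trip, not a confirmed fix for the 403.

```php
<?php
// Hypothetical values; in the question they come from the POSTed search form.
$id    = 1234;
$query = "b\xC3\xA4r"; // "bär" as UTF-8 bytes

// Build the query string; every value is percent-encoded for us.
$url = 'search.php?' . http_build_query(array('id' => $id, 'query' => $query), '', '&');
echo $url, "\n"; // search.php?id=1234&query=b%C3%A4r

// Decode it again. PHP already does this step when it fills $_GET,
// so calling urldecode() on $_GET values a second time is not needed.
parse_str(parse_url($url, PHP_URL_QUERY), $params);
var_dump($params['query'] === $query); // bool(true)
```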

Some browsers, such as the old Netscape, simply send the URL as it was typed, with 8-bit characters in it. Others, notably MSIE, encode the URL as UTF-8 before sending it to the web server, so an 8-bit character arrives as two bytes, each with the 8th bit set. Nothing in the request headers or anywhere else indicates that the URL is encoded in UTF-8.
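Because the request does not say which encoding was used, a script that wants UTF-8 has to guess. One possible heuristic, sketched below, is to keep the value if it is already valid UTF-8 and otherwise treat it as ISO-8859-1. The function name normalize_to_utf8 is hypothetical, the mbstring extension is assumed to be available, and the ISO-8859-1 fallback is an assumption that fits the question's setup rather than a general rule.

```php
<?php
// Hypothetical helper: return $value as UTF-8, guessing the source encoding.
function normalize_to_utf8($value)
{
    // Valid UTF-8 is passed through unchanged.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Assumption: anything else was sent as ISO-8859-1.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

// Both forms from the question end up as the same UTF-8 string.
var_dump(normalize_to_utf8(urldecode('b%E4r')) === "b\xC3\xA4r");    // bool(true)
var_dump(normalize_to_utf8(urldecode('b%C3%A4r')) === "b\xC3\xA4r"); // bool(true)
```

The guess can be wrong for ISO-8859-1 input that happens to form a valid UTF-8 sequence, so it is safer to control the encoding on the sending side where that is possible.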
