What is the correct encoding for query strings?

I am trying to send a request to a URL like "http://mysite.dk/tværs?test=æ" from an ASP.NET application, and I am having trouble getting the request encoded correctly. Or perhaps the query string is encoded correctly and the service I am connecting to just does not interpret it correctly.

I tried sending a request with different browsers and logging how they encode the request using Wireshark, and I get the following results:

 Firefox: http://mysite.dk/tv%C3%A6rs?test=%E6
 IE8: http://mysite.dk/tv%C3%A6rs?test=\xe6
 curl: http://mysite.dk/tv\xe6rs?test=\xe6

Firefox, IE, and curl all get the correct results from the service. Note that they encode the Danish special character "æ" differently in the query string.

When I submit the request from my ASP.NET application using HttpWebRequest, the URL is encoded as follows:

 http://mysite.dk/tv%C3%A6rs?test=%C3%A6

It encodes the query string the same way as the path, using UTF-8. The remote service does not understand this encoding, so I do not get the correct response.

For the record, "æ" (U+00E6) is %E6 in ISO Latin-1 (ISO-8859-1) and %C3%A6 in UTF-8.
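To make the two byte sequences concrete, here is a minimal sketch that percent-encodes every byte of a string in a chosen encoding. The class and method names (PercentEncodingDemo, PercentEncodeAll) are made up for illustration and are not part of .NET; a real URL encoder would leave unreserved ASCII characters unescaped.

```csharp
using System;
using System.Linq;
using System.Text;

static class PercentEncodingDemo
{
    // Hypothetical helper: percent-encodes EVERY byte of `text` in the given
    // encoding. Only meant to show which bytes each encoding produces.
    public static string PercentEncodeAll(string text, Encoding encoding)
    {
        return string.Concat(encoding.GetBytes(text).Select(b => "%" + b.ToString("X2")));
    }

    static void Main()
    {
        // "\u00E6" is "æ"; the escape avoids depending on the source file's encoding.
        Console.WriteLine(PercentEncodeAll("\u00E6", Encoding.GetEncoding("iso-8859-1"))); // %E6
        Console.WriteLine(PercentEncodeAll("\u00E6", Encoding.UTF8));                      // %C3%A6
    }
}
```

The Latin-1 output matches what Firefox sent in the query string, and the UTF-8 output matches what every client sent in the path.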

I could change the remote service to accept UTF-8 encoded query strings, but then the service would stop working from browsers, which I would rather avoid. Is there a way to tell .NET not to encode the query string as UTF-8?

I create a web request as follows:

var req = WebRequest.Create("http://mysite.dk/tværs?test=æ") as HttpWebRequest; 

But the problem seems to come from System.Uri, which is apparently used internally by WebRequest.Create:

 var uri = new Uri("http://mysite.dk/tværs?test=æ");
 // now uri.AbsoluteUri == "http://mysite.dk/tv%C3%A6rs?test=%C3%A6"
3 answers

I ended up changing my remote web service to expect the request to be encoded in UTF-8. That solves my immediate problem: the web service can now be called properly from both PHP and the .NET Framework.

However, the behavior in browsers is now odd. If you paste a URL such as "http://mysite.dk/tv%C3%A6rs?test=%C3%A6" into the browser and press Return, it works, and the browser even decodes the escaped characters and displays the location as "http://mysite.dk/tværs?test=æ". If you then reload the page (F5), it still works. But if I click in the location bar and press Return again, the query string is encoded as Latin-1 and the request fails.

For anyone interested, here is the old Firefox bug report about the problem: https://bugzilla.mozilla.org/show_bug.cgi?id=284474 (thanks @dtb)

So it seems that there is no good solution.

Thanks to everyone who helped though!


It looks like UrlEncode is being applied to the whole URL. That is not correct: paths and query strings are encoded differently, as you saw. What is doing the URI encoding, WebRequest?

You can build the different parts manually with UriBuilder, or encode them yourself using UrlPathEncode for the path and UrlEncode for the names and values of the query string.
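As a rough sketch of the manual approach, the helper below encodes the path with HttpUtility.UrlPathEncode (which uses UTF-8) and the query name and value with HttpUtility.UrlEncode and an explicit Latin-1 encoding. The class name Latin1QueryUrl and the Build method are hypothetical, the helper handles a single path segment and a single parameter only, and it assumes System.Web (HttpUtility) is available to the project. Whether HttpWebRequest then sends the pre-escaped query unchanged is an assumption worth confirming with Wireshark.

```csharp
using System;
using System.Text;
using System.Web; // HttpUtility

static class Latin1QueryUrl
{
    // Hypothetical helper: UTF-8 escapes for the path segment,
    // ISO-8859-1 escapes for the query string name and value.
    public static string Build(string authority, string pathSegment, string name, string value)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string path = HttpUtility.UrlPathEncode(pathSegment);
        string query = HttpUtility.UrlEncode(name, latin1) + "=" + HttpUtility.UrlEncode(value, latin1);
        return authority + "/" + path + "?" + query;
    }

    static void Main()
    {
        // "\u00E6" is "æ"; the escape avoids depending on the source file's encoding.
        string url = Build("http://mysite.dk", "tv\u00E6rs", "test", "\u00E6");
        Console.WriteLine(url);

        // Assumption: Uri keeps existing percent-escapes instead of re-encoding them.
        Console.WriteLine(new Uri(url).AbsoluteUri);
    }
}
```

Since the string passed on is already fully escaped, there are no raw non-ASCII characters left for System.Uri to encode as UTF-8.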

Edit:

If the problem is in the path and not in the query string, you can try enabling IRI support through web.config:

 <configuration>
   <uri>
     <iriParsing enabled="true" />
   </uri>
 </configuration>

The international characters in the path should then be left alone.
