String encoding between .NET and Java

I have a Silverlight client application that sends the string "including the characters ş ţ ă and â î" to a Java JAX-WS SOAP service.

Now, no matter what I do, I always get "including the characters ? ? ? and â î" on the other end ("â î" work, while the others do not).
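The pattern described here, where â and î survive but ş, ţ and ă do not, matches what happens when text passes through ISO-8859-1 at some point: the first two characters exist in Latin-1, the other three do not. The sketch below only reproduces that symptom in isolation; whether a Latin-1 step (for example a console encoding) is actually involved in this service is an assumption, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Latin1LossDemo {

    // Encode to ISO-8859-1 and decode again. String.getBytes replaces
    // characters that Latin-1 cannot represent with '?'.
    static String throughLatin1(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // s-cedilla, t-cedilla, a-breve, "and", a-circumflex, i-circumflex,
        // written as escapes so the source file encoding does not matter.
        String original = "\u015F \u0163 \u0103 and \u00E2 \u00EE";
        String result = throughLatin1(original);

        // Print code points instead of raw text so the console encoding
        // cannot hide what the string really contains.
        for (char c : result.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Printing code points rather than the string itself is a cheap way to check whether the data is already damaged or only the display is.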

I even tried HttpUtility.UrlEncode("ş ţ ă and â î") in Silverlight, but URLDecoder.decode(inputText, "UTF-8") in Java still gives me ? in place of those three characters.
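For reference, here is a minimal Java-only sketch of the percent-encoding round trip. It assumes the client percent-encodes the UTF-8 bytes of the string; the class and helper names are hypothetical. It shows that decoding with the same charset restores the text, while decoding the same input with a different charset does not.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlRoundTripDemo {

    // Percent-encode text using the named charset.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Percent-decode text using the named charset.
    static String decode(String text, String charset) {
        try {
            return URLDecoder.decode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The five special characters from the question, as escapes.
        String original = "\u015F \u0163 \u0103 and \u00E2 \u00EE";

        String encoded = encode(original, "UTF-8");
        System.out.println(encoded); // %C5%9F+%C5%A3+%C4%83+and+%C3%A2+%C3%AE

        // Same charset on both sides: the text comes back intact.
        System.out.println(decode(encoded, "UTF-8").equals(original));    // true

        // Mismatched charset: the bytes are reinterpreted and the text changes.
        System.out.println(decode(encoded, "UTF-16LE").equals(original)); // false
    }
}
```

If the UTF-8 round trip succeeds like this on the server, the decoded string itself is intact and the damage is happening somewhere after decoding.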

What's happening? Java strings should be encoded in UTF-8 by default, right? And the encoding in .NET is Unicode (actually UTF-16). But if I decode using Unicode or UTF-16 on the Java side, I get ? for ALL the special characters (â î included).

Any help is much appreciated!


[edit] I would really like to know what encoding is used on the Silverlight side, or to specify the encoding myself. The problem is that I can't work out where or how to do it: I created the client using Service References → Add Reference, where I specified the WSDL, and from there .NET did everything for me, generating the client class and the necessary events and functions. Here is the gist of my client:

    FooWildcardSOAPClient client = new FooWildcardSOAPClient();
    client.CallFooServiceCompleted += new EventHandler<CallFooServiceCompletedEventArgs>(client_CallFooServiceCompleted);
    client.CallFooServiceAsync(param1, HttpUtility.UrlEncode(inputString), args);

I looked at the automatically generated code, but could not determine where to specify the encoding.

And here is the Java side:

    @WebService(targetNamespace = "http://jaxwscalcul.org",
                name = "FooWildcardSOAP",
                serviceName = "FooWildcardService")
    @SOAPBinding(style = SOAPBinding.Style.DOCUMENT,
                 use = SOAPBinding.Use.LITERAL)
    public class FooWildcardServiceImpl {

        @WebMethod(operationName = "CallFooService", action = "urn:FooWildcardService")
        @WebResult(name = "result")
        public String getOutput(
                @WebParam(name = "FooServiceWSDL") String param1,
                @WebParam(name = "inputTextOrXML") String inputText,
                @WebParam(name = "otherArgsString") String[] otherArgs) {
            try {
                inputText = URLDecoder.decode(inputText, "UTF-16LE"); // ISO-8859-1
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
            System.out.println("\r\n\r\n" + inputText);
        }

[EDIT2] I used Fiddler, and I can see that the content on the wire is text/xml in UTF-8, and that the actual data, including the "ş ţ ă" characters that do not display in Java, DOES show up correctly on the wire.

Here are some pastes from Fiddler:

    Client:
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
    Accept-Language: en-GB,en-US;q=0.8,en;q=0.6,ro;q=0.4,fr-FR;q=0.2,de;q=0.2

    Entity:
    content-type: text/xml; charset=utf-8
1 answer

From Luther Blisset's answer to "UTF-16 != UTF-16":

In Java, getBytes("UTF-16") is big-endian.

In C#, Encoding.Unicode.GetBytes is little-endian.

On the Java side, try getBytes("UTF-16LE").
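The byte-order difference is easy to see by dumping the bytes Java produces for a single character. This is a small illustrative sketch (the class and helper names are hypothetical), using ş (U+015F) as the sample:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16ByteOrderDemo {

    // Encode text with the given charset and render the bytes as
    // space-separated uppercase hex, e.g. "FE FF 01 5F".
    static String hexOf(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u015F"; // s-cedilla

        // Java's "UTF-16" writes a big-endian byte-order mark, then big-endian data.
        System.out.println(hexOf(s, StandardCharsets.UTF_16));   // FE FF 01 5F
        // "UTF-16LE" writes the low byte first and no byte-order mark.
        System.out.println(hexOf(s, StandardCharsets.UTF_16LE)); // 5F 01
        System.out.println(hexOf(s, StandardCharsets.UTF_16BE)); // 01 5F

        // Little-endian bytes without a byte-order mark, read back as "UTF-16",
        // are assumed to be big-endian and turn into a different character.
        byte[] littleEndian = s.getBytes(StandardCharsets.UTF_16LE);
        String misread = new String(littleEndian, StandardCharsets.UTF_16);
        System.out.printf("U+%04X%n", (int) misread.charAt(0)); // U+5F01
    }
}
```

Naming the byte order explicitly ("UTF-16LE" or "UTF-16BE") on both sides avoids depending on either platform's default.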

For a detailed explanation, see big-endian and little-endian byte order.
