I am implementing a piece of software that works as follows:
I have a Linux server running a VT100 terminal application that displays text. My program connects to the server via telnet, reads the text, and parses parts of it into the relevant data. That data is then sent to a small client, launched by a web server, which outputs the data on an HTML page.
My problem is that some special characters, such as "åäö", are displayed as question marks (classic).
Background:
My program reads a stream of bytes using the Apache Commons TelnetClient. The byte stream is converted to a String, then the relevant pieces are extracted as substrings and joined back together with delimiter characters. After that, the new string is converted back to a byte array and sent over a Socket to the client launched by the web server. This client creates a String from the received bytes and prints it to standard output, which the web server reads and turns into HTML.
Step 1: byte [] → String → byte [] → [send to client]
Step 2: byte [] → String → [print]
Problem:
When I run my Java program on Windows, all characters, including "åäö", are displayed correctly on the resulting HTML page. However, if I run the program on Linux, all special characters are converted to "?" (question mark).
The web server and client are currently running on Windows (step 2).
Code:
The program basically works as follows:
My program:
byte[] data = telnetClient.readData();
StringBuffer buf = new StringBuffer();
for (byte b : data) {
buf.append((char) (b & 0xFF));
}
String text = buf.toString();
ServerSocket serverSocket = new ServerSocket(...);
Socket socket = serverSocket.accept();
serverSocket.close();
socket.getOutputStream().write(text.getBytes());
socket.getOutputStream().flush();
Client executed by web server:
Socket socket = new Socket(...);
byte[] data = readData(socket);
String output = new String(data);
System.out.println(output);
Assume that the synchronization between reads and writes works.
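To narrow this down, here is a minimal sketch that replays step 1 and step 2 with the charsets passed in explicitly rather than taken from the platform default. The class name CharsetRoundTrip and its method names are made up for this illustration, and the input bytes assume the terminal sends "åäö" as ISO-8859-1, which I have not verified.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Step 1 decoding as in my program: every byte becomes one char,
    // which is equivalent to decoding the bytes as ISO-8859-1.
    static String decodeLikeMyProgram(byte[] data) {
        StringBuilder buf = new StringBuilder();
        for (byte b : data) {
            buf.append((char) (b & 0xFF));
        }
        return buf.toString();
    }

    // Step 1 and step 2 combined. The two charsets stand in for the platform
    // defaults that text.getBytes() and new String(data) would pick up.
    static String roundTrip(byte[] telnetBytes, Charset sendCharset, Charset receiveCharset) {
        String text = decodeLikeMyProgram(telnetBytes);
        byte[] wire = text.getBytes(sendCharset);   // what goes over the socket
        return new String(wire, receiveCharset);    // what the client prints
    }

    public static void main(String[] args) {
        // Assumed input: a-ring, a-umlaut, o-umlaut encoded as ISO-8859-1.
        byte[] telnetBytes = {(byte) 0xE5, (byte) 0xE4, (byte) 0xF6};
        Charset cp1252 = Charset.forName("windows-1252");

        // Sender behaves like my Windows machine.
        System.out.println(roundTrip(telnetBytes, cp1252, cp1252));
        // Sender behaves like a JVM whose default charset is US-ASCII.
        System.out.println(roundTrip(telnetBytes, StandardCharsets.US_ASCII, cp1252));
    }
}
```

With US-ASCII on the sending side, getBytes replaces every character it cannot encode with the byte for "?", so the client receives "???" no matter how it decodes. If that is what happens on Linux, the characters are already lost in step 1, before anything reaches the Windows client.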
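Update:
I did some more digging. On Windows the default charset is "WINDOWS-1252", whereas on Linux "Charset.defaultCharset().name()" reports "US-ASCII". I thought the default on Linux was "UTF-8"?
How should I get my program to work on Linux?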
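For completeness, this is roughly how the default charset can be checked on each machine. It is only a small diagnostic sketch; the class name ShowDefaultCharset and the method describeDefaults are hypothetical names chosen for this post.

```java
import java.nio.charset.Charset;

public class ShowDefaultCharset {

    // Collects the charset the JVM uses when no charset is given explicitly,
    // plus the file.encoding system property.
    static String describeDefaults() {
        return "defaultCharset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```

As far as I understand, on Linux the JVM usually derives this default from the locale of the process (for example the LANG variable), so the same program could report different values depending on how it is started. That is an assumption I have not confirmed on my server.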