How do you specify a Java file.encoding value consistent with the underlying Windows code page?

I have a Java application that receives data over a socket using an InputStreamReader. It reports "Cp1252" from its getEncoding method:

 /* java.net. */ Socket Sock = ...;
 InputStreamReader is = new InputStreamReader(Sock.getInputStream());
 System.out.println("Character encoding = " + is.getEncoding());
 // Prints "Character encoding = Cp1252"

This does not necessarily correspond to what the system reports as a code page. For example:

 C:\>chcp
 Active code page: 850

The application may receive byte 0x81, which in code page 850 represents the character ü. The program interprets that byte using code page 1252, which does not define a character at that value, so I get a question mark instead.
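The mismatch can be reproduced in isolation. The following is a minimal sketch, assuming the JDK in use provides the code page 850 charset under the name Cp850; the class and method names are illustrative only and are not part of the application in question.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: decodes one byte under two different code pages.
class CodePageMismatchDemo {

    // Decode a single byte using the named charset.
    static String decode(int b, String charsetName) {
        return new String(new byte[] { (byte) b }, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String oem = decode(0x81, "Cp850");   // code page 850 defines 0x81
        String ansi = decode(0x81, "Cp1252"); // code page 1252 leaves 0x81 undefined
        System.out.printf("Cp850:  U+%04X%n", (int) oem.charAt(0));
        System.out.printf("Cp1252: U+%04X%n", (int) ansi.charAt(0));
    }
}
```

Under Cp850 the byte decodes to ü (U+00FC). Under Cp1252 the decoder has no mapping for it and substitutes a replacement character, which typically ends up as a question mark on output.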

I was able to work around this problem for one client that used code page 850 by adding another command line parameter in the batch file that launches the application:

 java.exe -Dfile.encoding=Cp850 ...

But not all of my clients use code page 850, of course. How can I get Java to use a code page that is compatible with the underlying Windows system? My preference would be something I could just put into the batch file, leaving the Java code untouched:

 ENC=...
 java.exe -Dfile.encoding=%ENC% ...
+7
java windows batch-file codepages
4 answers

The default code page used by cmd.exe is Cp850 (or whatever "OEM" code page is native to the OS); the system encoding is Cp1252 (or whatever "ANSI" code page is native to the OS). One way to discover the console encoding would be through native code (see GetConsoleOutputCP for the current console encoding, GetACP for the default "ANSI" encoding, etc.).

Changing the encoding with the -D switch affects all of the JVM's default encoding mechanisms, including redirected stdout/stdin/stderr, so it is not an ideal solution.
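To see how far the default reaches, it can help to print what the JVM has picked up. This is only a sketch with a hypothetical class name; it shows that a reader created without an explicit charset follows the JVM-wide default, which is the value the -D switch was used to change in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical helper: reports where the JVM-wide default charset shows up.
class DefaultEncodingReport {

    // Encoding picked by a reader that was given no explicit charset.
    static String implicitReaderEncoding() {
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]));
        return reader.getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding property = " + System.getProperty("file.encoding"));
        System.out.println("Charset.defaultCharset = " + Charset.defaultCharset());
        System.out.println("implicit reader        = " + implicitReaderEncoding());
    }
}
```

Running it once with and once without the -Dfile.encoding argument from the question shows which of these values the switch changes on a given JVM.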

I came up with this WSH script that can set the console to the system ANSI code page, but I have not figured out how to switch to a TrueType font programmatically.

 'file: setacp.vbs
 'usage: cscript /Nologo setacp.vbs
 Set objShell = CreateObject("WScript.Shell")
 'replace ACP (ANSI) with OEMCP for default console CP
 cp = objShell.RegRead("HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001" &_
     "\Control\Nls\CodePage\ACP")
 WScript.Echo "Switching console code page to " & cp
 objShell.Exec "chcp.com " & cp

(This is my first WSH script, so it may well have flaws. For example, I am not familiar with the permissions needed to read the registry.)

Using a TrueType font is another requirement for using ANSI/Unicode with cmd.exe. I will look into a programmatic switch to a better font if time permits.

+6

As for the code snippet, the correct answer is to use the InputStreamReader constructor that takes an explicit encoding, so that the reader performs the correct conversion. That way it does not matter which encoding is the system default: you know you are decoding with the encoding that matches what you receive on the socket.
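A minimal sketch of that approach follows. The class name, the readAll helper, and the choice of Cp850 are assumptions for illustration; in the real application the stream would come from Sock.getInputStream() and the charset would be whatever the sender is known to use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper: decodes a stream with an explicitly named charset.
class ExplicitCharsetReader {

    // Read the whole stream, decoding with charsetName instead of the JVM default.
    static String readAll(InputStream in, String charsetName) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, Charset.forName(charsetName))) {
            int ch;
            while ((ch = reader.read()) != -1) {
                sb.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the socket stream: the single byte 0x81 from the question.
        InputStream in = new ByteArrayInputStream(new byte[] { (byte) 0x81 });
        String text = readAll(in, "Cp850");
        System.out.printf("U+%04X%n", (int) text.charAt(0));
    }
}
```

Because the charset is passed to the constructor, the result is the same regardless of the file.encoding setting on the machine that runs it.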

You can likewise specify the encoding explicitly when writing files, rather than relying on the system encoding. Users may still run into problems when they open those files on their own system, but modern Windows systems support UTF-8, so you can write the file in UTF-8 if you need to. (Internally, Java represents all strings as 16-bit Unicode.)
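For file output, the same idea might look like the sketch below. The class name, method names, and temporary file are hypothetical; the point is only that the encoding is named explicitly on both the write and the read side.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes and reads text with an explicit UTF-8 encoding.
class Utf8FileOutput {

    // Encode explicitly so the JVM default charset is never consulted.
    static void write(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    // Write text to a temporary file and read it back, both as UTF-8.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".txt");
            try {
                write(tmp, text);
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00FC"; // the character from the question
        System.out.println("round trip intact: " + original.equals(roundTrip(original)));
    }
}
```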

I would consider this the "right" solution overall, since it is compatible with the widest range of underlying systems.

+5

Windows has the added complication of having two active code pages. In your example, both 1252 and 850 are correct; which one applies depends on how the program is run. GUI applications use the ANSI code page, which is typically 1252 for Western European languages. The command prompt, however, uses the OEM code page, which is 850 for those same locales.

+4

If the code page reported by the chcp command is the value you need, you can capture it with the following command:

 C:\>for /F "Tokens=4" %I in ('chcp') Do Set CodePage=%I 

This sets the CodePage variable to the code page value returned by chcp:

 C:\>echo %CodePage%
 437

You can use this value in your bat file by prefixing it with Cp:

 C:\>echo Cp%CodePage%
 Cp437

If you put this into a bat file, each %I in the first command must be written as %%I.
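The same mapping from chcp output to a Java encoding name can be sketched in Java, which may be useful for checking that the resulting name is one the JVM recognizes. The class and method names are hypothetical. The sketch assumes the output ends with the numeric code page, optionally followed by punctuation, rather than relying on the number being the fourth token; the wording of the chcp message can differ between Windows language versions.

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns chcp output into a Java encoding name.
class ChcpParser {

    // The last run of digits in the text, allowing trailing non-digits.
    private static final Pattern LAST_NUMBER = Pattern.compile("(\\d+)\\D*$");

    // Example: "Active code page: 850" yields "Cp850".
    static String javaEncodingName(String chcpOutput) {
        Matcher m = LAST_NUMBER.matcher(chcpOutput.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("No code page number in: " + chcpOutput);
        }
        return "Cp" + m.group(1);
    }

    public static void main(String[] args) {
        String name = javaEncodingName("Active code page: 850");
        // isSupported reports whether this JVM has a charset under that name.
        System.out.println(name + " supported: " + Charset.isSupported(name));
    }
}
```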

+4