Decoding an encoded pound symbol in Java

We use an external service to receive data in CSV format. We are trying to write the data to the response so that the CSV can be downloaded by the client. Unfortunately, we get the data in the following format.

Amount inc. VAT Balance Â£112.83 Â£0.0 Â£97.55 Â£0.0 Â£15.28 Â£0.0

We cannot decode the contents. Is there a way to decode Â£ and display £ in Java?

Does StringUtils have string decoding?

+7
3 answers

Problem: when we call getBytes() on the String with no argument, it encodes using the platform default charset. Once the String has been encoded that way, decoding it with the default charset may not give back the original characters.

Solution one: the StringUtils class from Apache Commons Codec helps us decode these characters when writing the response. This class is available in the org.apache.commons.codec.binary package.
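To illustrate how this kind of garbling can arise, here is a minimal sketch. It assumes the service sends UTF-8 and that some step decodes those bytes as ISO-8859-1; the class and method names are made up for the example, and Unicode escapes are used so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeOrigin {
    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String pound = "\u00A3112.83"; // "£112.83"
        // The pound sign is two bytes in UTF-8 (0xC2 0xA3); read as ISO-8859-1
        // they become two characters, U+00C2 followed by U+00A3.
        String garbled = misdecode(pound);
        System.out.println(garbled.equals("\u00C2\u00A3112.83"));
    }
}
```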

    import org.apache.commons.codec.binary.StringUtils;

    String CSVContent = "/* CSV data */";
    // Decode the bytes using UTF-8.
    String decodedStr = StringUtils.newStringUtf8(CSVContent.getBytes("UTF-8"));
    // Convert the decoded string to a byte array to write to the stream.
    byte[] content = StringUtils.getBytesIso8859_1(decodedStr);

Maven 2.0 dependency:

    <dependency>
        <groupId>commons-codec</groupId>
        <artifactId>commons-codec</artifactId>
        <version>1.6</version>
    </dependency>

Solution two:

According to @Joni, the best solution with the standard API:

 content = CSVContent.getBytes("ISO-8859-1"); 
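As a rough check of why this works, the sketch below assumes the String was produced by decoding UTF-8 bytes as ISO-8859-1. Under that assumption, encoding it back as ISO-8859-1 yields the original UTF-8 bytes, which can be written to the response stream as they are. The class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ResponseBytes {
    // Same call as above, with a Charset constant instead of a charset name.
    static byte[] toResponseBytes(String csvContent) {
        return csvContent.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = "\u00C2\u00A397.55";  // the mojibake form of "£97.55"
        String intended = "\u00A397.55";       // "£97.55"
        byte[] content = toResponseBytes(garbled);
        // The bytes match the UTF-8 encoding of the intended text.
        System.out.println(Arrays.equals(content, intended.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The client then has to read the download as UTF-8 for the pound sign to show up correctly.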
+2

The file appears to be encoded in UTF-8. You should read it as UTF-8.

If you use java.io.FileReader and company, you should open a FileInputStream and wrap it in an InputStreamReader instead:

    // Before: Reader in = new FileReader(file)
    Reader in = new InputStreamReader(new FileInputStream(file), "UTF-8");

If you use any other method to read the file (perhaps an external or internal class library?), check its documentation to see whether it lets you specify the text encoding used to read the file.
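For a fuller picture, here is a small self-contained sketch of the same idea. The class name, the helper methods and the temporary file are assumptions made for the demonstration; only the InputStreamReader with an explicit UTF-8 charset comes from the answer.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class Utf8ReadDemo {
    // Read the first line of a file, decoding its bytes explicitly as UTF-8.
    static String readFirstLine(File file) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // Demo helper: write the text to a temporary file as UTF-8, then read it back.
    static String writeThenRead(String text) {
        try {
            File tmp = File.createTempFile("utf8-demo", ".csv");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), text.getBytes(StandardCharsets.UTF_8));
            return readFirstLine(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The pound sign survives the round trip regardless of the platform default.
        System.out.println(writeThenRead("\u00A3112.83").equals("\u00A3112.83"));
    }
}
```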

Update: If you already have a mojibake String, for example Â£97.55, and you cannot fix the reading method, one way to repair it is to convert the string back to bytes and reinterpret the bytes as UTF-8. This does not require any external "StringUtils" or codec library; the standard Java API is powerful enough:

    String input = ...obtain from somewhere...;
    String output = new String(input.getBytes(/*use platform default*/), "UTF-8");
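The snippet above depends on the platform default charset being the one that produced the garbled text. A variant that does not depend on the machine it runs on is sketched below; it assumes the wrong decoding was ISO-8859-1, which may not hold in your setup (Windows-1252 is another common culprit), and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Undo a UTF-8-read-as-ISO-8859-1 mix-up: recover the raw bytes,
    // then decode them with the charset they were actually written in.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "\u00C2\u00A397.55"; // the mojibake form of "£97.55"
        String output = repair(input);
        System.out.println(output.equals("\u00A397.55"));
    }
}
```

Plain ASCII text passes through unchanged, so applying this to a whole CSV line only affects the garbled characters.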
+5

Fortunately, we now have Java 7. You can do the following with Paths, Files and StandardCharsets:

    Path path = Paths.get("/tmp", "input.txt");
    List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
    for (String line : lines) {
        System.out.println(line);
    }
+1
