Convert bytes in hexadecimal to actual bytes

I have a file that is written in bytes, like this

\r\x00\x00\x00\xd0{"a": "test"} 

which has the following bytes

 [13, 0, 0, 0, -48, 123, 34, 97, 34, 58, 32, 34, 116, 101, 115, 116, 34, 125] 

When this file is read in Java, everything gets escaped:

 \\r\\x00\\x00\\x00\\xd0{"a": "test"} 

when I do .getBytes() on this line, I get

 [92, 114, 92, 120, 48, 48, 92, 120, 48, 48, 92, 120, 48, 48, 92, 120, 100, 48, 123, 34, 97, 34, 58, 32, 34, 116, 101, 115, 116, 34, 125] 

I need to convert the string to the correct bytes, and unfortunately I have no way to change how the file is read. I know that in Python you can open a file in 'rb' mode and you are good to go. If Java has this ability, I cannot use it.

So, how can I convert the string that Java reads to the original byte array that was written to the file?

Sorry if this question is stupid, but I'm so green when it comes to Java.

EDIT: I believe my question is different from the proposed duplicate. That question does not take every literal value in a Java string and convert it back to a byte. The reader gives me a string in Java in which \x00 is now \\x00, which is not the same byte value. So it seems that I need a way to unescape a string?

Hex Editor File

    0000000: 5c72 5c78 3030 5c78 3030 5c78 3030 5c78  \r\x00\x00\x00\x
    0000010: 6430 7b22 6122 3a20 2274 6573 7422 7d0a  d0{"a": "test"}.

String that Java reads, viewed in a hex editor

    0000000: 5c5c 725c 5c78 3030 5c5c 7830 305c 5c78  \\r\\x00\\x00\\x
    0000010: 3030 5c5c 7864 307b 2261 223a 2022 7465  00\\xd0{"a": "te
    0000020: 7374 227d 0a                             st"}.
3 answers

In Java, you have to interpret the input string to get the desired byte values.

I wrote a Java application that interprets the input string.

Here's the input line:

 \r\x00\x00\x00\xd0{"a": "test"} 

Here is the result:

    [13, 0, 0, 0, -48, 123, 34, 97, 34, 58, 32, 34, 116, 101, 115, 116, 34, 125]

And here is the code. You may have to modify it a bit to handle cases that you did not mention in your question.

    package com.ggl.testing;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ConvertBytes implements Runnable {

        private String fileName;

        public static void main(String[] args) {
            new ConvertBytes("bytes.txt").run();
        }

        public ConvertBytes(String fileName) {
            this.fileName = fileName;
        }

        @Override
        public void run() {
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(new InputStreamReader(
                    getClass().getResourceAsStream(fileName)))) {
                String line;
                while ((line = br.readLine()) != null) {
                    processLine(line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private void processLine(String line) {
            // Split in front of every backslash, so each part starts with an escape
            String[] parts = line.split("(?=\\\\)");
            List<Byte> byteList = new ArrayList<Byte>();
            for (int i = 0; i < parts.length; i++) {
                if (!parts[i].equals("")) {
                    byteList.addAll(getValue(parts[i]));
                }
            }
            Byte[] bytes = byteList.toArray(new Byte[byteList.size()]);
            System.out.println(Arrays.toString(bytes));
        }

        private List<Byte> getValue(String s) {
            List<Byte> byteList = new ArrayList<Byte>();
            if (s.startsWith("\\x")) {
                // Two hex digits follow "\x"; values above 127 wrap to negative bytes
                int value = Integer.valueOf(s.substring(2, 4), 16);
                if (value > 127) {
                    value = value - 256;
                }
                byteList.add(Byte.valueOf((byte) value));
                if (s.length() > 4) {
                    // Everything after the 4-character escape is plain text
                    byteList.addAll(getAsciiValue(s.substring(4)));
                }
            } else if (s.equals("\\r")) {
                byteList.add(Byte.valueOf((byte) 13));
            } else if (s.equals("\\t")) {
                byteList.add(Byte.valueOf((byte) 9));
            } else {
                byteList.addAll(getAsciiValue(s));
            }
            return byteList;
        }

        private List<Byte> getAsciiValue(String s) {
            List<Byte> byteList = new ArrayList<Byte>();
            for (int i = 0; i < s.length(); i++) {
                int value = (int) s.charAt(i);
                byteList.add(Byte.valueOf((byte) value));
            }
            return byteList;
        }
    }

The bytes.txt file must be on the classpath, in the same package directory as the ConvertBytes class, because it is loaded with getResourceAsStream().


It looks like you need to parse the String yourself.

I would keep a map of escaped characters ('\r', '\n', '\b', etc.):

    // Requires: import java.util.HashMap; import java.util.Map;
    private static Map<String, Byte> escapedCharacters;

    static {
        escapedCharacters = new HashMap<>();
        escapedCharacters.put("\\b", (byte) '\b');
        escapedCharacters.put("\\f", (byte) '\f');
        escapedCharacters.put("\\n", (byte) '\n');
        escapedCharacters.put("\\r", (byte) '\r');
        escapedCharacters.put("\\t", (byte) '\t');
        // Add more if needed
    }

Then, to process the file, follow these steps:

    // Requires: import java.nio.file.Files; import java.nio.file.Paths;
    //           import java.util.ArrayList; import java.util.Arrays; import java.util.List;
    public static void main(String[] args) throws Exception {
        String myFile = "PathToYourFile";

        // Read your file in
        List<String> myFileLines = Files.readAllLines(Paths.get(myFile));

        // List to hold all the lines as translated bytes
        List<byte[]> myFileLinesAsBytes = new ArrayList<>();
        for (String line : myFileLines) {
            myFileLinesAsBytes.add(translateEscapedBytes(line));
        }

        // Display all translated lines
        for (byte[] byteLine : myFileLinesAsBytes) {
            System.out.println(Arrays.toString(byteLine));
        }
        System.out.println();
    }

    private static byte[] translateEscapedBytes(String line) {
        List<Byte> translatedBytes = new ArrayList<>();
        for (int i = 0; i < line.length();) {
            if (line.charAt(i) == '\\') {
                // Escaped byte
                String escapedByte = line.substring(i, i + 2);
                if (escapedByte.endsWith("x")) {
                    // Hexadecimal number: the two digits after \x
                    escapedByte = line.substring(i + 2, i + 4);
                    translatedBytes.add(hexStringToByte(escapedByte));
                    i += 4;
                } else {
                    // Escaped character; fail clearly if it is not in the map
                    Byte value = escapedCharacters.get(escapedByte);
                    if (value == null) {
                        throw new IllegalArgumentException("Unknown escape: " + escapedByte);
                    }
                    translatedBytes.add(value);
                    i += 2;
                }
            } else {
                // Non-escaped character
                translatedBytes.add((byte) line.charAt(i));
                i++;
            }
        }

        // Copy List to actual byte[] to return
        byte[] result = new byte[translatedBytes.size()];
        for (int i = 0; i < translatedBytes.size(); i++) {
            result[i] = translatedBytes.get(i);
        }
        return result;
    }

    private static byte hexStringToByte(String s) {
        return (byte) ((Character.digit(s.charAt(0), 16) << 4)
                + Character.digit(s.charAt(1), 16));
    }

translateEscapedBytes() looks for the "\" character in the string and treats it, together with the next character, as an escape sequence. If the escape is \x, the next two characters are a hexadecimal number that is converted to a byte by hexStringToByte(String s); otherwise the escape is converted to a byte by looking it up in the map of escaped characters. All other characters are treated as unescaped and are simply converted to their byte value.

Results (using the data provided): [screenshot of the program output, not preserved]
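The same steps can also be sketched more compactly by writing straight into a ByteArrayOutputStream instead of collecting boxed Byte values in a list. This is only a minimal sketch, not the code from this answer: the class name UnescapeSketch and the methods unescape and simpleEscape are made up for illustration, and it assumes the input contains only ASCII characters plus \xHH and the single-character escapes listed.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper class, named here only for illustration.
class UnescapeSketch {

    // Decodes \xHH and a few single-character escapes into raw bytes.
    // Assumes every non-escaped character fits in one byte (ASCII input).
    static byte[] unescape(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c != '\\') {
                out.write(c);                 // write(int) keeps only the low 8 bits
                i++;
            } else if (line.startsWith("\\x", i)) {
                if (i + 4 > line.length()) {
                    throw new IllegalArgumentException("Truncated \\x escape at index " + i);
                }
                // Two hex digits follow "\x"; 0xd0 (208) is stored as the byte -48
                out.write(Integer.parseInt(line.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                if (i + 1 >= line.length()) {
                    throw new IllegalArgumentException("Trailing backslash at index " + i);
                }
                out.write(simpleEscape(line.charAt(i + 1)));
                i += 2;
            }
        }
        return out.toByteArray();
    }

    // Maps the character after a backslash to the byte value it stands for.
    static int simpleEscape(char kind) {
        switch (kind) {
            case 'b':  return '\b';
            case 'f':  return '\f';
            case 'n':  return '\n';
            case 'r':  return '\r';
            case 't':  return '\t';
            case '\\': return '\\';
            default:
                throw new IllegalArgumentException("Unknown escape: \\" + kind);
        }
    }

    public static void main(String[] args) {
        // The Java literal below is the text \r\x00\x00\x00\xd0{"a": "test"}
        String line = "\\r\\x00\\x00\\x00\\xd0{\"a\": \"test\"}";
        System.out.println(Arrays.toString(unescape(line)));
        // prints [13, 0, 0, 0, -48, 123, 34, 97, 34, 58, 32, 34, 116, 101, 115, 116, 34, 125]
    }
}
```

Unknown or truncated escapes throw an exception in this sketch rather than being passed through, which seems the safer default when the goal is to reproduce the original bytes exactly.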


You do not get "everything escaped" when a file is read in Java. Why do you think so? Converting the String to bytes shows that it contains exactly what the hex editor shows for the file. In other words,

92, 114, 92, 120, 48, 48, 92, 120, 48, 48, 92, 120, 48, 48, 92, 120 (decimal)

matches

5c72 5c78 3030 5c78 3030 5c78 3030 5c78 (hex)

If you want to decode the escape sequences in the file, you will need to write code to process them; this is not a character encoding problem.
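To make the distinction concrete, here is a small sketch comparing the text that is actually in the file with the decoded characters it is meant to represent. The class name LiteralVsDecoded and the helper asciiBytes are hypothetical, introduced only for this illustration. Note that the doubled backslashes appear only in the Java source literal (and in debugger or escaped displays); the String itself holds a single backslash followed by a letter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, named here only for illustration.
class LiteralVsDecoded {

    // Encodes a string as ASCII so the byte values can be inspected.
    static byte[] asciiBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // What the file contains: the six characters \ r \ x 0 0.
        // In Java source each backslash has to be written as "\\".
        String fromFile = "\\r\\x00";
        System.out.println(fromFile.length());                     // prints 6
        System.out.println(Arrays.toString(asciiBytes(fromFile)));
        // prints [92, 114, 92, 120, 48, 48]

        // What those escapes stand for: a carriage return and a NUL character.
        String decoded = "\r\0";
        System.out.println(decoded.length());                      // prints 2
        System.out.println(Arrays.toString(asciiBytes(decoded)));
        // prints [13, 0]
    }
}
```

The first string is what any reader will hand you for this file; getting from it to the second one is the parsing step that has to be written by hand.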
