Why does this character 口 cause my scanner to crash?

I am using Java's Scanner.

I have a .txt file with the text stored in it.

 PriceDB = { ["profileKeys"] = { ["Name - 回音山"] = "Name - 回音山", }, ["char"] = { ["Name - 回音山"] = { ["CurrentValue"] = "一口价:|cffffffff70,197|TInterface\\MoneyFrame\\UI-GoldIcon:0:0:2:0|t|r", }, }, } 

All I am trying to do is open this file with a Scanner, extract the 70,197 from "CurrentValue", and save it as an int. However, every time I open the file, the scanner fails to read the line and throws a NoSuchElementException with the message "No line found". After fumbling with the file and deleting the Chinese characters one by one, I narrowed it down to this little guy: 口. For some reason, the scanner does not like this symbol. Is there some kind of encoding parameter I need to change, or do I have to use a BufferedReader instead? I honestly am not sure what is happening, beyond suspecting an encoding error. So what is going on here?

Edit: Here is the initialization of my scanner.

 Scanner scanner;
 if (region.equals("US")) {
     scanner = new Scanner(new File("C:\\Program Files\\World of Warcraft\\WTF\\Account\\313023286#1\\SavedVariables\\WoWTokenPrice.lua"));
 } else if (region.equals("EU")) {
     scanner = new Scanner(new File("C:\\Program Files\\World of Warcraft\\WTF\\Account\\313495228#1\\SavedVariables\\WoWTokenPrice.lua"));
 } else if (region.equals("China")) {
     File file = new File("C:\\Program Files\\World of Warcraft\\WTF\\Account\\232241227#1\\SavedVariables\\WoWTokenPrice.lua");
     System.out.println(file.exists());
     scanner = new Scanner(file);
 } else {
     System.exit(1);
     break;
 }

I copied it as is. In my case, region is "China".

1 answer

You must specify the correct encoding when creating the Scanner. The relevant constructor is:

 public Scanner(InputStream source, String charsetName) 

Creates a new scanner that produces values scanned from the specified input stream. Bytes from the stream are converted to characters using the specified encoding.
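To see what "converted to characters using the specified encoding" means in practice, here is a small sketch that strictly decodes the UTF-8 bytes of 口 under different charsets. The class and method names are made up for illustration, and it assumes the file is saved as UTF-8, which the question does not confirm. In UTF-8, 口 is the bytes E5 8F A3, and byte 8F is unassigned in windows-1252 (a common default on Western Windows installs), which might be why this one character trips the scanner while its neighbours pass. That last part is a guess about the asker's machine.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
final class DecodeCheck {
    /** Returns true if the bytes are valid text in the given charset (strict decoding). */
    static boolean canDecode(byte[] bytes, Charset charset) {
        try {
            // A fresh CharsetDecoder reports malformed or unmappable input
            // by throwing, instead of silently substituting a replacement character.
            charset.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+53E3 is the character from the question; as UTF-8 it is E5 8F A3.
        byte[] utf8 = "\u53e3".getBytes(StandardCharsets.UTF_8);
        System.out.println(canDecode(utf8, StandardCharsets.UTF_8));    // true
        System.out.println(canDecode(utf8, StandardCharsets.US_ASCII)); // false
        System.out.println(canDecode(utf8, Charset.defaultCharset()));  // depends on the machine
    }
}
```

If a Scanner stops early because of a decoding problem, it hides the underlying IOException; calling scanner.ioException() afterwards returns it, which is a quick way to confirm an encoding mismatch.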

Find out which encoding your file uses. I think it is UTF-16, but I am not an expert on non-Latin characters :).

 // `is` is an InputStream opened on the file; needs java.nio.charset.StandardCharsets
 Scanner scanner = new Scanner(is, StandardCharsets.UTF_16.name());
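Putting it together, here is a minimal sketch of reading the file with an explicit charset and pulling out the number as an int. The class name, method names, and the regular expression are my own assumptions based only on the single sample line in the question. The charset is passed in as a parameter because the real encoding is not confirmed: the demo uses UTF-8, so swap in "UTF-16" if that turns out to be right.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
final class TokenPriceReader {
    // "|c" + 8 hex digits (colour code), then the number, then "|T" (start of the icon markup).
    // This shape is inferred from the one sample line and may not cover other saved values.
    private static final Pattern PRICE =
            Pattern.compile("\\|c[0-9a-fA-F]{8}([0-9,]+)\\|T");

    /** Returns the price found in one line, or -1 if the line contains none. */
    static int extractPrice(String line) {
        Matcher m = PRICE.matcher(line);
        if (!m.find()) {
            return -1;
        }
        // Drop the thousands separator before parsing: "70,197" -> 70197.
        return Integer.parseInt(m.group(1).replace(",", ""));
    }

    /** Scans the file using the named charset and returns the first price, or -1. */
    static int readPrice(Path file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                int price = extractPrice(scanner.nextLine());
                if (price >= 0) {
                    return price;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // The "CurrentValue" entry from the question, with the Chinese text as Unicode escapes.
        String sample = "[\"CurrentValue\"] = \"\u4e00\u53e3\u4ef7:|cffffffff70,197"
                + "|TInterface\\\\MoneyFrame\\\\UI-GoldIcon:0:0:2:0|t|r\",";
        Path tmp = Files.createTempFile("WoWTokenPrice", ".lua");
        Files.write(tmp, sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readPrice(tmp, StandardCharsets.UTF_8.name())); // prints 70197
        Files.delete(tmp);
    }
}
```

For the real file you would call something like readPrice(Paths.get(path), "UTF-8") with the path from the question, again assuming that charset is the right one.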