How to preserve newlines when reading a file with a Stream (Java 8)

    try (Stream<String> lines = Files.lines(targetFile)) {
        List<String> replacedContent = lines
                .map(line -> StringUtils.replaceEach(line, keys, values))
                .parallel()
                .collect(Collectors.toList());
        Files.write(targetFile, replacedContent);
    }

I am trying to replace multiple text patterns in each line of a file. But I observe that "\r\n" (bytes 13 and 10) is simply replaced with "\n" (byte 10 only), and my comparison tests fail.

I want to preserve the line terminators exactly as they are in the input file, and I do not want Java to touch them. Can anyone suggest a way to do this without adding a separate replacement step for "\r\n"?

2 answers

The problem is that Files.lines() is implemented on top of BufferedReader.readLine(), which reads each line up to its line terminator and then discards the terminator. Then, when you write the lines with something like Files.write(), it appends the system line terminator after each line, which may differ from the terminator that was read.

If you really want to keep line terminators exactly as they are, even if the file mixes different line terminators, you can use a regular expression and a Scanner to do this.
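To see the effect described above, here is a minimal sketch that round-trips a string through Files.lines() and Files.write() using a temporary file. The class name TerminatorLossDemo and the helper roundTrip are made up for illustration; only the Files calls come from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TerminatorLossDemo {

    // Hypothetical helper: writes the text to a temp file, reads it back with
    // Files.lines(), rewrites it with Files.write(), and returns the final content.
    static String roundTrip(String content) throws IOException {
        Path tmp = Files.createTempFile("terminator-demo", ".txt");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            List<String> lines;
            try (Stream<String> stream = Files.lines(tmp)) {
                // Each element has already lost its original terminator here.
                lines = stream.collect(Collectors.toList());
            }
            // Files.write appends the platform line separator after every line.
            Files.write(tmp, lines);
            return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "a\r\nb\nc";
        String result = roundTrip(original);
        // Whether this is equal depends on the platform separator and on the
        // mix of terminators in the input; for mixed input it never is.
        System.out.println("unchanged: " + result.equals(original));
    }
}
```

Note that the rewritten file also gains a terminator after the last line, even when the input had none.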

First, define a pattern that matches either a line together with its line terminator, or a final unterminated line that runs to EOF:

 Pattern pat = Pattern.compile(".*\\R|.+\\z"); 

\\R is the linebreak matcher: it matches the usual line terminators plus several Unicode line terminators that I have never heard of. :-) You can use something like (\\r\\n|\\r|\\n) instead if you want to match only the regular CRLF, CR or LF.

You must include .+\\z to match a potential last "line" in the file that has no line terminator. Make sure that the regular expression always matches at least one character, so that no match is found once the scanner reaches the end of the file.

Then read the lines using the Scanner until findWithinHorizon() returns null:

    try (Scanner in = new Scanner(Paths.get(INFILE), "UTF-8")) {
        String line;
        while ((line = in.findWithinHorizon(pat, 0)) != null) {
            // Process the line, then write the output using something like
            // FileWriter.write(String) that doesn't add another line terminator.
        }
    }
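A fuller sketch of the same idea, under some assumptions of mine: the pattern is given capture groups so the line body and its terminator can be handled separately via Scanner.match(), and the input and output are in-memory strings to keep it self-contained. The names PreserveTerminators, copy and transformText are hypothetical; for real files you would pass a reader and a writer instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;
import java.util.function.UnaryOperator;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

public class PreserveTerminators {

    // Same shape as the pattern above, with groups added:
    // group 1 = body of a terminated line, group 2 = its terminator,
    // group 3 = a final line that has no terminator.
    private static final Pattern LINE = Pattern.compile("(.*)(\\R)|(.+)\\z");

    // Applies transform to each line body and re-emits the original terminator.
    static void copy(Readable source, Appendable sink, UnaryOperator<String> transform)
            throws IOException {
        try (Scanner in = new Scanner(source)) {
            while (in.findWithinHorizon(LINE, 0) != null) {
                MatchResult m = in.match();
                boolean terminated = m.group(2) != null;
                String body = terminated ? m.group(1) : m.group(3);
                String terminator = terminated ? m.group(2) : "";
                sink.append(transform.apply(body)).append(terminator);
            }
        }
    }

    // Convenience wrapper for in-memory text.
    static String transformText(String input, UnaryOperator<String> transform) {
        StringBuilder out = new StringBuilder();
        try {
            copy(new StringReader(input), out, transform);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String result = transformText("foo\r\nbar\nbaz", s -> s.toUpperCase());
        System.out.println(result.equals("FOO\r\nBAR\nBAZ"));
    }
}
```

Keeping the terminator in its own group means the replacement logic never sees it, so it cannot be altered by accident.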

Lines in your stream do not include a newline.

It would be nice if the documentation for Files.lines() mentioned this. However, if you follow the implementation, it ultimately leads to BufferedReader.readLine(). That method is documented to return the contents of the line, not including any line-termination characters.

You can add a newline character to lines when writing them.

The Files.write() method that you call uses the system line separator, as documented for its related overload. You can also get this system-dependent line separator from System.lineSeparator().

If you need another line separator and know what it is, you can specify it. For instance:

    try (PrintStream out = new PrintStream(Files.newOutputStream(targetFile))) {
        lines.forEach(line -> out.print(line + "\r\n"));
    }

If you want the original file's line separators, you cannot rely only on a method that removes them. Options include:

  • Read the first line separator and guess that it is consistent across the entire file. This allows you to continue to use Files.lines() to read lines.
  • Use an API that allows you to get the lines together with their separators.
  • Read character by character rather than line by line, so that you can see the line breaks yourself.
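The first option could be sketched roughly as follows. The class SeparatorSniffer and its methods are hypothetical names, and the sketch assumes UTF-8 text with a single terminator style throughout the file; a file with mixed terminators would be normalized to whichever one appears first.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;

public class SeparatorSniffer {

    // Reads characters until the first line terminator and returns it.
    // Falls back to the platform separator if the file has no terminator.
    static String detectSeparator(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            int c;
            while ((c = reader.read()) != -1) {
                if (c == '\n') {
                    return "\n";
                }
                if (c == '\r') {
                    return reader.read() == '\n' ? "\r\n" : "\r";
                }
            }
        }
        return System.lineSeparator();
    }

    // Transforms each line of source and writes it to target with the
    // separator detected in source. Note: a separator is written after
    // every line, including the last one.
    static void rewrite(Path source, Path target, UnaryOperator<String> transform)
            throws IOException {
        String separator = detectSeparator(source);
        try (Stream<String> lines = Files.lines(source);
             BufferedWriter out = Files.newBufferedWriter(target)) {
            Iterator<String> it = lines.map(transform).iterator();
            while (it.hasNext()) {
                out.write(it.next());
                out.write(separator);
            }
        }
    }
}
```

This keeps the streaming style of the original code, at the cost of trusting the first terminator to be representative.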

WARNING: Your code reads from and writes to the same file. You may lose your original data if the program terminates abnormally or hits an error partway through.
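One common way to reduce that risk is to write to a temporary file in the same directory and only then move it over the original. The sketch below assumes that approach; SafeRewrite and replaceSafely are made-up names, and it uses Files.write() for brevity, so it illustrates only the safe replacement and does not preserve the original terminators.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SafeRewrite {

    // Writes the transformed lines to a temp file next to the target and
    // replaces the target only after the write has completed.
    static void replaceSafely(Path target, UnaryOperator<String> transform)
            throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "rewrite-", ".tmp");
        try {
            List<String> replaced;
            try (Stream<String> lines = Files.lines(target)) {
                replaced = lines.map(transform).collect(Collectors.toList());
            }
            Files.write(tmp, replaced);
            // The original stays intact until this move succeeds.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Where the file system supports it, adding StandardCopyOption.ATOMIC_MOVE makes the final swap atomic; it throws AtomicMoveNotSupportedException otherwise, so it is left out of the sketch.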

