Regex in Java: matching BOL and EOL

I am trying to parse a Windows INI file using Java on Windows. Suppose the contents are:

    [section1]
    key1=value1
    key2=value2
    [section2]
    key1=value1
    key2=value2
    [section3]
    key1=value1
    key2=value2

I am using the following code:

    Pattern pattSections = Pattern.compile(
            "^\\[([a-zA-Z_0-9\\s]+)\\]$([^\\[]*)",
            Pattern.DOTALL + Pattern.MULTILINE);
    Pattern pattPairs = Pattern.compile(
            "^([a-zA-Z_0-9]+)\\s*=\\s*([^$]*)$",
            Pattern.DOTALL + Pattern.MULTILINE);

    // parse sections
    Matcher matchSections = pattSections.matcher(content);
    while (matchSections.find()) {
        String keySection = matchSections.group(1);
        String valSection = matchSections.group(2);

        // parse section content
        Matcher matchPairs = pattPairs.matcher(valSection);
        while (matchPairs.find()) {
            String keyPair = matchPairs.group(1);
            String valPair = matchPairs.group(2);
        }
    }

But it does not work correctly:

  • [section1] does not match, probably because it is not preceded by an EOL. When I put an empty line before [section1], it matches.

  • valSection returns '\r\nkey1=value1\r\nkey2=value2\r\n' and keyPair returns 'key1', which looks right. But valPair returns 'value1\r\nkey2=value2\r\n' instead of 'value1'.

What is wrong here?

2 answers

You do not need the DOTALL flag, since you are not using dots in your pattern.
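
A minimal sketch of that point, using a hypothetical helper `firstMatch` and a made-up two-line input (neither is from the question): DOTALL only changes what `.` matches, while a negated character class such as `[^\\[]` already crosses line breaks without it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallDemo {
    // Hypothetical helper: returns the first match of regex in input, or null.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "a\r\nb";
        // Without DOTALL, '.' stops at a line terminator.
        System.out.println(firstMatch("a.*", 0, text).length());              // 1
        // With DOTALL, '.' also matches \r and \n.
        System.out.println(firstMatch("a.*", Pattern.DOTALL, text).length()); // 4
        // A negated character class matches line terminators either way.
        System.out.println(firstMatch("a[^x]*", 0, text).length());           // 4
    }
}
```

So in the patterns from the question, removing DOTALL should not change what `[^\\[]*` or `[^$]*` capture.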

I think Java treats only \n as the line terminator, so the \r is left unmatched. Your pattern:

    ^\\[([a-zA-Z_0-9\\s]+)\\]$

will not match, but this one:

    ^\\[([a-zA-Z_0-9\\s]+)\\]\r$

will.

I recommend that you also drop MULTILINE and use the following patterns as explicit line separators:

    (^|\r\n)
    ($|\r\n)
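
One way this suggestion could look in practice is sketched below. The class and method names are hypothetical, and putting the trailing separator in a lookahead is an assumption of this sketch (so that the `\r\n` is not consumed and can also open the next line), not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExplicitSeparatorDemo {
    // No MULTILINE flag: '^' and '$' anchor only at the ends of the input,
    // so line boundaries inside the text are spelled out as "\r\n".
    // Assumption: the trailing separator is a lookahead, so it is not consumed.
    static final Pattern SECTION = Pattern.compile(
            "(?:^|\\r\\n)\\[([a-zA-Z_0-9\\s]+)\\](?=$|\\r\\n)");

    // Returns the names of all section headers found in the content.
    static List<String> sectionNames(String content) {
        List<String> names = new ArrayList<>();
        Matcher m = SECTION.matcher(content);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String content = "[section1]\r\nkey1=value1\r\n[section2]\r\nkey1=value1\r\n";
        System.out.println(sectionNames(content)); // [section1, section2]
    }
}
```

This variant only recognizes `\r\n` line endings, so it would need adjusting for files that use a bare `\n`.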

The first regex worked for me as is (could the problem be in how you read the file?), but in the second one you need to add a ? after the * so that the quantifier is reluctant.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Test {

        public static void main(String[] args) {
            String content = "[section1]\r\n"
                    + "key1=value1\r\n"
                    + "key2=value2\r\n"
                    + "[section2]\r\n"
                    + "key1=value1\r\n"
                    + "key2=value2\r\n"
                    + "[section3]\r\n"
                    + "key1=value1\r\n"
                    + "key2=value2\r\n";

            Pattern pattSections = Pattern.compile(
                    "^\\[([a-zA-Z_0-9\\s]+)\\]$([^\\[]*)",
                    Pattern.DOTALL + Pattern.MULTILINE);
            Pattern pattPairs = Pattern.compile(
                    "^([a-zA-Z_0-9]+)\\s*=\\s*([^$]*?)$",
                    Pattern.DOTALL + Pattern.MULTILINE);

            // parse sections
            Matcher matchSections = pattSections.matcher(content);
            while (matchSections.find()) {
                String keySection = matchSections.group(1);
                String valSection = matchSections.group(2);

                // parse section content
                Matcher matchPairs = pattPairs.matcher(valSection);
                while (matchPairs.find()) {
                    String keyPair = matchPairs.group(1);
                    String valPair = matchPairs.group(2);
                }
            }
        }
    }
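
To make the difference visible, here is a small sketch that runs the greedy and the reluctant pair pattern over the valSection string reported in the question. The class and the `values` helper are hypothetical. Inside a character class `$` is a literal dollar sign, so `[^$]*` happily matches across `\r\n`; only the reluctant form stops at the first position where the trailing `$` anchor can match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantDemo {
    // Hypothetical helper: collects group 2 (the value) of every match.
    static List<String> values(String regex, String input) {
        Pattern p = Pattern.compile(regex, Pattern.DOTALL | Pattern.MULTILINE);
        Matcher m = p.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        String section = "\r\nkey1=value1\r\nkey2=value2\r\n";
        String greedy = "^([a-zA-Z_0-9]+)\\s*=\\s*([^$]*)$";
        String reluctant = "^([a-zA-Z_0-9]+)\\s*=\\s*([^$]*?)$";

        // Greedy: a single match whose value runs to the end of the input.
        System.out.println(values(greedy, section).size()); // 1
        // Reluctant: each value ends at the first end of line.
        System.out.println(values(reluctant, section));     // [value1, value2]
    }
}
```

An alternative, not taken from the answers, would be to capture the value with `[^\r\n]*`, which says directly that a value ends at the line break.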

Source: https://habr.com/ru/post/1411983/

