Java string splitting method ignores empty substrings

The behavior of Java's String.split() seems very strange to me.

I want to split the string "aa,bb,cc,dd,,,ee" into an array with .split(","), which gives me a String array ["aa","bb","cc","dd","","","ee"] of length 7.

But when I split the string "aa,bb,cc,dd,,,," the same way, I get an array of length 4, that is, only ["aa","bb","cc","dd"]; all the trailing empty strings are dropped.

I need a way to split a String such as "aa,bb,cc,dd,,,," into the array ["aa","bb","cc","dd","","",""].

Is this possible with the java.lang.String API? Thanks in advance.

+9
java string arrays split regex
Feb 05 '14 at 11:05
2 answers

Use String.split(String regex, int limit) with a negative limit (e.g. -1).

 "aa,bb,cc,dd,,,,".split(",", -1) 

Calling String.split(String regex) is equivalent to calling split(regex, 0), and a limit of 0 removes all trailing empty strings from the array (in most cases, see below).
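As a minimal sketch of the difference, the following compares the default call with the negative-limit call on the string from the question. The class and method names (SplitTrailingDemo, splitDefault, splitKeepTrailing) are made up for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitTrailingDemo {
    // Default split: same as limit = 0, so trailing empty strings are dropped.
    static String[] splitDefault(String s) {
        return s.split(",");
    }

    // Negative limit: every field is kept, including trailing empty strings.
    static String[] splitKeepTrailing(String s) {
        return s.split(",", -1);
    }

    public static void main(String[] args) {
        String input = "aa,bb,cc,dd,,,,";
        System.out.println(splitDefault(input).length);       // 4
        System.out.println(splitKeepTrailing(input).length);  // 8
        System.out.println(Arrays.toString(splitKeepTrailing(input)));
    }
}
```

Note that the input has seven commas, so the negative-limit call yields eight elements: the four names followed by four empty strings.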

The actual behavior of String.split(String regex) is rather confusing:

  • Splitting an empty string always results in an array of length 1 containing the empty string.
  • Splitting ";" or ";;;" with the regex ";" results in an empty array. Splitting a non-empty string removes all trailing empty strings from the array.
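The edge cases listed above can be checked with a short sketch. The class and helper names (SplitEdgeCases, countDefault, countKeepAll) are hypothetical; the last two lines of main show, for contrast, what a negative limit returns on the same inputs.

```java
// Hypothetical demo class; the names are illustrative only.
class SplitEdgeCases {
    // Number of elements returned by the default split (limit = 0).
    static int countDefault(String s, String regex) {
        return s.split(regex).length;
    }

    // Number of elements returned when a negative limit keeps all fields.
    static int countKeepAll(String s, String regex) {
        return s.split(regex, -1).length;
    }

    public static void main(String[] args) {
        System.out.println(countDefault("", ";"));     // 1: the array holds one empty string
        System.out.println(countDefault(";", ";"));    // 0: every element was a trailing empty string
        System.out.println(countDefault(";;;", ";"));  // 0
        System.out.println(countKeepAll(";", ";"));    // 2: empty strings are kept
        System.out.println(countKeepAll(";;;", ";"));  // 4
    }
}
```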

The behavior above can be observed at least from Java 5 to Java 8.

An attempt was made in JDK-6559590 to change the behavior so that splitting an empty string returns an empty array. However, the change was soon reverted in JDK-8028321 because it caused regressions in various places. It never made it into the initial release of Java 8.

+24
Feb 05 '14 at 11:08

You can use public String[] split(String regex, int limit) :

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
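To make the three cases of the limit parameter concrete, here is a small sketch that splits one sample string with a positive, a zero, and a negative limit. The input "a,b,c,," and the names SplitLimitValues and splitWithLimit are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the names and the sample input are illustrative only.
class SplitLimitValues {
    static String[] splitWithLimit(String s, int limit) {
        return s.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";
        // limit 2:  pattern applied once, rest of the input stays in the last element (length 2)
        // limit 0:  trailing empty strings are discarded (length 3)
        // limit -1: trailing empty strings are kept (length 5)
        for (int limit : new int[] {2, 0, -1}) {
            String[] parts = splitWithLimit(input, limit);
            System.out.println("limit " + limit + ": length " + parts.length
                    + " " + Arrays.toString(parts));
        }
    }
}
```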




 String st = "aa,bb,cc,dd,,,,";
 System.out.println(Arrays.deepToString(st.split(",", -1))); // requires import java.util.Arrays

Prints:

 [aa, bb, cc, dd, , , , ]

+4
Feb 05 '14 at 11:09