Parsing a string in Java: which is faster, regular expressions or string methods?

I have a dilemma. I am parsing a string and can either do

s.matches(regex)

or I can do

s.startsWith(..) && s.endsWith(..)

As you can see, the regular expression is not complicated, and both approaches work. The string can be very long (hundreds of characters), so I want to maximize efficiency. Which approach is better suited to this problem?

3 answers

Here is a fairly crude test to give you an idea. Adapt it to your use case to get more relevant results. startsWith and endsWith are much faster. Results after 1,000,000 runs:

uncompiled pattern: 1091 ms

compiled pattern: 745 ms

startsWith / endsWith: 24 ms

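import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

import com.google.common.base.Stopwatch; // assumed: Guava's Stopwatch
import org.junit.Test;                   // assumed: JUnit 4

public class TestRegex {

    String regex = "^start.*end$";
    Pattern p = Pattern.compile(regex); // compiled once, reused in every run
    String start = "start";
    String end = "end";
    String search = start + "fewbjlhfgljghfadsjhfdsaglfdhjgahfgfjkhgfdkhjsagafdskghjafdkhjgfadskhjgfdsakhjgfdaskhjgafdskjhgafdsjhkgfads" + end;
    int runs = 1000000;

    @Test
    public final void test() {
        // Warm-up run: exercise all three code paths before timing them.
        for (int i = 0; i < runs; i++)
            search.matches(regex);
        for (int i = 0; i < runs; i++)
            p.matcher(search).matches();
        for (int i = 0; i < runs; i++) {
            search.startsWith(start);
            search.endsWith(end);
        }

        // Timed run 1: String.matches, which recompiles the regex on every call.
        Stopwatch s = Stopwatch.createStarted();
        for (int i = 0; i < runs; i++)
            search.matches(regex);
        System.out.println(s.elapsed(TimeUnit.MILLISECONDS));

        // Timed run 2: precompiled Pattern.
        s.reset();
        s.start();
        for (int i = 0; i < runs; i++)
            p.matcher(search).matches();
        System.out.println(s.elapsed(TimeUnit.MILLISECONDS));

        // Timed run 3: plain string methods.
        // The results are discarded, so treat the timings as rough only.
        s.reset();
        s.start();
        for (int i = 0; i < runs; i++) {
            search.startsWith(start);
            search.endsWith(end);
        }
        System.out.println(s.elapsed(TimeUnit.MILLISECONDS));
    }

}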

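Note, however, that the two approaches are not always equivalent. When the prefix and suffix can overlap, the pattern

^start.*art$

does not match the string

"start"

whereas

"start".startsWith("start") && "start".endsWith("art")

returns true.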
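To make the overlap case concrete, here is a minimal sketch. The class name OverlapDemo and the helpers looseCheck and strictCheck are made up for illustration; the length guard in strictCheck is one possible way to bring the string-method check in line with the regex, not something proposed in the thread.

```java
import java.util.regex.Pattern;

public class OverlapDemo {
    // The pattern from the example, compiled once.
    static final Pattern P = Pattern.compile("^start.*art$");

    // Plain prefix/suffix check: true even when prefix and suffix overlap.
    static boolean looseCheck(String s, String prefix, String suffix) {
        return s.startsWith(prefix) && s.endsWith(suffix);
    }

    // Hypothetical fix: require room for both, so they cannot share characters.
    static boolean strictCheck(String s, String prefix, String suffix) {
        return s.length() >= prefix.length() + suffix.length()
                && s.startsWith(prefix)
                && s.endsWith(suffix);
    }

    public static void main(String[] args) {
        String s = "start";
        System.out.println(P.matcher(s).matches());                    // false
        System.out.println(looseCheck(s, "start", "art"));             // true
        System.out.println(strictCheck(s, "start", "art"));            // false
        System.out.println(strictCheck("start cart", "start", "art")); // true
    }
}
```

Even with the guard, the two can still disagree on multi-line input, because . does not match line terminators unless the pattern is compiled with Pattern.DOTALL.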


Indeed, the difference exists and is noticeable for short strings. Precompiling the regex with Pattern helps somewhat, but it is still clearly the worse choice when the match is this simple.
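For reference, the three variants compared in this thread could be wrapped up roughly as follows. This is only a sketch: the class name Matchers, the method names, and the "start"/"end" literals are illustrative assumptions. String.matches compiles the regex on every call, while a Pattern stored in a static final field is compiled once and can be shared, since Pattern instances are immutable.

```java
import java.util.regex.Pattern;

public class Matchers {
    private static final String PREFIX = "start";
    private static final String SUFFIX = "end";

    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern START_END = Pattern.compile("^start.*end$");

    // Recompiles the regex on every call.
    static boolean viaStringMatches(String s) {
        return s.matches("^start.*end$");
    }

    // Reuses the precompiled pattern; only a Matcher is created per call.
    static boolean viaCompiledPattern(String s) {
        return START_END.matcher(s).matches();
    }

    // No regex at all; the length guard prevents prefix/suffix overlap.
    static boolean viaStringMethods(String s) {
        return s.length() >= PREFIX.length() + SUFFIX.length()
                && s.startsWith(PREFIX)
                && s.endsWith(SUFFIX);
    }

    public static void main(String[] args) {
        String[] samples = {"start middle end", "startend", "start", "stend"};
        for (String s : samples) {
            System.out.println(s + ": " + viaStringMatches(s) + " "
                    + viaCompiledPattern(s) + " " + viaStringMethods(s));
        }
    }
}
```

If the regex has to stay, holding the Pattern in a static final field avoids the repeated compile at no extra cost in readability.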

Thanks to everyone.
