As a baseline, I ran your Python program using Python 2.7.6. Over 10 runs, it had an average time of 12.2 ms with a standard deviation of 443 μs. I do not know how you got a time as good as 6.5 ms.
Running your Rust code with Rust 1.4.0-dev (febdc3b20) without optimization, I got an average of 958 ms and a standard deviation of 33 ms.
Running the code with optimization (cargo run --release), I got an average of 34.6 ms and a standard deviation of 495 μs. Always benchmark in release mode.
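If you want to collect this kind of average and standard deviation from inside the program, something along these lines could work. This is only a sketch: `time_runs` and `mean_and_std_dev` are hypothetical helper names, the workload in `main` is a stand-in, and it is not necessarily how the numbers in this answer were measured. It assumes a non-empty sample list and reports the population standard deviation.

```rust
use std::time::Instant;

/// Mean and population standard deviation of a non-empty slice of samples.
fn mean_and_std_dev(samples: &[f64]) -> (f64, f64) {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    (mean, variance.sqrt())
}

/// Runs `work` `runs` times and returns each run's wall-clock time in milliseconds.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<f64> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed().as_secs_f64() * 1000.0
        })
        .collect()
}

fn main() {
    // Stand-in input line and workload (assumptions, not the original program).
    let line = "r1 0 chr1 100 255 6M * 0 0 TTAGGG IIIIII";
    let samples = time_runs(10, || {
        for _ in 0..10_000 {
            // black_box keeps the optimizer from removing the unused result.
            std::hint::black_box(line.contains("TTAGGG"));
        }
    });
    let (mean, std_dev) = mean_and_std_dev(&samples);
    println!("mean {:.3} ms, std dev {:.3} ms", mean, std_dev);
}
```

Build and run it with `cargo run --release`, for the same reason as above.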
Further optimizations you can do:
Compile the regex once, outside of the loop:
    fn main() {
        // ...
        let substring = "TTAGGG";
        let re = Regex::new(substring).unwrap();

        // ...

        for _ in 0..10000 {
            fun(line, &re);
        }

        // ...
    }

    fn fun(line: &str, re: &Regex) {
        // ...
    }
This produces an average of 10.4 ms with a standard deviation of 678 μs.
Switch to substring matching:
    fn fun(line: &str, substring: &str) {
        // ...

        if l[0].contains(substring) {
            // Do nothing
        }
    }
This has an average of 8.7 ms and a standard deviation of 334 μs.
And finally, look at only the one column you need instead of collecting all of them into a vector:
    fn fun(line: &str, substring: &str) {
        let col = line.split(" ").nth(9);

        if col.map(|c| c.contains(substring)).unwrap_or(false) {
            // Do nothing
        }
    }
This has an average of 6.30 ms and a standard deviation of 114 μs.
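For reference, here is that last version as a small self-contained sketch that returns the result instead of discarding it. The name `column_contains` and the sample line are made up for illustration; the column index 9 and the single-space separator are taken from the snippet above.

```rust
/// Returns true when the tenth space-separated field (index 9) of `line`
/// contains `substring`. Lines with fewer than ten fields yield false.
fn column_contains(line: &str, substring: &str) -> bool {
    line.split(" ")
        .nth(9) // walk the iterator lazily; nothing is collected into a Vec
        .map_or(false, |col| col.contains(substring))
}

fn main() {
    // Hypothetical input line; "TTAGGG" sits in field index 9.
    let line = "r1 0 chr1 100 255 6M * 0 0 TTAGGG IIIIII";

    let mut hits = 0;
    for _ in 0..10_000 {
        if column_contains(line, "TTAGGG") {
            hits += 1;
        }
    }
    println!("{}", hits);
}
```

Because `split` is lazy, `nth(9)` stops scanning as soon as the tenth field has been found, so the rest of the line is never split and no allocation happens.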