It might be interesting to run a benchmark of the string functions:

    ## Data
    require("stringi")
    vec = paste0(sample(LETTERS, 1e6, replace = TRUE), collapse = "")
    df <- data.frame(vec, vec, vec, vec, vec, vec, vec, vec, vec, vec,
                     stringsAsFactors = FALSE)

    ### Base method
    base_fun <- function(x){
      sapply(gregexpr("CG", x), function(x) sum(x != -1))
    }

    ### Stringi Method
    stringi_fun <- function(x){
      sapply(x, function(x) stri_count_fixed(x,"CG"))
    }

    ### Stringr method
    library(stringr)
    stringr_fun <- function(x){
      sapply(x, function(x) str_count(x, "CG"))
    }

    base_fun(df)
    # [1] 1441 1441 1441 1441 1441 1441 1441 1441 1441 1441
    stringi_fun(df)
    #  vec vec.1 vec.2 vec.3 vec.4 vec.5 vec.6 vec.7 vec.8 vec.9
    # 1441  1441  1441  1441  1441  1441  1441  1441  1441  1441
    stringr_fun(df)
    #  vec vec.1 vec.2 vec.3 vec.4 vec.5 vec.6 vec.7 vec.8 vec.9
    # 1441  1441  1441  1441  1441  1441  1441  1441  1441  1441

    require(rbenchmark)
    benchmark(base_fun(df), stringi_fun(df), stringr_fun(df))
    #              test replications elapsed relative user.self sys.self user.child sys.child
    # 1    base_fun(df)          100  17.499    1.000    17.513        0          0         0
    # 2 stringi_fun(df)          100  34.897    1.994    34.926        0          0         0
    # 3 stringr_fun(df)          100  17.555    1.003    17.564        0          0         0

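Those are the results for this particular example; feel free to add to them or change them. In terms of speed, base_fun(df) ≈ stringr_fun(df) > stringi_fun(df): the base and stringr versions take about the same time, and the stringi version takes roughly twice as long.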
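
The base method depends on the fact that `gregexpr()` returns -1 for a string with no match, which is why it counts only the positions that differ from -1. Below is a minimal base-R sketch of that idea on a few short strings. The helper name `count_fixed` is hypothetical, and `fixed = TRUE` is an addition that is not in `base_fun` above; for a literal pattern such as "CG" it should give the same counts.

```r
# Hypothetical helper (not part of the benchmark above): count non-overlapping
# occurrences of a literal pattern in each element of a character vector.
count_fixed <- function(strings, pattern) {
  # fixed = TRUE treats the pattern as a plain string rather than a regex
  # (an assumption added here; base_fun above uses the default regex matching).
  matches <- gregexpr(pattern, strings, fixed = TRUE)
  # gregexpr() yields -1 when a string has no match, so count only real positions.
  vapply(matches, function(m) sum(m != -1L), integer(1))
}

counts <- count_fixed(c("ACGTCGCG", "AAAA", ""), "CG")
print(counts)
```

Checking a function against small inputs like these, where the counts can be verified by hand, is a cheap way to confirm it handles the no-match case before timing it on the million-character strings.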
EDIT: The search engine in stringi 0.2-3 has been greatly improved. New benchmarks (on another machine):

    benchmark(base_fun(df), stringi_fun(df), stringr_fun(df))
    ##              test replications elapsed relative user.self sys.self user.child sys.child
    ## 1    base_fun(df)          100  26.412   21.214    26.353    0.004          0         0
    ## 2 stringi_fun(df)          100   1.245    1.000     1.241    0.000          0         0
    ## 3 stringr_fun(df)          100  26.995   21.683    26.905    0.011          0         0

So now, measured by elapsed time, stringi < base ≈ stringr: the stringi version is about 21 times faster than the other two.
marbel