Extract parts of a string in R

I have a string of the form

stamp = "section_d1_2010-07-01_08_00.txt"

and would like to be able to extract parts of it. I was able to do this with repeated calls to str_extract to narrow down to the part I want, for example to grab the month:

month = str_extract(stamp,"2010.+")
month = str_extract(month,"-..")
month = str_extract(month,"..$")

However, this is terribly inefficient and there should be a better way. In this particular example, I can use

month = substr(stamp,17,18)

However, I am looking for something more general (in case the number of digits changes).

I think I need a regular expression to grab what comes AFTER certain flags (an _ or a -, or the 3rd one, etc.). I also tried using sub, but had the same problem: I needed several calls to home in on what I really wanted.

An example of how to get, say, the month (07 here) and the hour (08 here) would be appreciated.


You can use strsplit with the regex [-_] to split the string into all of its parts (perl=TRUE is optional for this pattern).

stamp <- "section_d1_2010-07-01_08_00.txt"
strsplit(stamp, '[-_]')[[1]]
# [1] "section" "d1"      "2010"    "07"      "01"      "08"      "00.txt" 

Regex demo: https://regex101.com/r/cK4iV0/8
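
As a small follow-up sketch, the month and hour asked about in the question can be picked out of the split result by position. The names parts, month and hour are only illustrative, and the indices assume the fields always appear in the order shown in the output above.

```r
stamp <- "section_d1_2010-07-01_08_00.txt"

# Split on every "-" or "_"; strsplit returns a list, so take its first element
parts <- strsplit(stamp, "[-_]")[[1]]

# Assumption: the layout is always section_d1_YYYY-MM-DD_HH_MM.txt,
# so the month is the 4th field and the hour is the 6th
month <- parts[4]
hour  <- parts[6]

month
# [1] "07"
hour
# [1] "08"
```

Because this relies on field position rather than character position, it keeps working if the number of digits in a field changes, but not if fields are added or removed.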


gsub('^.*_\\d+-|-\\d+_.*$', '', stamp)
#[1] "07"

library(stringr)
str_extract(stamp, '(?<=\\d_)\\d+(?=_\\d)')
#[1] "08"

str_extract_all(stamp, '(?<=\\d{4}[^0-9])\\d{2}|\\d{2}(?=[^0-9]\\d{2}\\.)')[[1]]
#[1] "07" "08"
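
Another option, sketched here in base R, is to capture every field in one pass with regexec and regmatches instead of writing one pattern per field. The pattern and the field names (year, month, day, hour, minute) are assumptions about how the stamp is laid out, not something stated in the question.

```r
stamp <- "section_d1_2010-07-01_08_00.txt"

# One capture group per field; [0-9]+ allows the number of digits to vary.
# Assumed layout: ..._YYYY-MM-DD_HH_MM.ext
pattern <- "_([0-9]+)-([0-9]+)-([0-9]+)_([0-9]+)_([0-9]+)\\."

# regmatches() returns the full match first, followed by the capture groups
m <- regmatches(stamp, regexec(pattern, stamp))[[1]]

# Drop the full match and label the groups (labels are illustrative)
fields <- setNames(m[-1], c("year", "month", "day", "hour", "minute"))

fields[["month"]]
# [1] "07"
fields[["hour"]]
# [1] "08"
```

If the stamp does not match the assumed layout, m comes back empty, so it may be worth checking length(m) before naming the fields.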