In R, I have a column in a data frame, and each element looks something like this:
Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Marinilabiaceae
What I want is the part after the last semicolon. I'm trying to use `sub` to copy an existing column into a new one that keeps only those endings. In essence, I want this:
Marinilabiaceae
The code snippet is as follows:
mydata$new_column <- sub("([\\s\\S]*;)", "", mydata$old_column)
Here I use `\\` rather than `\` because of R's escape sequences. `sub` should remove the part I don't want, and the result is stored in the new column. I have tested the regex several times on sites like: http://regex101.com/r/kS7fD8/1
However, I am still at a loss, because the results are very strange. My new column is now filled with the domain of the organism, not the family: Bacteria.
How do I solve this? Are there any good, easy-to-follow resources on R's regex syntax?
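For anyone who wants to reproduce this, here is a minimal sketch. It assumes a toy `mydata` with two made-up rows, since the real data frame is not shown. Alongside the original call, it includes two variants that may be worth comparing: a plain greedy `.*;`, and the original pattern with `perl = TRUE` so that `\s` and `\S` are read as Perl-style classes. The column names `new_column_greedy` and `new_column_perl` are only for illustration.

```r
# Toy data frame standing in for the real one (the rows are an assumption).
mydata <- data.frame(
  old_column = c(
    "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Marinilabiaceae",
    "Bacteria;Firmicutes;Bacilli"
  ),
  stringsAsFactors = FALSE
)

# Original attempt, using R's default regex engine.
mydata$new_column <- sub("([\\s\\S]*;)", "", mydata$old_column)

# Variant 1: greedy ".*" consumes everything up to the last semicolon.
mydata$new_column_greedy <- sub(".*;", "", mydata$old_column)

# Variant 2: the original pattern, interpreted as a Perl-compatible regex.
mydata$new_column_perl <- sub("[\\s\\S]*;", "", mydata$old_column, perl = TRUE)

# Compare the three results side by side.
print(mydata[, c("new_column", "new_column_greedy", "new_column_perl")])
```

On the toy rows, both variants leave only the text after the last semicolon. I have not checked them against the full data set.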