R regex gsub single letters and numbers

I have a string that contains mixed letters and numbers:

"The sample is 22mg" 

I would like to insert a space wherever a letter immediately follows a number, for example:

 "The sample is 22 mg" 

I tried this:

 gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg') 

but I do not get the desired results.

Any suggestions?

+7
2 answers

You need to use capturing parentheses in the regular expression and group references in the replacement. For example:

 gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg') 

There is nothing R-specific here; the R help pages for regex and gsub ( ?regex and ?gsub ) are worth reading.
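
To illustrate the point about group references, here is a minimal sketch. The vector `x` and its extra strings are made up for illustration and are not from the question. It first shows that gsub treats the replacement as literal text rather than as a pattern, which is why putting `[0-9]+` in the replacement cannot work, and then applies the capture-group version to several strings at once.

```r
# Hypothetical sample inputs (not from the question)
x <- c("The sample is 22mg", "This is a test 22mg", "dose: 5ml twice")

# The replacement argument is literal text, not a pattern:
literal <- gsub("[0-9]+", "[0-9]+", "take 22 now")
print(literal)   # "take [0-9]+ now"

# Capture the digit and the letter, then put a space between them
spaced <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
print(spaced)
```

Only the last digit and the first letter need to be captured, because the rest of the number and the rest of the word are left untouched by the match.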

+14

You need backreferences:

 > test <- "The sample is 22mg"
 > gsub("([0-9])([a-zA-Z])","\\1 \\2",test)
 [1] "The sample is 22 mg"

Everything in parentheses is remembered. The captured pieces are then accessed with \1 (for the first group in parentheses), \2, and so on. In R source code each backslash must be doubled: the first backslash escapes the second in the R string literal, so that a single backslash is passed on to the regular expression parser.
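
The backslash escaping can be checked directly. The sketch below uses a made-up variable name, `repl`, and assumes R 4.0.0 or later for the raw-string line; on older versions that line should be dropped.

```r
repl <- "\\1 \\2"

# The string holds 5 characters: backslash, 1, space, backslash, 2
n <- nchar(repl)
print(n)
writeLines(repl)   # shows the string as the regex engine receives it

# Assumes R >= 4.0.0: a raw string avoids doubling the backslashes
same <- identical(r"(\1 \2)", repl)
print(same)

result <- gsub("([0-9])([a-zA-Z])", repl, "The sample is 22mg")
print(result)
```

writeLines prints the string's actual contents, whereas print shows the escaped form that you would type in source code.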

+10