R regex: get text between single quotes

I have text like

la<-c("case when ANTIG_CLIENTE <= 4 then '01: ANTIG_CLIENTE <= 4' when ANTIG_CLIENTE <= 8 then '02: ANTIG_CLIENTE <= 8' else '99: Error' end ") 

I want to extract text between single quotes as a list:

 "01: ANTIG_CLIENTE <= 4","02: ANTIG_CLIENTE <= 8","99: Error" 

I tried two approaches without success:

 > sub('[^\]+\"([^\']+).*', '\\1', la)
 Error: '\]' is an unrecognized escape in character string starting "'[^\]"
 > regmatches(x, gregexpr('"[^']*"', la))[[1]]
 Error: unexpected ']' in "regmatches(x, gregexpr('"[^']"

How can I get text between single quotes?

1 answer

This should get what you want. The only assumption is that every string you want to extract from between single quotes contains a colon (otherwise, how would we distinguish '01: ANTIG_CLIENTE <= 4' from ' when ANTIG_CLIENTE <= 8 then ', both of which are enclosed in single quotes?):

 > regmatches(la,gregexpr("'[^']*:[^']*'",la))
 [[1]]
 [1] "'01: ANTIG_CLIENTE <= 4'" "'02: ANTIG_CLIENTE <= 8'" "'99: Error'"

Basically, we ask for all matches (hence gregexpr instead of regexpr) of the form: a single quote, any run of characters other than a single quote, a colon, another run of characters other than a single quote, and a closing single quote.
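To make the gregexpr versus regexpr difference concrete, here is a small sketch on the same input. The string is redefined so the snippet runs on its own, and the names pat, first and all_matches are introduced here only for illustration.

```r
la <- "case when ANTIG_CLIENTE <= 4 then '01: ANTIG_CLIENTE <= 4' when ANTIG_CLIENTE <= 8 then '02: ANTIG_CLIENTE <= 8' else '99: Error' end "

# quote, non-quotes, colon, non-quotes, quote
pat <- "'[^']*:[^']*'"

# regexpr() locates only the first match in each element of la
first <- regmatches(la, regexpr(pat, la))

# gregexpr() locates every non-overlapping match; regmatches() then
# returns a list with one character vector per element of la
all_matches <- regmatches(la, gregexpr(pat, la))[[1]]

print(first)
print(all_matches)
```

With regexpr only the '01: ...' segment comes back, while gregexpr returns all three quoted segments.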

If you want to exclude the single quotes from what is returned, you need look-ahead and look-behind, which requires telling R to treat the regular expression as Perl-compatible (perl=TRUE):

 > regmatches(la,gregexpr("(?<=')[^']*:[^']*(?=')",la,perl=TRUE))
 [[1]]
 [1] "01: ANTIG_CLIENTE <= 4" "02: ANTIG_CLIENTE <= 8" "99: Error"
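
A possible alternative that avoids look-around is to match each quoted segment together with its quotes and strip the quotes afterwards. This is only a sketch: it assumes the quotes are balanced and that no string contains an escaped or doubled quote (such as '' in SQL). The names quoted and unquoted are illustrative.

```r
la <- "case when ANTIG_CLIENTE <= 4 then '01: ANTIG_CLIENTE <= 4' when ANTIG_CLIENTE <= 8 then '02: ANTIG_CLIENTE <= 8' else '99: Error' end "

# Match each quoted segment including both quotes. Each match consumes
# its closing quote, so the search resumes after it and quotes are
# paired left to right (assumption: balanced, unescaped quotes).
quoted <- regmatches(la, gregexpr("'[^']*'", la))[[1]]

# Remove the leading and trailing quote from every match
unquoted <- gsub("^'|'$", "", quoted)

print(unquoted)
# [1] "01: ANTIG_CLIENTE <= 4" "02: ANTIG_CLIENTE <= 8" "99: Error"
```

Because the quotes are consumed by the match itself, this variant does not need perl=TRUE.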