R: gsub everything after the first blank

I can't figure out how to use gsub to remove everything after the blank (space) that follows the first time value.

 as.data.frame(valeur)
         valeur
 1    8:01 8:15
 2  17:46 18:00
 3         <NA>
 4         <NA>
 5         <NA>
 6         <NA>
 7    8:01 8:15
 8  17:46 18:00

I need

   valeur
 1   8:01
 2  17:46
 3   <NA>
 4   <NA>
 5   <NA>
 6   <NA>
 7   8:01
 8  17:46

Any clue?

I tried

  gsub("[:blank:].*$","",valeur) 

It nearly works. The data:

 valeur = c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA, " 8:01 8:15 ", " 17:46 18:00 ") 
2 answers

Judging by the output, I think the elements of valeur have leading and trailing spaces. We can remove them using gsub . We match one or more spaces at the beginning of the string ( ^\\s+ ) or ( | ) at the end of the string ( \\s+$ ), and replace them with '' .

 valeur1 <- gsub('^\\s+|\\s+$', '', valeur) 
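As a cross-check, here is a minimal sketch that compares the trimming regex with base R's trimws(), using the sample vector from the question. It assumes R 3.2.0 or later, where trimws() is available; the name trimmed is only illustrative.

```r
# Sample data from the question.
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# Strip leading and trailing whitespace with the regex from the answer.
valeur1 <- gsub('^\\s+|\\s+$', '', valeur)

# trimws() does the same job for ordinary spaces; NA values stay NA.
trimmed <- trimws(valeur)

identical(valeur1, trimmed)
```

For this data the two results should be identical, so either form can feed the later sub() calls.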

If we need the first run of non-space characters, we match one or more spaces ( \\s+ ) followed by one or more non-space characters ( \\S+ ) up to the end of the string, and replace the match with '' .

 sub('\\s+\\S+$', '', valeur1)
 #[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"
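For comparison with the attempt in the question, here is a minimal sketch using the POSIX class. In R the class name has to sit inside a bracket expression, so it is written [[:blank:]] ; a bare [:blank:] is read as a set of the literal characters : b l a n k. The sketch assumes the strings are trimmed first, and the name first_time is only illustrative.

```r
# Sample data from the question.
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# Remove leading and trailing whitespace first.
valeur1 <- gsub('^\\s+|\\s+$', '', valeur)

# Drop everything from the first blank (space or tab) to the end.
first_time <- sub("[[:blank:]].*$", "", valeur1)
first_time
```

This should keep only the first time of each element and leave the NA values untouched.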

To get the last run of non-space characters instead, use sub to match one or more non-space characters ( \\S+ ) starting at the beginning of the string ( ^ ), followed by one or more spaces ( \\s+ ), and replace the match with '' .

 sub('^\\S+\\s+', '', valeur1)
 #[1] "8:15"  "18:00" NA      NA      NA      NA      "8:15"  "18:00"

The above can be done in one step: match zero or more spaces at the beginning ( ^\\s* ) or ( | ) one or more spaces ( \\s+ ) followed by one or more non-spaces ( \\S+ ) and zero or more spaces at the end ( \\s*$ ), and replace the match with '' .

  gsub("^\\s*|\\s+\\S+\\s*$","",valeur)
  #[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"

Another option is stri_extract_first or stri_extract_last from library(stringi) , which extract the first or last run of one or more non-space characters ( \\S+ ).

  library(stringi)
  stri_extract_first(valeur, regex='\\S+')
  #[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"

For the last run of non-space characters:

  stri_extract_last(valeur, regex='\\S+')
  #[1] "8:15"  "18:00" NA      NA      NA      NA      "8:15"  "18:00"
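If stringi is not available, a base R sketch along the same lines is to split each trimmed string on whitespace and pick the first or last piece. This is an alternative technique rather than what stringi does internally; the names parts, first_time and last_time are only illustrative.

```r
# Sample data from the question.
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# Trim, then split each element on runs of whitespace.
# strsplit() returns NA for NA input, so missing values are preserved.
parts <- strsplit(trimws(valeur), "\\s+")

# First and last piece of every element.
first_time <- vapply(parts, function(p) p[1], character(1))
last_time  <- vapply(parts, function(p) p[length(p)], character(1))

first_time
last_time
```

For this data first_time should match the stri_extract_first result and last_time the stri_extract_last result shown above.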

For the sake of contribution, one more thought. Note that it relies on fixed character positions, so the shorter times keep a trailing space:

 substr(x = valeur, start = 2, stop = 6)
 #[1] "8:01 " "17:46" NA      NA      NA      NA      "8:01 " "17:46"