Retrieving data items found in a single column

This is what my data looks like.

id interest_string
1       YI{Z0{ZI{
2             ZO{
3            <NA>
4             ZT{

As you can see, a single column can hold several codes, each followed by a { separator. The interest_string value can also be missing.

How can I manipulate this data frame to extract the values into this format:

id  interest
1    YI
1    Z0
1    ZI
2    ZO
3    <NA>
4    ZT

I need to complete this task with R.

Thanks in advance.

2 answers

Here is one solution.

out <- with(dat, strsplit(as.character(interest_string), "\\{"))
## or
# out <- with(dat, strsplit(as.character(interest_string), "{", fixed = TRUE))

out <- cbind.data.frame(id = rep(dat$id, times = sapply(out, length)),
                        interest = unlist(out, use.names = FALSE))

Giving:

R> out
  id interest
1  1       YI
2  1       Z0
3  1       ZI
4  2       ZO
5  3     <NA>
6  4       ZT

Explanation

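The first line takes interest_string from dat, converts it to character, and splits it on \\{. The { has a special meaning in regular expressions, so it must be escaped with \, and that backslash must itself be escaped inside an R string. (Alternatively, pass fixed = TRUE to strsplit.) The result is a list with one component per row: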

R> out
[[1]]
[1] "YI" "Z0" "ZI"

[[2]]
[1] "ZO"

[[3]]
[1] "<NA>"

[[4]]
[1] "ZT"

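The second line builds the data frame. The id variable has to be expanded to match out, so that each id appears once for every code in its row.

To do that, sapply(out, length) counts the elements in each component of the list returned by strsplit (out), and rep() repeats each id that many times. unlist() then flattens out into a single vector, so the repeated id values line up one-to-one with the codes produced by strsplit.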
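For anyone who wants to run the approach end to end, here is a minimal self-contained sketch in base R. It rebuilds the question's data by hand, so the dat construction is an assumption rather than the asker's actual object, and it assumes the missing value is a real NA instead of the literal string "<NA>" shown in the printed list above. The names parts and n are made up for illustration, and lengths() needs R 3.2.0 or later (sapply(parts, length) works on older versions).

```r
# Assumed reconstruction of the question's data (not the asker's real object).
# stringsAsFactors = FALSE keeps interest_string as character on older R versions.
dat <- data.frame(
  id = 1:4,
  interest_string = c("YI{Z0{ZI{", "ZO{", NA, "ZT{"),
  stringsAsFactors = FALSE
)

# Split on the literal "{". A delimiter at the end of the string does not
# produce a trailing empty element, and NA stays a single NA.
parts <- strsplit(dat$interest_string, "{", fixed = TRUE)

# The escaped regular-expression form gives the same list.
stopifnot(identical(parts, strsplit(dat$interest_string, "\\{")))

# Number of codes per row; each id is repeated this many times.
n <- lengths(parts)

out <- data.frame(
  id = rep(dat$id, times = n),
  interest = unlist(parts, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(out)
```

Keeping the per-row counts in n makes it easy to check that rep() and unlist() produce vectors of the same length before they are combined.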


A data.table solution:

library(data.table)
DT <- data.table( read.table( textConnection("id interest_string
1       YI{Z0{ZI{
2             ZO{
3            <NA>
4             ZT{"), header=TRUE))

DT$interest_string <- as.character(DT$interest_string)

DT[, {
  list(interest=unlist(strsplit( interest_string, "{", fixed=TRUE )))
}, by=id]

   id interest
1:  1       YI
2:  1       Z0
3:  1       ZI
4:  2       ZO
5:  3     <NA>
6:  4       ZT