Adding an extra row for each ID while keeping the values in the other columns

I want to add an extra row for each ID in the data frame below. The new row should have TIME=0 and DV=0. The values in the other columns should remain unchanged. The data frame is as follows:

    ID TIME DV DOSE  pH
     1    1  5   50 4.6
     1    5 10   50 4.6
     2    1  6  100 6.0
     2    7 10  100 6.0

After adding the extra rows, it should look like this:

    ID TIME DV DOSE  pH
     1    0  0   50 4.6
     1    1  5   50 4.6
     1    5 10   50 4.6
     2    0  0  100 6.0
     2    1  6  100 6.0
     2    7 10  100 6.0

How could I achieve this in R?

4 answers

Try the following:

    # dummy data
    df <- read.table(text = "ID TIME DV DOSE pH
    1 1 5 50 4.6
    1 5 10 50 4.6
    2 1 6 100 6.0
    2 7 10 100 6.0", header = TRUE)

    # copy of the data with TIME and DV set to zero, one row per ID
    df1 <- df
    df1[, c(2, 3)] <- 0
    df1 <- unique(df1)

    # row-bind and sort
    res <- rbind(df, df1)
    res <- res[order(res$ID, res$TIME), ]
    res
    #    ID TIME DV DOSE  pH
    # 11  1    0  0   50 4.6
    # 1   1    1  5   50 4.6
    # 2   1    5 10   50 4.6
    # 31  2    0  0  100 6.0
    # 3   2    1  6  100 6.0
    # 4   2    7 10  100 6.0
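The same steps can also be wrapped in a small helper that picks columns by name instead of by position. This is only a sketch, not part of the answer above: the function name add_zero_rows and its arguments are hypothetical. It uses the first row of each ID as the template, so it assumes the remaining columns (here DOSE and pH) are constant within an ID.

```r
# Hypothetical helper (not from the original answer): prepend one row per ID
# with the chosen columns set to zero; the other columns are copied from the
# first row of that ID.
add_zero_rows <- function(data, id = "ID", zero_cols = c("TIME", "DV")) {
  first <- data[!duplicated(data[[id]]), , drop = FALSE]  # first row of each ID
  first[zero_cols] <- 0                                    # zero out TIME and DV
  out <- rbind(first, data)                                # zero rows come first
  # order() leaves ties in their original order, so sorting by ID alone
  # keeps each zero row ahead of the original rows of the same ID
  out <- out[order(out[[id]]), , drop = FALSE]
  rownames(out) <- NULL
  out
}

df <- read.table(text = "ID TIME DV DOSE pH
1 1 5 50 4.6
1 5 10 50 4.6
2 1 6 100 6.0
2 7 10 100 6.0", header = TRUE)

res <- add_zero_rows(df)
```

Unlike the unique() step, this adds exactly one row per ID even if DOSE or pH were to differ between rows of the same ID.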

Here's another possible solution using data.table:

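    library(data.table)
    # duplicate the first row of each ID, index the rows within each ID,
    # then zero out columns 2:3 (TIME, DV) on the first row of each ID
    setDT(df)[, .SD[c(1L, seq_len(.N))], ID][
      , indx := seq_len(.N), ID][
      indx == 1L, 2:3 := 0][]
    #    ID TIME DV DOSE  pH indx
    # 1:  1    0  0   50 4.6    1
    # 2:  1    1  5   50 4.6    2
    # 3:  1    5 10   50 4.6    3
    # 4:  2    0  0  100 6.0    1
    # 5:  2    1  6  100 6.0    2
    # 6:  2    7 10  100 6.0    3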

I changed the indexing from c(.N+1, 1:.N) to c(1L, 1:.N) (from @David Arenburg's post), as this is simpler :-)

    library(data.table)
    setDT(df)[, .SD[c(1L, 1:.N)], by = ID][
      , 2:3 := .SD * (!duplicated(.SD, fromLast = TRUE)) + 0L, .SDcols = 2:3][]
    #    ID TIME DV DOSE  pH
    # 1:  1    0  0   50 4.6
    # 2:  1    1  5   50 4.6
    # 3:  1    5 10   50 4.6
    # 4:  2    0  0  100 6.0
    # 5:  2    1  6  100 6.0
    # 6:  2    7 10  100 6.0

Or you can use set, which updates by reference (useful if there are many columns):

    DT <- setDT(df)[, .SD[c(1L, 1:.N)], by = ID]
    indx <- DT[, !duplicated(.SD, fromLast = TRUE), .SDcols = 2:3]
    for (j in 2:3) {
      set(DT, i = NULL, j = j, value = DT[[j]] * (indx + 0L))
    }

A brief approach using plyr:

    library(plyr)
    ldply(split(df, df$ID), function(u) {
      x <- u[1, ]                 # template: first row of this ID
      x[c("DV", "TIME")] <- 0
      rbind(x, u)
    })
    #   .id ID TIME DV DOSE  pH
    # 1   1  1    0  0   50 4.6
    # 2   1  1    1  5   50 4.6
    # 3   1  1    5 10   50 4.6
    # 4   2  2    0  0  100 6.0
    # 5   2  2    1  6  100 6.0
    # 6   2  2    7 10  100 6.0
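If loading plyr is not desirable, the same split-apply-combine idea can be sketched in base R with split, lapply and do.call(rbind, ...). This is an illustrative variant rather than the answer's own code; the names pieces and res are arbitrary, and df is recreated here as a plain data.frame so the block runs on its own.

```r
df <- read.table(text = "ID TIME DV DOSE pH
1 1 5 50 4.6
1 5 10 50 4.6
2 1 6 100 6.0
2 7 10 100 6.0", header = TRUE)

# For each ID: copy the first row, zero TIME and DV, and put it on top
pieces <- lapply(split(df, df$ID), function(u) {
  x <- u[1, ]
  x[c("TIME", "DV")] <- 0
  rbind(x, u)
})

# Stack the per-ID pieces back into one data frame
res <- do.call(rbind, pieces)
rownames(res) <- NULL
```

This variant does not produce the extra .id column that ldply adds.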

Source: https://habr.com/ru/post/1213135/

