Calculate the percentage of the start of intervention in R

I have a data frame that has the strength of a vertical jump for a group of athletes over an 8 week period. One athlete passed only seven weeks. I want to calculate a new variable called "percent.change", which calculates the percentage difference from the first week for each week for the study period. I tried to use dplyr to figure this out, but I was stuck. I am wondering if anyone has a direct solution.

The data frame is called weeklyPower. An example of a weekly meal below:

athlt week power E 1 25.20015 E 2 25.54569 E 3 24.52463 E 4 24.88044 E 5 25.11421 E 6 25.86154 E 7 26.08613 E 8 25.90775 K 1 29.74277 K 2 28.80131 K 3 28.96818 K 4 29.62439 K 5 29.98119 K 6 29.11570 K 7 29.96380 T 1 25.02413 T 2 23.75867 T 3 25.25716 T 4 24.73285 T 5 27.02891 T 6 25.60140 T 7 25.64665 T 8 24.38937 

Thanks so much for any thoughts.

Matt

+5
source share
4 answers

Using dplyr , you can do:

 df %>% group_by(athlt) %>% arrange(week) %>% mutate(cp = power / first(power) * 100) 

What gives:

 #Source: local data frame [23 x 4] #Groups: athlt # athlt week power cp #1 E 1 25.20015 100.00000 #2 E 2 25.54569 101.37118 #3 E 3 24.52463 97.31938 #4 E 4 24.88044 98.73132 #5 E 5 25.11421 99.65897 #6 E 6 25.86154 102.62455 #7 E 7 26.08613 103.51577 #8 E 8 25.90775 102.80792 #9 K 1 29.74277 100.00000 #10 K 2 28.80131 96.83466 

Or another option:

 df %>% group_by(athlt) %>% mutate(cp = power / power[which.min(week)] * 100) 
+7
source

Here is a similar option with data.table , where we convert data.frame to data.table ( setDT(df) ), for each group "athlt" we order for "week" and create a column 'cp' for quick assignment ( := ) link.

 library(data.table) setDT(df)[order(week), cp := power/power[1L]*100 ,by=athlt] # athlt week power cp #1: E 1 25.20015 100.00000 #2: E 2 25.54569 101.37118 #3: E 3 24.52463 97.31938 #4: E 4 24.88044 98.73132 #5: E 5 25.11421 99.65897 #6: E 6 25.86154 102.62455 

We can also use setorder , which is usually memory efficient as it reorders the data set by reference. But in this case (as pointed out in the comments of @Arun), the above method will also be very efficient, since order only calculates indexes instead of reordering the entire data set.

  setorder(setDT(df),athlt,week)[, cp:= power/power[1L] *100, athlt][] 

Or, if the "week" is numeric, you can use which.min without using order

  setDT(df)[, cp := power/power[which.min(week)]*100, by=athlt] 
+4
source
 > Data$percent_change <- unlist( + tapply(Data$power, Data$athlt, function(x) c(NA, 100*x[-1]/x[1]) ) + ) > Data athlt week power percent_change 1 E 1 25.20015 NA 2 E 2 25.54569 101.37118 3 E 3 24.52463 97.31938 4 E 4 24.88044 98.73132 5 E 5 25.11421 99.65897 6 E 6 25.86154 102.62455 7 E 7 26.08613 103.51577 8 E 8 25.90775 102.80792 9 K 1 29.74277 NA 10 K 2 28.80131 96.83466 11 K 3 28.96818 97.39570 12 K 4 29.62439 99.60199 13 K 5 29.98119 100.80161 14 K 6 29.11570 97.89169 15 K 7 29.96380 100.74314 16 T 1 25.02413 NA 17 T 2 23.75867 94.94304 18 T 3 25.25716 100.93122 19 T 4 24.73285 98.83600 20 T 5 27.02891 108.01139 21 T 6 25.60140 102.30685 22 T 7 25.64665 102.48768 23 T 8 24.38937 97.46341 
+1
source

Sorry for my belated answer, but I found this helpful, especially with the use of a check period. Hope this helps.

 df %>% group_by(athlt) %>% arrange(athlt, week) %>% # first arrange data to get series and then count power per athlete and week mutate(wk.growth = round(power/lag(power,1)*100, 1), # creat weekly delta based on the week before ref.growth = round((power / first(power)*100), 1), # creat weekly delta based on the reference week n.week = length(athlt)) %>% replace_na(list(wk.growth=100)) %>% # replace NA with 100 to calculate changes mutate(d.growth = wk.growth - first(wk.growth), # add change rates d.ref.growth = ref.growth - first(ref.growth) # add change rates with reference week ) 
0
source

All Articles