I created a survey. Since survey weights can lead to very large deviations, I am following a tip from many statistics books: I want to truncate the top 5% and bottom 5% of the survey weights. I would like to use dplyr for this.
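library(dplyr)
# Simulated data: 2000 respondents with normally distributed weights
data <- data.frame(id = seq_len(2000), weight = rnorm(2000, mean = 3.16, sd = 1.355686))
data2 <- data %>%
  mutate(perc.weight = percent_rank(weight)) %>%
  # flag weights in the top or bottom 5%
  mutate(outside = perc.weight > 0.95 | perc.weight < 0.05)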
After that, I have two new variables. The first variable gives the percentile rank of each weight. The second variable indicates whether the value falls outside the target range.
Now I want to replace the weights above the 95th percentile and the weights below the 5th percentile with the weight values that form the borders of these percentiles.
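To make the goal concrete, here is a minimal sketch of the kind of result I am after. It assumes that the border values can be taken from `quantile()` with its default settings, and the names `bounds`, `data3` and `weight.trunc` are only placeholders for illustration. I am not sure this is the recommended way to do it.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
data <- data.frame(id = seq_len(2000),
                   weight = rnorm(2000, mean = 3.16, sd = 1.355686))

# Border values: 5th and 95th percentiles (quantile() default, type 7)
bounds <- quantile(data$weight, probs = c(0.05, 0.95))

# Clamp every weight into [lower border, upper border];
# weight.trunc is a hypothetical column name
data3 <- data %>%
  mutate(weight.trunc = pmin(pmax(weight, bounds[[1]]), bounds[[2]]))
```

Weights inside the 5%-95% range would stay unchanged, and the original `weight` column is kept so both versions can be compared.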
I would be grateful for any help!