I'm not sure I understood the question correctly, but I will try. You have three distributions, D1, D2 and D3, and you would like to create variables that each mix two of the three, using a different pair each time.
Since I don't know how the distributions should be combined, I used flags drawn from a binomial distribution (a vector of 200 zeros and ones) to decide which distribution each value is taken from (you can change that if it is not what you want).
D1 <- rnorm(200, 2, 1)
D2 <- rnorm(200, 3, 1)
D3 <- rnorm(200, 1.5, 2)
To create a mixed distribution, we can use the rbinom function to create a vector of 1s and 0s with the chosen probability. This way each variable contains some values from both distributions.
var_1_flag <- rbinom(200, size = 1, prob = 0.3)
var_1 <- var_1_flag * D1 + (1 - var_1_flag) * D2
var_2_flag <- rbinom(200, size = 1, prob = 0.7)
var_2 <- var_2_flag * D2 + (1 - var_2_flag) * D3
var_3_flag <- rbinom(200, size = 1, prob = 0.6)
var_3 <- var_3_flag * D1 + (1 - var_3_flag) * D3
To find out which values come from which distribution, you can do the following:
var_1[var_1_flag == 1] # The values in the mixed distribution that come from the first distribution (D1)
var_1[var_1_flag == 0] # The values in the mixed distribution that come from the second distribution (D2)
Since doing this by hand is a little tedious, and I assume you may want to change the parameters, you can use the function below to build the same kind of mixture:
create_distr <- function(observations, mean1, sd1, mean2, sd2, flag_prob) {
  # 1 = take the value from the first distribution, 0 = from the second
  flag <- rbinom(observations, size = 1, prob = flag_prob)
  # Return the mixture explicitly instead of relying on the value of an assignment
  flag * rnorm(observations, mean1, sd1) + (1 - flag) * rnorm(observations, mean2, sd2)
}
var_1 <- create_distr(200, 2, 1, 3, 1, 0.5)
var_2 <- create_distr(200, 3, 1, 1.5, 2, 0.7)
var_3 <- create_distr(200, 2, 1, 1.5, 2, 0.6)
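If you also need to know which distribution each value came from, the helper can return the flag together with the values. This is only a minimal sketch and not part of the function above: the name `create_distr_flagged` and the list fields `values` and `flag` are my own choices for illustration.

```r
# Hypothetical variant of create_distr() that also returns the flag vector
create_distr_flagged <- function(observations, mean1, sd1, mean2, sd2, flag_prob) {
  # 1 = take the value from the first distribution, 0 = from the second
  flag <- rbinom(observations, size = 1, prob = flag_prob)
  values <- flag * rnorm(observations, mean1, sd1) +
    (1 - flag) * rnorm(observations, mean2, sd2)
  list(values = values, flag = flag)
}

set.seed(42)  # arbitrary seed, only for reproducibility
res <- create_distr_flagged(200, 2, 1, 3, 1, 0.3)
from_first  <- res$values[res$flag == 1]  # values drawn from the first distribution
from_second <- res$values[res$flag == 0]  # values drawn from the second distribution
length(from_first) + length(from_second)  # every value belongs to exactly one group
```

Returning a list keeps the values and their origin together, so you do not have to carry a separate flag vector around for each variable.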
If you want to have more than two variables (distributions) for the mix, you can extend the code you provided as follows:
N <- 100000
# Sample N random uniforms U
U <- runif(N)
# Variable to store the samples from the mixture distribution
rand.samples <- rep(NA, N)
for (i in 1:N) {
  if (U[i] < 0.3) {
    rand.samples[i] <- rnorm(1, 1, 3)
  } else if (U[i] < 0.5) {
    rand.samples[i] <- rnorm(1, 2, 5)
  } else if (U[i] < 0.8) {
    rand.samples[i] <- rnorm(1, 5, 2)
  } else {
    rand.samples[i] <- rt(1, 2)
  }
}
This way, each element is drawn from exactly one of the distributions. If you want the same result without generating the elements one at a time, you can do the following:
N <- 100000
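A minimal sketch of one loop-free way to do this, assuming the same weights and parameters as in the loop: draw each component once for all N positions, then pick one component per element according to U. The names D1 to D4 are my own labels for these four components; they are not the 200-value vectors defined at the top.

```r
N <- 100000
U <- runif(N)

# Draw every component once, for all N positions (assumed naming: D1..D4)
D1 <- rnorm(N, 1, 3)
D2 <- rnorm(N, 2, 5)
D3 <- rnorm(N, 5, 2)
D4 <- rt(N, 2)

# Pick one component per element, using the same cut points as the loop
rand.samples <- ifelse(U < 0.3, D1,
                ifelse(U < 0.5, D2,
                ifelse(U < 0.8, D3, D4)))
```

This draws more random numbers than it keeps, which is wasteful but short, and it leaves the elements in their original order.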
This corresponds to 0.3 * Normal(1, 3) + 0.2 * Normal(2, 5) + 0.3 * Normal(5, 2) + 0.2 * Student's t (2 degrees of freedom), where the two numbers are the mean and standard deviation passed to rnorm.
If you want to create two mixtures, and keep the same values from the first normal distribution in the second mixture, you can do the following:
# D1, D2 and D3 must each be drawn once and have the same length as U (N values)
mixture_1 <- c(D1[U < 0.3], D2[U >= 0.3])
mixture_2 <- c(D1[U < 0.3], D3[U >= 0.3])
This will use the same elements from Normal(1, 3) in both mixtures. The trick is not to recompute rnorm(N, 1, 3) every time you use it. In both cases, roughly 30% of the samples come from the first normal (D1) and roughly 70% from the second distribution. For example:
set.seed(1)
N <- 100000
U <- runif(N)
> prop.table(table(U < 0.3))
 FALSE   TRUE
0.6985 0.3015
About 30% of the values in the vector U are below 0.3.
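To make the shared-component idea concrete, here is a small self-contained sketch. It assumes D1, D2 and D3 are each drawn once with N values (the parameters are borrowed from the loop example; substitute your own), and it checks that both mixtures contain exactly the same D1 values.

```r
set.seed(1)
N <- 100000
U <- runif(N)

# Draw each component once so that D1 can be reused in both mixtures
# (parameters are assumptions taken from the loop example)
D1 <- rnorm(N, 1, 3)
D2 <- rnorm(N, 2, 5)
D3 <- rnorm(N, 5, 2)

from_D1 <- U < 0.3  # TRUE where the value is taken from D1

mixture_1 <- c(D1[from_D1], D2[!from_D1])
mixture_2 <- c(D1[from_D1], D3[!from_D1])

# The leading block of both mixtures is the same set of D1 draws
n_D1 <- sum(from_D1)
identical(mixture_1[seq_len(n_D1)], mixture_2[seq_len(n_D1)])  # TRUE
```

Note that c() places all D1 values first, so the mixtures are not in the original element order; shuffle them with sample() if the order matters.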