R: How to fit a large dataset with a combination of distributions?

To fit a data set of real numbers (x) with a single distribution, such as the gamma or Student's t distribution, we can use MASS as follows:

fitdistr(x, "gamma")

or

fitdistr(x2, "t")

What if I believe that my data set should match the sum of the gamma and t distributions?

P(X) = Gamma(x) + t(x)

Can I fit the parameters of a mixture of probability distributions using the maximum likelihood method in R?

2 answers

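For most distributions, fitdistr() in R fits the parameters by numerically optimizing the log-likelihood, which it does by calling optim(). If you believe your data is a mixture of Gamma and t distributions, write a likelihood function that describes that mixture and pass it to optim() yourself. Here is an example of the approach, fitting a single normal distribution: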

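library( MASS )

# Simulate data from a standard normal distribution
vals = rnorm( n = 10000, mean = 0, sd = 1 )
print( summary(vals) )

# Negative log-likelihood of the normal model; params = c(mean, sd)
ll_func = function(params) {
   log_probs = dnorm( x = vals, mean = params[1], sd = params[2], log = TRUE )
   tot = sum(log_probs)
   return(-1 * tot)
}

# Starting guess for (mean, sd)
params = c( 0.5, 10 )

print( ll_func(params) )
res = optim( params, ll_func )
print( paste("mean:", res$par[1]) )
print( paste("sd:  ", res$par[2]) )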

The output in R:

[1] "mean: 0.0223766157516646"
[1] "sd:   0.991566611447471"

This is quite close to the true values of mean = 0 and sd = 1.
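As a quick cross-check, the hand-written optim() fit can be compared with fitdistr() on the same kind of data. This is only a sketch: the seed and the helper names nll, by_optim and by_fitdistr are made up for illustration and are not part of the original answer.

```r
library(MASS)

set.seed(1)   # arbitrary seed, so the numbers differ from the output above
vals = rnorm(n = 10000, mean = 0, sd = 1)

# Negative log-likelihood of the normal model; params = c(mean, sd)
nll = function(params) {
   -sum(dnorm(vals, mean = params[1], sd = params[2], log = TRUE))
}

# Same starting guess as in the example above
by_optim = optim(c(0.5, 10), nll)$par

# fitdistr() returns the maximum likelihood estimates directly
by_fitdistr = fitdistr(vals, "normal")$estimate

print(rbind(optim = by_optim, fitdistr = by_fitdistr))
```

The two rows should agree to a few decimal places, which suggests the hand-written likelihood is set up correctly before it is extended to a more complicated model.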

Be careful when fitting many parameters at once: with a lot of free parameters, you need to worry about overfitting.
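The same idea might be extended to the Gamma plus t case from the question roughly as follows. This is a sketch under several assumptions that are not in the original answer: the mixture is written as a weighted sum w * Gamma + (1 - w) * t so that it is a valid density, the t component has a location m and a scale s, the data are simulated, and the names unpack, mix_nll and start are hypothetical. The parameters are transformed so that optim() can search without constraints.

```r
set.seed(42)

# Simulated data (an assumption for illustration): 60% Gamma(shape = 2, rate = 1),
# 40% Student t with 5 degrees of freedom, shifted to 8 and scaled by 1.5
n = 5000
from_gamma = runif(n) < 0.6
x = ifelse(from_gamma,
           rgamma(n, shape = 2, rate = 1),
           8 + 1.5 * rt(n, df = 5))

# Map unconstrained optimizer parameters to valid model parameters
unpack = function(p) {
   list(w     = plogis(p[1]),   # mixing weight in (0, 1)
        shape = exp(p[2]),      # Gamma shape > 0
        rate  = exp(p[3]),      # Gamma rate > 0
        m     = p[4],           # t location
        s     = exp(p[5]),      # t scale > 0
        df    = exp(p[6]))      # t degrees of freedom > 0
}

# Negative log-likelihood of the weighted mixture
mix_nll = function(p) {
   q = unpack(p)
   dens = q$w * dgamma(x, shape = q$shape, rate = q$rate) +
          (1 - q$w) * dt((x - q$m) / q$s, df = q$df) / q$s
   nll = -sum(log(dens))
   if (!is.finite(nll)) 1e10 else nll   # large penalty for invalid regions
}

# Starting guess: equal weights, unit Gamma, t centred on the sample median
start = c(0, 0, 0, median(x), 0, log(10))
res = optim(start, mix_nll, method = "BFGS")
print(unpack(res$par))
```

With six free parameters the result can depend on the starting values, so it is worth trying a few different starts and comparing res$value across them.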

