R: problem with mle() error: non-finite finite-difference value [2]

I have a simple x, y data.frame.

    mydata <- data.frame(
        days = 1:96,
        risk = c(5e-09, 5e-09, 5e-09, 1e-08, 4e-08, 6e-08, 9e-08, 1.5e-07,
                 4.2e-07, 7.2e-07, 1.02e-06, 1.32e-06, 1.66e-06, 2.19e-06, 2.76e-06, 3.32e-06,
                 3.89e-06, 4.55e-06, 5.8e-06, 7.16e-06, 8.51e-06, 9.85e-06, 1.138e-05, 1.396e-05,
                 1.672e-05, 1.947e-05, 2.222e-05, 2.521e-05, 2.968e-05, 3.439e-05, 3.909e-05, 4.378e-05,
                 4.894e-05, 5.697e-05, 6.546e-05, 7.392e-05, 8.236e-05, 9.16e-05, 0.00010573, 0.00012063,
                 0.00013547, 0.00015025, 0.00016642, 0.00019127, 0.00021743, 0.00024343, 0.00026924, 0.00029818,
                 0.00034681, 0.00039832, 0.00044932, 0.00049976, 0.0005451, 0.00056293, 0.00057586, 0.00058838,
                 0.0006005, 0.00061562, 0.00065079, 0.00068845, 0.00072508, 0.00076062, 0.00079763, 0.00084886,
                 0.00090081, 0.0009507, 0.00099844, 0.00104427, 0.00108948, 0.00113175, 0.00117056, 0.00120576,
                 0.00123701, 0.00126253, 0.00128269, 0.00129757, 0.00130716, 0.00131291, 0.00132079, 0.0013216,
                 0.00131392, 0.00129806, 0.00127247, 0.00122689, 0.00117065, 0.00110696, 0.00103735, 0.00095951,
                 0.00085668, 0.0007517, 0.00065083, 0.000556, 0.0004669, 0.00037675, 0.00029625, 0.00093289)
    )

I think a Weibull(shape = 3, scale = 0.155) distribution is very suitable for my data, judging by the chart below.

    plot(1:96, dweibull(mydata$risk, shape = 3, scale = 0.155),
         type = "l", xlab = "days", ylab = "risk")
    lines(mydata, type = "l", col = "grey")
    legend("topleft", c("Data", "Estimate"), col = c("black", "grey"), lty = c(1, 1))

[plot of the data and the Weibull curve]

I am writing a function that computes the negative log-likelihood, to be passed to mle().

    estimate <- function(kappa, lambda) {
        -sum(dweibull(mydata$risk, shape = kappa, scale = lambda, log = TRUE))
    }

I call mle(), provide my initial parameter estimates, and get the following error.

    > mle(estimate, start = list(kappa = 3, lambda = 0.155))
    Error in optim(start, f, method = method, hessian = TRUE, ...) :
      non-finite finite-difference value [2]
    In addition: There were 50 or more warnings (use warnings() to see the first 50)

What is wrong here?

Tags: r, statistics, estimation, mle
1 answer

What are you trying to do? From what I can tell, you have a dataset of 96 risk values and you want to fit a Weibull distribution to it. Note that in that case the days are not relevant at all: you have an unordered vector of values.
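As an aside (my addition, not part of the original answer): if all you want is the Weibull MLE for a vector of positive values, MASS::fitdistr can do the whole fit in one call. It may emit warnings on awkward data like this, but it should land on essentially the same estimates as the maxLik run below:

    # hedged sketch: direct Weibull maximum-likelihood fit via MASS
    MASS::fitdistr(mydata$risk, "weibull")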

The picture above is misleading. You compute dweibull() of the risk values themselves, so the figure only shows that dweibull(risk) is approximately equal to risk. That is a completely different statement from saying that a Weibull with the given parameters fits the data well.

For example, here is the distribution of your data:

    hist(mydata$risk, breaks = 15)

while the Weibull density with your parameters over the corresponding range looks like this:

    curve(dweibull(x, shape = 3, scale = 0.155), 0, 0.0014)

These two distributions are very different. I would say your empirical distribution looks like a uniform plus a point mass near zero, not a Weibull.
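If you want a number instead of eyeballing histograms (my addition, not part of the original answer), a Kolmogorov-Smirnov test against the proposed Weibull makes the mismatch explicit; the tied values in risk will trigger a warning about ties:

    # compare the empirical distribution of risk with a
    # Weibull(shape = 3, scale = 0.155) CDF
    ks.test(mydata$risk, "pweibull", shape = 3, scale = 0.155)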

Now to your actual problem: because the distribution fits so poorly, the optimizer runs into numerical trouble. I don't know mle() too well, but with a small tweak, maxLik::maxLik() solves the problem:

    estimate <- function(par) {
        Kappa  <- par[1]
        Lambda <- par[2]
        # return the per-observation log-likelihood contributions
        # (a vector, not a sum), as the BHHH method requires
        dweibull(mydata$risk, shape = Kappa, scale = Lambda, log = TRUE)
    }
    summary(maxLik::maxLik(estimate, start = c(Kappa = 3, Lambda = 0.155),
                           method = "BHHH"))

gives you

    --------------------------------------------
    Maximum Likelihood estimation
    BHHH maximisation, 43 iterations
    Return code 2: successive function values within tolerance limit
    Log-Likelihood: 682.743
    2  free parameters
    Estimates:
            Estimate   Std. error t value  Pr(> t)
    Kappa   0.4849129  0.0473720  10.236  < 2e-16 ***
    Lambda  0.0002953  0.0001028   2.873  0.00407 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    --------------------------------------------

Note that I made one significant change: I removed the sum from the log-likelihood, so the function returns per-observation contributions, and I used the BHHH optimizer. This is usually more stable than optimizing over a single aggregated likelihood value. You should also seriously consider supplying analytic derivatives for estimate.
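For the Weibull this is easy: writing z = (x/lambda)^kappa, the log-density log f(x) = log(kappa) - kappa*log(lambda) + (kappa - 1)*log(x) - z has derivative 1/kappa + log(x/lambda)*(1 - z) with respect to kappa and (kappa/lambda)*(z - 1) with respect to lambda. Here is a sketch of how these could be passed to maxLik via its grad argument (my addition, not part of the original answer):

    # observation-wise analytic gradient of the Weibull log-likelihood;
    # each row is one observation, which is what BHHH needs
    estimate_grad <- function(par) {
        Kappa  <- par[1]
        Lambda <- par[2]
        x <- mydata$risk
        z <- (x / Lambda)^Kappa
        cbind(Kappa  = 1 / Kappa + log(x / Lambda) * (1 - z),
              Lambda = (Kappa / Lambda) * (z - 1))
    }
    summary(maxLik::maxLik(estimate, grad = estimate_grad,
                           start = c(Kappa = 3, Lambda = 0.155), method = "BHHH"))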

You can check that, with the estimated parameters, the distributions look much more similar.
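For instance (my addition, using the estimates from the output above), overlay the fitted density on a density-scaled histogram of the data:

    hist(mydata$risk, breaks = 15, freq = FALSE)
    curve(dweibull(x, shape = 0.4849129, scale = 0.0002953),
          add = TRUE, col = "red")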
