Find the probability density of a new data point using the density function in R

I am trying to find the best PDF for continuous data with an unknown distribution using the density function in R. Now, given a new data point, I want to find its probability density based on the kernel density estimate that density returns. How can I do this?

+7
r probability
3 answers

If your new point lies within the range of grid values created by density, this is fairly easy to do. I would suggest using approx (or approxfun if you need the result as a function) to interpolate between the grid values.

Here is an example:

    set.seed(2937107)
    x <- rnorm(10, 30, 3)
    dx <- density(x)
    xnew <- 32.137
    approx(dx$x, dx$y, xout = xnew)

Plotting the density together with the new point shows that this does what you need:

[Figure: plot of the estimated density with the new point marked]

This will return NA if the new value lies outside the grid, that is, if it would need to be extrapolated. If you want to handle extrapolation, I would suggest computing the KDE directly at that point, using the bandwidth from the KDE you already have.
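As a minimal sketch of the approxfun variant mentioned above (the names dens_fun and dens_fun0 are illustrative and not part of the original answer), the interpolator can be built once and reused. With the default rule = 1, points outside the grid come back as NA. Setting yleft and yright to 0 is only a rough approximation, since a Gaussian KDE is small but not exactly zero out there.

```r
set.seed(2937107)
x  <- rnorm(10, 30, 3)
dx <- density(x)

# Interpolating function over the density grid (hypothetical name dens_fun)
dens_fun <- approxfun(dx$x, dx$y)

dens_fun(mean(x))           # inside the grid: linearly interpolated density
dens_fun(max(dx$x) + 1)     # outside the grid: NA, because rule = 1 by default

# Assumption: treating points outside the grid as having density 0
dens_fun0 <- approxfun(dx$x, dx$y, yleft = 0, yright = 0)
dens_fun0(max(dx$x) + 1)
```

If exact values in the tails matter, evaluating the kernel sum directly at the point is the safer route.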

+4

This is a year late, but here is a complete solution nevertheless. Call

 d <- density(xs) 

and define h = d$bw. Your KDE estimate is then fully determined by

  • the elements of xs,
  • the bandwidth h,
  • the type of kernel function.
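For reference, the standard kernel density estimate built from these three ingredients can be written as follows. The symbols n (the number of elements of xs), x_i (the i-th element) and K (the kernel) are notation introduced here, not taken from the original answer.

```latex
y(t) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{t - x_i}{h}\right),
\qquad
K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2} \quad \text{(Gaussian kernel)}
```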

Given a new value t, you can calculate the corresponding y(t) with the following function, which assumes that you used Gaussian kernels for the estimate.

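    myKDE <- function(t) {
      kernelValues <- rep(0, length(xs))
      for (i in seq_along(xs)) {
        # Gaussian kernel centred at xs[i], scaled by the bandwidth h
        transformed <- (t - xs[i]) / h
        kernelValues[i] <- dnorm(transformed, mean = 0, sd = 1) / h
      }
      # Average the kernel contributions over all data points
      return(sum(kernelValues) / length(xs))
    }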

myKDE computes y(t) directly from the definition of the kernel density estimate.
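The same calculation can be sketched in vectorized form and checked against the grid returned by density. The helper name kde_at and the simulated data are assumptions made for illustration only. Small differences from d$y are expected, because density computes its grid values with a binned approximation rather than the exact sum.

```r
set.seed(1)
xs <- rnorm(100, 30, 3)   # illustrative data, not from the original answer
d  <- density(xs)         # Gaussian kernel by default
h  <- d$bw

# Gaussian KDE at one or more points: the average of normal densities
# centred at each xs[i] with standard deviation h (hypothetical helper name)
kde_at <- function(t) {
  sapply(t, function(ti) mean(dnorm(ti, mean = xs, sd = h)))
}

kde_at(32.137)              # density at a single new point
kde_at(max(xs) + 10 * h)    # also defined outside the grid (tiny, but not NA)

# Largest absolute difference from the grid values computed by density()
max(abs(kde_at(d$x) - d$y))
```

Because kde_at does not depend on the grid, it handles the extrapolation case that approx cannot.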

+2

See: docs

 dnorm(data_point, its_mean, its_stdev) 
-2