Find the probability density of a new data point using the density function in R

Question

I am trying to estimate the PDF of continuous data with an unknown distribution using the density function in R. Given a new data point, I want to find its probability density based on the kernel density estimate that density returns. How can I do this?

+7

r probability

programmingIsFun Jan 21 '15 at 21:45

3 answers

Glen_b · Answer 1 · 2015-01-21T22:46:18+0000

If your new point lies within the range of values produced by density, this is fairly easy to do. I would suggest using approx (or approxfun if you need it as a function) to interpolate between the grid values.

Here is an example:

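 set.seed(2937107)
 x <- rnorm(10, 30, 3)             # small sample to estimate from
 dx <- density(x)                  # KDE on a grid: dx$x (points), dx$y (densities)
 xnew <- 32.137                    # new point to evaluate
 approx(dx$x, dx$y, xout = xnew)   # linear interpolation between grid values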

If we plot the density and the new point, we can see that it does what you need:

[Figure: the estimated density curve with the new point marked on it]

This will return NA if the new value lies outside the grid and would need to be extrapolated. If you want to handle extrapolation, I would suggest computing the KDE directly at that point (using the bandwidth from the KDE you already have).
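As a rough sketch of that direct calculation, assuming the default Gaussian kernel was used: the helper name kde_at and the test point far_out are made up for illustration, and the sample is the one from the example above.

```r
# Sketch: evaluate a Gaussian KDE directly at arbitrary points, reusing the
# bandwidth that density() selected. kde_at is a hypothetical helper name.
set.seed(2937107)
x  <- rnorm(10, 30, 3)
dx <- density(x)                 # default kernel is Gaussian

kde_at <- function(t, data, bw) {
  # average of Gaussian kernels centred on each data point, one value per t
  sapply(t, function(ti) mean(dnorm(ti, mean = data, sd = bw)))
}

far_out <- max(dx$x) + 1         # a point just beyond the grid from density()
approx(dx$x, dx$y, xout = far_out)$y   # NA: outside the interpolation range
kde_at(far_out, x, dx$bw)              # small but positive density
```

Inside the grid the two approaches should agree closely, so the direct version is only really needed for points that approx cannot reach.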

Antoine · Answer 2 · 2016-01-08T17:02:10+0000

This is a year late, but here is a complete solution nonetheless. Call

 d <- density(xs)

and define h = d$bw. Your KDE estimate is fully determined by

the elements of xs,
the bandwidth h,
the type of kernel function.

Given a new value t, you can compute the corresponding y(t) with the following function, which assumes that Gaussian kernels were used for the estimate.

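 myKDE <- function(t) {
   # one kernel contribution per data point in xs
   kernelValues <- rep(0, length(xs))
   for (i in seq_along(xs)) {
     transformed <- (t - xs[i]) / h
     kernelValues[i] <- dnorm(transformed, mean = 0, sd = 1) / h
   }
   # average the contributions to get the density estimate at t
   return(sum(kernelValues) / length(xs))
 }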

What myKDE does is compute y(t) directly from the definition of the kernel density estimate.
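For reference, the usual definition is y(t) = (1 / (n h)) * sum over i of K((t - x_i) / h), with K the standard normal density in the Gaussian case. A quick sanity check is to evaluate that formula on the grid density() itself returns and compare. This is only a sketch: the sample xs is simulated here as an assumption (the answer does not say what xs contains), and kde_def is an illustrative name for a vectorised form of the same calculation.

```r
# Assumed sample; xs and h mirror the names used in the answer.
set.seed(1)
xs <- rnorm(200)
d  <- density(xs)                # Gaussian kernel by default
h  <- d$bw

# Same definition, vectorised over the data:
# (1/n) * sum_i dnorm((t - x_i) / h) / h
kde_def <- function(t) mean(dnorm((t - xs) / h) / h)

# Evaluate at density()'s own grid points and compare with its output
y_def <- sapply(d$x, kde_def)
max(abs(y_def - d$y))            # small; density() uses a binned approximation
```

The two will not match to machine precision, because density() computes its grid values with a binned approximation, while the function above evaluates the sum directly.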

bill_e · Answer 3 · 2015-01-21T21:59:41+0000

See: docs

 dnorm(data_point, its_mean, its_stdev)
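Note that this only applies if a normal model is reasonable for the data, which the question does not assume. Under that assumption, a minimal sketch would estimate the mean and standard deviation from the sample and plug them in; x and xnew below are illustrative values borrowed from the first answer.

```r
# Sketch assuming the data are roughly normal; x and xnew are illustrative.
set.seed(2937107)
x    <- rnorm(10, 30, 3)
xnew <- 32.137

# density of xnew under a normal fitted by the sample mean and sd
fitted_density <- dnorm(xnew, mean = mean(x), sd = sd(x))
fitted_density
```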
