Counting the points that lie above and below the regression line in R

How can I count the number of points lying above and below the regression line in a scatter plot?

    data = read.csv("info.csv")
    par(pty = "s")
    plot(data$col1, data$col2, xlab = "xaxis", ylab = "yaxis", xlim = c(0, 1),
         cex.lab = 1.5, cex.axis = 1.5, ylim = c(0, 1), col.lab = "red",
         col = "blue", pch = 19)
    abline(a = -1.21, b = 2.21)
2 answers
    x <- 1:10
    set.seed(1)
    y <- 2*x+rnorm(10)
    plot(y~x)
    fit <- lm(y~x)
    abline(fit)
    resi <- resid(fit)
    #below the fit:
    sum(resi < 0)
    #above the fit:
    sum(resi > 0)
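This works because `resid(fit)` is the observed value minus the fitted value, so its sign tells you which side of the line a point falls on. A minimal sketch that checks this by hand, assuming a small made-up data set (the values of `x` and `y` below are illustrative only and do not come from the question):

```r
# Made-up data for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit <- lm(y ~ x)

# Residuals are observed values minus fitted values
manual_resi <- y - fitted(fit)
same <- all.equal(unname(resid(fit)), unname(manual_resi))

# Count points on each side of the fitted line
n_above <- sum(resid(fit) > 0)
n_below <- sum(resid(fit) < 0)
```

Here `same` should be `TRUE`, and `n_above` and `n_below` hold the two counts.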

Edit: If you (for some unknown reason) drew the line with hard-coded coefficients, like this:

    x <- 1:10
    set.seed(1)
    y <- 2*x+rnorm(10)
    plot(y~x)
    abline(-0.17,2.05)

You can then count the points like this:

    yfit <- 2.05 * x - 0.17
    resi <- y - yfit
    sum(resi < 0)
    sum(resi > 0)

If I read the question correctly, the answer is as follows:

  • Define the equation of the regression line. It is a straight line of the form y = mx + b, where m is the slope and b is the y-intercept.
  • Calculate the y value of the line for each x value in your data.
  • Compare each observed y value with the calculated one to determine whether it is greater than, equal to, or less than the value on the line.

The steps above should be sufficient to find the counts you are after.
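The three steps can be sketched in R roughly as follows. The slope and intercept are taken from the `abline(a = -1.21, b = 2.21)` call in the question; the data frame is a made-up stand-in for `info.csv`, and the tolerance `tol` used to decide that a point lies on the line is an assumption you may want to adjust:

```r
# Hypothetical stand-in for read.csv("info.csv"); the values are made up
data <- data.frame(
  col1 = c(0.60, 0.70, 0.80, 0.90, 1.00),
  col2 = c(0.30, 0.20, 0.558, 0.90, 0.50)
)

# Step 1: the line from the question, abline(a = -1.21, b = 2.21)
intercept <- -1.21
slope <- 2.21

# Step 2: y value of the line at each observed x
y_line <- intercept + slope * data$col1

# Step 3: compare observed y with the line.
# tol is an assumed tolerance so floating-point noise does not
# misclassify points that sit exactly on the line.
tol <- 1e-8
delta <- data$col2 - y_line
counts <- c(
  above = sum(delta > tol),
  below = sum(delta < -tol),
  on    = sum(abs(delta) <= tol)
)
print(counts)
```

Counting the "on" case separately keeps the three totals adding up to the number of rows, which is a quick sanity check on real data.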


Source: https://habr.com/ru/post/927246/

