Nonlinear Regression Line and R² in ggplot2

I have the following data:

dput(dat)
structure(list(Band = c(1930, 1930, 1930, 1930, 1930, 1930, 1930, 
1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930
), Reflectance = c(25.296494, 21.954657, 18.981184, 15.984661, 
14.381341, 12.485372, 10.592539, 8.51772, 7.601568, 7.075429, 
6.205453, 5.36646, 4.853167, 4.21576, 3.979639, 3.504217, 3.313851, 
2.288752), Number.of.Sprays = c(0, 1, 2, 3, 5, 6, 7, 9, 10, 11, 
14, 17, 19, 21, 27, 30, 36, 49), Legend = structure(c(4L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 5L
), .Label = c("1 x spray between each measurement", "2 x spray between each measurement", 
"3 x spray between each measurement", "Dry soil", "Wet soil"), class = "factor")), .Names =c("Band", 
"Reflectance", "Number.of.Sprays", "Legend"), row.names = c(NA, 
-18L), class = "data.frame")

This produces a graph (image not included here) with the following code:

g <- ggplot(dat, aes(Number.of.Sprays, Reflectance, colour = Legend)) +
    geom_point (size = 3) +
    geom_smooth (aes(group = 1, colour = "Trendline"), method = "loess", size = 1, linetype = "dashed", se = FALSE) +
    stat_smooth(method = "nls", formula = "y ~ a*x^b", start = list(a = 1, b = 1), se = FALSE)+
    theme_bw (base_family = "Times") +
    labs (title = "Regression between Number of Sprays and Reflectance in Band 1930") +
    xlab ("Number of Sprays") +
    guides (colour = guide_legend (override.aes = list(linetype = c(rep("blank", 4), "dashed", "blank"), shape = c(rep(16, 4), NA, 16)))) +
    scale_colour_manual (values = c("cyan", "green2", "blue", "brown",  "red", "purple")) +
    theme (legend.title = element_text (size = 15), legend.justification = c(1,1),legend.position = c(1,1), legend.background = element_rect (colour = "black", fill = "white"))

Note: I don't really understand the stat_smooth() line and the start values given in it; I just adapted it from another thread.

Now my questions and goals:

  • Is there a package or function that can give a rough estimate of which type of function fits my points best? Or do I need to try different formulas and see which one works best? The "Trendline" based on method = "loess" looks pretty good, but I don't know on what basis it is calculated.

  • Why does the line I add through stat_smooth() depend on the factor levels in the data instead of being fitted to all points?

  • "Trendline" ? ( ?)

  • How can I calculate R² for the fitted curve? summary(lm()) reports R² for a linear model, but how do I get R² for this regression line?

1) , , , , , NLS, , loess , .

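NLS needs starting values. Take a as the Reflectance at Number of Sprays = 0 and b as the change in Reflectance per unit of Number of Sprays, as in a linear fit. The intercept and slope of that fit give the starting values for a and b: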

fit <- lm(Reflectance ~ Number.of.Sprays, data = dat)

Then, in the ggplot call, pass these coefficients to stat_smooth() as starting values:

stat_smooth(method = "nls", formula = y ~ a * x^b, method.args = list(start = c(a = coef(fit)[[1]], b = coef(fit)[[2]])), se = FALSE)

NLS , ​​ .

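4) R² can be computed from the fitted model as the squared correlation between the observed and predicted Reflectance: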

r2 <- cor(dat$Reflectance, predict(fit))^2
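
The one-liner above uses the linear fit. A minimal sketch of the same idea for a curve fitted with nls() follows. The helper name r_squared and the exponential model are assumptions for illustration, not part of the thread: the exponential form is swapped in because a power law with a negative exponent is not finite at Number.of.Sprays = 0, which occurs in this data. For a linear model with an intercept, 1 - RSS/TSS equals the squared correlation; for a nonlinear model it is only a rough goodness-of-fit measure.

```r
# Data copied from the question (only the two columns needed here).
dat <- data.frame(
  Number.of.Sprays = c(0, 1, 2, 3, 5, 6, 7, 9, 10, 11, 14, 17, 19, 21,
                       27, 30, 36, 49),
  Reflectance = c(25.296494, 21.954657, 18.981184, 15.984661, 14.381341,
                  12.485372, 10.592539, 8.51772, 7.601568, 7.075429,
                  6.205453, 5.36646, 4.853167, 4.21576, 3.979639,
                  3.504217, 3.313851, 2.288752)
)

# Hypothetical helper: 1 - (residual sum of squares / total sum of squares).
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

# Linear fit, as in the answer.
fit_lin <- lm(Reflectance ~ Number.of.Sprays, data = dat)
r2_lin <- r_squared(dat$Reflectance, predict(fit_lin))

# Assumed model (not from the thread): exponential decay y = a * exp(b * x).
# Starting values come from a linear fit on the log scale.
start_fit <- lm(log(Reflectance) ~ Number.of.Sprays, data = dat)
fit_exp <- nls(Reflectance ~ a * exp(b * Number.of.Sprays), data = dat,
               start = list(a = exp(coef(start_fit)[[1]]),
                            b = coef(start_fit)[[2]]))
r2_exp <- r_squared(dat$Reflectance, predict(fit_exp))

print(c(linear = r2_lin, exponential = r2_exp))
```

Because r_squared() only needs observed and predicted values, it can be reused with predict() from any other fitted model.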

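2, 3) The colour = Legend mapping in the top-level aes() splits the data into one group per level of Legend, so the curve is fitted separately for each level instead of to all points.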
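
On the grouping question, the loess "Trendline" in the question already covers all points because it sets group = 1, while the stat_smooth() layer inherits colour = Legend and is therefore fitted once per level. The sketch below illustrates the difference. As an assumption for illustration, it uses method = "lm" as a stand-in smoother so that the example does not depend on nls() converging; the grouping behaviour is the same for any method.

```r
library(ggplot2)

# Data from the question, rebuilt compactly (Band is omitted; it is constant).
legend_levels <- c("1 x spray between each measurement",
                   "2 x spray between each measurement",
                   "3 x spray between each measurement",
                   "Dry soil", "Wet soil")
dat <- data.frame(
  Number.of.Sprays = c(0, 1, 2, 3, 5, 6, 7, 9, 10, 11, 14, 17, 19, 21,
                       27, 30, 36, 49),
  Reflectance = c(25.296494, 21.954657, 18.981184, 15.984661, 14.381341,
                  12.485372, 10.592539, 8.51772, 7.601568, 7.075429,
                  6.205453, 5.36646, 4.853167, 4.21576, 3.979639,
                  3.504217, 3.313851, 2.288752),
  Legend = factor(legend_levels[c(4, rep(1, 10), rep(2, 3), rep(3, 3), 5)],
                  levels = legend_levels)
)

base <- ggplot(dat, aes(Number.of.Sprays, Reflectance, colour = Legend)) +
  geom_point(size = 3)

# Inherits colour = Legend, so the smoother is fitted per level of Legend.
p_grouped <- base +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# group = 1 puts all points into one group; the constant colour replaces
# the inherited colour mapping for this layer only.
p_single <- base +
  stat_smooth(aes(group = 1), colour = "black",
              method = "lm", formula = y ~ x, se = FALSE)

# Number of groups each smoothing layer (layer 2) was fitted to.
n_grouped <- length(unique(layer_data(p_grouped, 2)$group))
n_single  <- length(unique(layer_data(p_single, 2)$group))
print(c(grouped = n_grouped, single = n_single))
```

Adding aes(group = 1) to the nls layer in the question's code should have the same effect of fitting one curve through all points.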
