The standard way to do linear regression is something like this:
l <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data=iris)
then use predict(l, new_data) to create predictions, where new_data is a data frame with columns matching the formula. But lm() returns an object of class lm, which is a list containing many things that are irrelevant in most cases. This includes a copy of the source data and a bunch of named vectors and arrays whose length matches the number of rows in the data:
R> str(l)
List of 12
$ coefficients : Named num [1:3] 3.587 -0.257 0.364
..- attr(*, "names")= chr [1:3] "(Intercept)" "Petal.Length" "Petal.Width"
$ residuals : Named num [1:150] 0.2 -0.3 -0.126 -0.174 0.3 ...
..- attr(*, "names")= chr [1:150] "1" "2" "3" "4" ...
$ effects : Named num [1:150] -37.445 -2.279 -0.914 -0.164 0.313 ...
..- attr(*, "names")= chr [1:150] "(Intercept)" "Petal.Length" "Petal.Width" "" ...
$ rank : int 3
$ fitted.values: Named num [1:150] 3.3 3.3 3.33 3.27 3.3 ...
..- attr(*, "names")= chr [1:150] "1" "2" "3" "4" ...
$ assign : int [1:3] 0 1 2
$ qr :List of 5
..$ qr : num [1:150, 1:3] -12.2474 0.0816 0.0816 0.0816 0.0816 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:150] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:3] "(Intercept)" "Petal.Length" "Petal.Width"
.. ..- attr(*, "assign")= int [1:3] 0 1 2
..$ qraux: num [1:3] 1.08 1.1 1.01
..$ pivot: int [1:3] 1 2 3
..$ tol : num 1e-07
..$ rank : int 3
..- attr(*, "class")= chr "qr"
$ df.residual : int 147
$ xlevels : Named list()
$ call : language lm(formula = Sepal.Width ~ Petal.Length + Petal.Width, data = iris)
$ terms :Classes 'terms', 'formula' length 3 Sepal.Width ~ Petal.Length + Petal.Width
.. ..- attr(*, "variables")= language list(Sepal.Width, Petal.Length, Petal.Width)
.. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:3] "Sepal.Width" "Petal.Length" "Petal.Width"
.. .. .. ..$ : chr [1:2] "Petal.Length" "Petal.Width"
.. ..- attr(*, "term.labels")= chr [1:2] "Petal.Length" "Petal.Width"
.. ..- attr(*, "order")= int [1:2] 1 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(Sepal.Width, Petal.Length, Petal.Width)
.. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:3] "Sepal.Width" "Petal.Length" "Petal.Width"
$ model :'data.frame': 150 obs. of 3 variables:
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..- attr(*, "terms")=Classes 'terms', 'formula' length 3 Sepal.Width ~ Petal.Length + Petal.Width
.. .. ..- attr(*, "variables")= language list(Sepal.Width, Petal.Length, Petal.Width)
.. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:3] "Sepal.Width" "Petal.Length" "Petal.Width"
.. .. .. .. ..$ : chr [1:2] "Petal.Length" "Petal.Width"
.. .. ..- attr(*, "term.labels")= chr [1:2] "Petal.Length" "Petal.Width"
.. .. ..- attr(*, "order")= int [1:2] 1 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(Sepal.Width, Petal.Length, Petal.Width)
.. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:3] "Sepal.Width" "Petal.Length" "Petal.Width"
- attr(*, "class")= chr "lm"
All of this takes up a lot of space, and the lm object is almost an order of magnitude larger than the original data set:
R> object.size(iris)
7088 bytes
R> object.size(l)
52704 bytes
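To see where the space goes, one option is to measure each top-level component separately. This is only a sketch: the per-component sizes need not add up exactly to object.size(l), and the numbers depend on the R version and platform, so none are shown here.

```r
# Refit the model from above so the snippet is self-contained.
l <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris)

# Size in bytes of each top-level component of the lm object.
sizes <- sapply(l, function(component) as.numeric(object.size(component)))

# Largest components first.
sort(sizes, decreasing = TRUE)
```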
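With larger data the problem grows: for a data set of about 170 MB, the lm object is about 450 MB. Even with the optional return flags set to FALSE, the lm object is still roughly 5 times the size of the data set: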
R> ls <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data=iris, model=FALSE, x=FALSE, y=FALSE, qr=FALSE)
R> object.size(ls)
30568 bytes
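For context, a quick check of what those flags actually drop, and of how little a point prediction needs for this particular formula. The names full, small and new_data are hypothetical, chosen here for illustration (and to avoid masking base ls()); the by-hand prediction is a sketch that assumes a purely numeric formula with no factors or transformations.

```r
full  <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris)
small <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris,
            model = FALSE, x = FALSE, y = FALSE, qr = FALSE)

# Components present in the full fit but missing from the reduced one.
removed <- setdiff(names(full), names(small))

# Hypothetical new observations with the predictor columns from the formula.
new_data <- data.frame(Petal.Length = c(1.4, 4.5), Petal.Width = c(0.2, 1.5))

# For this formula a point prediction is the design matrix times the
# coefficient vector, so the coefficients are enough to compute it by hand.
X <- cbind(1, as.matrix(new_data[, c("Petal.Length", "Petal.Width")]))
manual <- as.vector(X %*% coef(small))

# Compare against predict() on the full fit.
all.equal(manual, unname(predict(full, new_data)))
```

This shortcut covers point predictions only. Standard errors, intervals, factor predictors, and transformed terms need more than the coefficient vector.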
R, , ? , , ?
: , , lm, - ...