I have a large dataset in R (1M+ rows by 6 columns) that I want to use to train a random forest (using the randomForest package) for regression. Unfortunately, I get the error "Error in matrix(0, n, n) : too many elements specified" when I try to train on everything at once, and "cannot allocate enough memory" errors when working on a subset of the data, up to 10,000 or so observations.
Seeing that I can't add more RAM to my machine, and that random forests are very well suited to the type of process I'm trying to simulate, I would really like to get this working.
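For scale, here is a rough back-of-the-envelope sketch of what an n-by-n numeric matrix like the one in the error message would need. It assumes n is the number of rows in the data and that each cell is an 8-byte double; the variable names are only illustrative, and this is arithmetic, not a reproduction of what randomForest does internally.

```r
# Rough size estimate for matrix(0, n, n), assuming 8-byte doubles.
n <- 1e6                       # approximate row count of the full dataset
elements <- n * n              # number of cells in an n-by-n matrix

# TRUE: the cell count is larger than R's biggest integer value
too_many <- elements > .Machine$integer.max

tib <- elements * 8 / 1024^4   # size in TiB, roughly 7.3

# The same estimate for a 10,000-row subset
n_small <- 1e4
mib_small <- n_small^2 * 8 / 1024^2   # size in MiB, several hundred

print(too_many)
print(tib)
print(mib_small)
```

If that assumption holds, even the 10,000-row case would need several hundred MiB for a single such matrix, which may be consistent with the memory errors on the subset.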
Any suggestions or workarounds are greatly appreciated.
ktdrv source