In caret: creating several partitions of different sizes for training / testing / validation

I am trying to take a data set and break it into 3 parts: training: 60%, testing: 20% and validation: 20%.

part1 <- createDataPartition(fullDataSet$classe, p=0.8, list=FALSE)
validation <- fullDataSet[-part1,]
workingSet <- fullDataSet[part1,]

When I do the same thing to split the working set again:

inTrain <- createDataPartition(workingSet$classe, p=.75, list=FALSE)

I get an error message:

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

Is there a way a) to create 3 partitions of different sizes, or b) to make a nested partition, as I was trying to do? I considered c) using sample() instead, but this is for a class in which the instructor uses only createDataPartition, and we need to show our code. Does anyone have any advice?

2 answers

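Take 60% of the data for training first, then split the remaining 40% in half to get 20% for testing and 20% for validation: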

library(caret)

set.seed(1234)
# 60% of the rows go to the training set
inTraining <- createDataPartition(mydata$FLAG, p=0.6, list=FALSE)
training.set <- mydata[inTraining,]
Totalvalidation.set <- mydata[-inTraining,]
# Partition the remaining 40% in half: 20% testing and 20% validation
inValidation <- createDataPartition(Totalvalidation.set$FLAG, p=0.5, list=FALSE)
testing.set <- Totalvalidation.set[inValidation,]
validation.set <- Totalvalidation.set[-inValidation,]
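
To check that a nested split like this really yields roughly 60/20/20 and keeps the class balance, something along the following lines may help. This is only a sketch: it assumes the caret package is installed, and it uses the built-in iris data with Species as a stand-in for the real data and outcome column (rest, inTesting, sizes and all.rows are names made up for the illustration).

```r
library(caret)  # assumption: caret is installed

set.seed(1234)
# iris stands in for the real data; Species plays the role of the outcome column
inTraining <- createDataPartition(iris$Species, p = 0.6, list = FALSE)
training.set <- iris[inTraining, ]
rest <- iris[-inTraining, ]

# Split the remaining rows in half
inTesting <- createDataPartition(rest$Species, p = 0.5, list = FALSE)
testing.set <- rest[inTesting, ]
validation.set <- rest[-inTesting, ]

# Share of all rows that ended up in each partition
sizes <- c(training = nrow(training.set),
           testing = nrow(testing.set),
           validation = nrow(validation.set))
print(sizes / nrow(iris))

# Class proportions inside each partition (should be close to the full data)
print(prop.table(table(training.set$Species)))
print(prop.table(table(testing.set$Species)))
print(prop.table(table(validation.set$Species)))

# Every row should appear in exactly one partition
all.rows <- c(rownames(training.set), rownames(testing.set), rownames(validation.set))
stopifnot(length(all.rows) == nrow(iris), !anyDuplicated(all.rows))
```

Because createDataPartition samples within each class, the three sets should have similar class proportions, which a plain sample() call does not guarantee.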

# METHOD 1: EQUAL SPLITS (commented out)
# allind <- sample(1:nrow(m.d), nrow(m.d))
# # split into three parts
# trainind <- allind[1:round(length(allind)/3)]
# valind <- allind[(round(length(allind)/3)+1):round(length(allind)*(2/3))]
# testind <- allind[(round(length(allind)*(2/3))+1):length(allind)]

set.seed(1234)

# METHOD 2: 60-30-10 SPLIT
# Shuffle the row indices, then cut the shuffled vector at 60% and 90%
allind <- sample(1:nrow(m.d), nrow(m.d))
n.train <- round(length(allind)*0.6)
n.val <- round(length(allind)*0.3)
trainind <- allind[1:n.train]
valind <- allind[(n.train+1):(n.train+n.val)]
# The test set is whatever is left (about 10%)
testind <- allind[(n.train+n.val+1):length(allind)]
m.dTRAIN <- m.d[trainind,]
m.dVAL   <- m.d[valind,]
m.dTEST  <- m.d[testind,]
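
The same idea can be wrapped in a small helper so the proportions are not hard-coded. The sketch below is a hypothetical function (split_indices is not part of base R or caret) and uses the built-in iris data as a stand-in for m.d; it shuffles the row indices once and cuts them at the cumulative proportions, letting the last group absorb any rounding.

```r
# Hypothetical helper: split n row indices into named groups by proportion
split_indices <- function(n, props = c(train = 0.6, val = 0.2, test = 0.2)) {
  stopifnot(abs(sum(props) - 1) < 1e-8)
  shuffled <- sample(n)                 # random permutation of 1..n
  ends <- round(cumsum(props) * n)      # last position of each group
  ends[length(ends)] <- n               # last group absorbs rounding
  starts <- c(1, head(ends, -1) + 1)    # first position of each group
  groups <- Map(function(s, e) shuffled[seq_len(e - s + 1) + s - 1], starts, ends)
  names(groups) <- names(props)
  groups
}

set.seed(1234)
idx <- split_indices(nrow(iris))        # iris stands in for m.d
iris.train <- iris[idx$train, ]
iris.val   <- iris[idx$val, ]
iris.test  <- iris[idx$test, ]
print(sapply(idx, length))
```

Unlike createDataPartition, this does not stratify by the outcome, so class proportions can differ between the groups, especially on small data sets.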