LibSVM accuracy decreases

After building my TestLabel and TrainLabel vectors, I trained an SVM with libsvm and got an accuracy of 97.4359% (c = 1 and g = 0.00375):

model = svmtrain(TrainLabel, TrainVec, '-c 1 -g 0.00375');
[predict_label, accuracy, dec_values] = svmpredict(TestLabel, TestVec, model);

Then I search for the best c and g:

bestcv = 0;
for log2c = -1:3,
  for log2g = -4:1,
    % '-v 5' makes svmtrain return the 5-fold cross-validation accuracy
    % on the training data instead of a model
    cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = svmtrain(TrainLabel, TrainVec, cmd);
    if (cv >= bestcv),
      bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
    fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
  end
end

This gives c = 8 and g = 0.125.

Then I train the model again with these values:

model = svmtrain(TrainLabel, TrainVec, '-c 8 -g 0.125');
[predict_label, accuracy, dec_values] = svmpredict(TestLabel, TestVec, model);

Now I get an accuracy of only 82.0513%.

Why did the accuracy decrease? Shouldn't it increase after tuning? Or am I doing something wrong?

+5
2 answers

The accuracy you obtained during parameter tuning is biased upwards because you were predicting on the same data you trained on. This is often fine for parameter tuning.

However, if you want that accuracy to be a good estimate of the error on your final test set, you have to add an additional wrap of cross validation or some other resampling scheme.

Here is a very clear paper that outlines the general issue (in the related context of feature selection): http://www.pnas.org/content/99/10/6562.abstract

In MATLAB, I usually set up the extra cross-validation folds like this:

n     = 95;   % total number of observations
nfold = 10;   % desired number of folds

% Set up CV folds: give each observation a random fold label in 1..nfold
inds = repmat(1:nfold, 1, ceil(n/nfold));   % enough labels to cover all n observations
inds = inds(randperm(n));                   % keep n of them, in random order

% Loop over folds
for i = 1:nfold
  datapart = data(inds ~= i, :);   % training part: everything outside fold i

  % do some stuff (e.g. tune and train on datapart,
  % then predict on the held-out part data(inds == i, :))

  % save results
end

% combine results
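To make the "do some stuff" step concrete for libsvm, here is a minimal sketch of a nested cross-validation loop. It reuses the question's TrainLabel and TrainVec variables; the fold count and the parameter grid are only examples, not the original answer's code.

% Sketch of nested cross-validation with libsvm (illustrative only).
% The outer folds estimate generalization error; the inner '-v 5'
% cross-validation tunes c and g using only the outer training part.
n     = size(TrainVec, 1);
nfold = 10;
inds  = repmat(1:nfold, 1, ceil(n/nfold));
inds  = inds(randperm(n));

outerAcc = zeros(nfold, 1);
for i = 1:nfold
  trIdx = (inds ~= i);   % outer training part
  teIdx = (inds == i);   % outer held-out part

  % Inner grid search on the outer training part only
  bestcv = 0; bestc = 1; bestg = 1;
  for log2c = -1:3
    for log2g = -4:1
      cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
      cv  = svmtrain(TrainLabel(trIdx), TrainVec(trIdx, :), cmd);
      if cv >= bestcv
        bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
      end
    end
  end

  % Retrain with the chosen parameters and test on the held-out fold
  model = svmtrain(TrainLabel(trIdx), TrainVec(trIdx, :), ...
                   ['-c ', num2str(bestc), ' -g ', num2str(bestg)]);
  [~, acc, ~] = svmpredict(TrainLabel(teIdx), TrainVec(teIdx, :), model);
  outerAcc(i) = acc(1);   % svmpredict returns [accuracy; MSE; r^2]
end

mean(outerAcc)   % estimate of how well the whole tuning procedure generalizes

The mean outer accuracy is the kind of number the 82% test result should be compared with, rather than the optimistic inner cross-validation rate.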
+4

Cross-validation accuracy and test-set accuracy are not the same thing. To tune c and g, split the training set itself into a training part and a validation part, and keep the test set completely out of the tuning; use it only once, for the final evaluation. Pseudo-code:

for param = set of parameters to test
  [trainTrain, trainVal] = randomly split(trainSet);  % you can repeat this several times and take the mean accuracy
  model = svmtrain(trainTrain, param);
  acc   = svmpredict(trainVal, model);
  if acc is the best so far
     bestParam = param;
  end
end
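A runnable version of this pseudo-code, assuming the question's TrainLabel/TrainVec/TestLabel/TestVec variables and libsvm's MATLAB interface; the 80/20 split and the parameter grid are arbitrary choices for illustration.

% Sketch: single hold-out validation split for parameter tuning (illustrative only).
n       = size(TrainVec, 1);
perm    = randperm(n);
nTrain  = round(0.8 * n);               % 80/20 split, chosen arbitrarily
isTrain = false(n, 1);
isTrain(perm(1:nTrain)) = true;         % trainTrain part; the rest is trainVal

bestAcc = -inf; bestParam = '';
for log2c = -1:3
  for log2g = -4:1
    param = ['-c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    model = svmtrain(TrainLabel(isTrain), TrainVec(isTrain, :), param);
    [~, acc, ~] = svmpredict(TrainLabel(~isTrain), TrainVec(~isTrain, :), model);
    if acc(1) > bestAcc
      bestAcc = acc(1); bestParam = param;
    end
  end
end

% Retrain on the full training set with the chosen parameters,
% then evaluate once on the untouched test set.
finalModel = svmtrain(TrainLabel, TrainVec, bestParam);
[~, testAcc, ~] = svmpredict(TestLabel, TestVec, finalModel);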
+1
