The most basic approach here is to use the so-called “class weighting scheme”: in the classic SVM formulation there is a parameter C that controls the penalty for misclassified training points. It can be split into two parameters, C1 and C2, used for classes 1 and 2 respectively. The most common choice of C1 and C2 for a given C is to set

C1 = C / n1
C2 = C / n2

where n1 and n2 are the sizes of classes 1 and 2, respectively. This way you “punish” the SVM much more heavily for misclassifying the less frequent class than for misclassifying the common one.
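A quick numeric illustration of this rule (the class sizes and C below are made-up values):

```python
# Hypothetical class sizes: class 1 is rare, class 2 is common.
n1, n2 = 100, 10000
C = 1.0

# Per-class penalties following C1 = C / n1, C2 = C / n2:
C1 = C / n1  # 0.01
C2 = C / n2  # 0.0001

# An error on the rare class now costs n2 / n1 = 100x more
# than an error on the common class.
print(C1 / C2)  # 100.0
```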
Many existing libraries (such as libSVM) support this mechanism through class-weight parameters; in libSVM itself the per-class weights are set with the -wi training options.
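For instance, a minimal sketch using libSVM's bundled Python wrapper (this assumes svmutil is importable from your libSVM installation; the toy data and the weight value 4 are made up for illustration):

```python
from svmutil import svm_train, svm_predict

# Toy imbalanced data: label 1 is rare, label -1 is common.
y = [1, 1, -1, -1, -1, -1, -1, -1]
x = [[0.0, 0.1], [0.2, 0.0],                 # rare class
     [1.0, 1.1], [1.2, 1.0], [1.1, 1.3],
     [0.9, 1.2], [1.3, 0.9], [1.0, 0.8]]     # common class

# '-w1 4' sets C for the class labeled 1 to 4 * C, so errors on the
# rare class are penalized 4x more heavily.
model = svm_train(y, x, '-t 0 -c 1 -w1 4')

# Sanity check on the training data itself.
p_labels, p_acc, p_vals = svm_predict(y, x, model)
```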
Example using Python and sklearn:

```python
import numpy as np
import pylab as pl   # matplotlib's pylab interface, used for plotting
from sklearn import svm
```
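The snippet above is only the imports. A sketch of how such an example can continue, fitting both an unweighted and a weighted classifier on synthetic imbalanced data (all sizes and weights here are illustrative):

```python
rng = np.random.RandomState(0)
n1, n2 = 1000, 100  # class 0 is 10x more frequent than class 1
X = np.r_[1.5 * rng.randn(n1, 2),
          0.5 * rng.randn(n2, 2) + [2, 2]]
y = np.array([0] * n1 + [1] * n2)

# Plain SVM: a single C shared by both classes.
clf = svm.SVC(kernel='linear', C=1.0).fit(X, y)

# Weighted SVM: class_weight={1: 10} sets C for class 1 to 10 * C,
# so errors on the rare class are penalized 10x more heavily.
wclf = svm.SVC(kernel='linear', C=1.0, class_weight={1: 10}).fit(X, y)
```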
In particular, in sklearn you can just turn on automatic weighting by setting class_weight='auto'.
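Note that in later sklearn releases (0.17 and up) this option was renamed, so the same one-liner reads:

```python
# 'auto' in older sklearn; spelled 'balanced' in sklearn >= 0.17.
# Both weight each class inversely proportionally to its frequency,
# matching the C_i = C / n_i scheme above up to a constant factor.
wclf_auto = svm.SVC(kernel='linear', class_weight='balanced')
```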

lejlot