SVM for binary data with Hamming distance

I have a standard machine learning classification problem with labels in {-1, +1}. The main difference is that the data points are binary strings, so their proximity is measured by the Hamming distance. Can an SVM be applied in this case? Which SVM library is best suited for this task?

+5
3 answers

If a kernel k is positive definite, then for any pair of examples x and z the determinant of the Gram matrix is non-negative:

|k(x, x) k(x, z)|
|               | = k(x,x)k(z,z) - k(x,z)^2 >= 0
|k(z, x) k(z, z)|

For a distance (including the Hamming distance), the following properties hold:

For any x, y, z:

1) non-negativity: d(x, z) >= 0 and d(x, z) = 0 <=> x = z
2) symmetry: d(x, z) = d(z, x)
3) triangle inequality: d(x, z) <= d(x, y) + d(y, z)

If we take k to be the Hamming distance, then according to 1) we would have:

a) k(x,x) = k(z,z) = 0

But for k to be a positive definite kernel, we need:

b) k(x,x)k(z,z) - k(x,z)^2 >= 0

Substituting a) into b), we get:

-k(x,z)^2 >= 0
k(x,z)^2 <= 0

This cannot hold for a real, nonzero k(x, z), so the Hamming distance is not a valid kernel.
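The argument above can be checked on a concrete pair of strings. The following is a minimal sketch in plain Python; the helper names `hamming` and `gram_determinant` and the two example strings are assumptions made for illustration, not part of the original answer.

```python
def hamming(x: str, z: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(z):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, z))


def gram_determinant(k, x, z):
    """Determinant of the 2x2 Gram matrix [[k(x,x), k(x,z)], [k(z,x), k(z,z)]]."""
    return k(x, x) * k(z, z) - k(x, z) * k(z, x)


# Hypothetical example strings, chosen only to illustrate the argument.
x, z = "0110", "1010"

print(hamming(x, z))                    # Hamming distance between x and z
print(gram_determinant(hamming, x, z))  # negative whenever x != z
```

Because k(x, x) = k(z, z) = 0, the determinant reduces to -d(x, z)^2, which is negative for any two distinct strings.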

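Regarding the suggestion that there is a valid kernel because it is an inner product in the following space: K("aab", "baa") = [0,1,0,1,1,0] \dot [1,0,0,1,0,1].

This is a perfectly good way to define a kernel, but it is not the Hamming distance. The Hamming distance between "aab" and "baa" is 2 (the first and third characters differ), whereas: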

[0,1,0,1,1,0] \dot [1,0,0,1,0,1] = 1.
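To make the comparison concrete, here is a small sketch that reproduces the two vectors and both quantities. It assumes each character is one-hot encoded with the alphabet order "ba" (so "a" becomes [0,1] and "b" becomes [1,0]), which is inferred from the vectors shown; the function names are hypothetical.

```python
def one_hot(s: str, alphabet: str = "ba") -> list:
    """Concatenate one one-hot block per character, in alphabet order.

    The alphabet order "ba" is an assumption chosen to reproduce the
    vectors in the text: "a" -> [0, 1], "b" -> [1, 0].
    """
    return [int(ch == sym) for ch in s for sym in alphabet]


def dot(u, v) -> int:
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(u, v))


def hamming(x: str, z: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, z))


u, v = one_hot("aab"), one_hot("baa")
print(u, v)                   # the two encoded vectors
print(dot(u, v))              # number of matching positions
print(hamming("aab", "baa"))  # number of differing positions
```

Under this encoding the inner product counts the positions where the strings agree, so it equals the string length minus the Hamming distance. It is a similarity, not a distance.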

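The fact that the Hamming distance is not positive definite does not mean it cannot be used with an SVM, but you do lose the guarantees that come with solving a convex optimization problem.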

+2

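This is probably best handled with an SVM library that lets you supply a custom kernel function (for example, libSVM, SVMLight, scikits). You would then write a function that computes the Hamming distance between two strings and plug it in as the kernel.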

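The only problem: I am not sure the Hamming distance is actually a kernel, that is, whether it satisfies Mercer's conditions. When a kernel function does not satisfy those conditions, you are not guaranteed to find the global optimum, because the optimization problem is no longer convex.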

+1

As StompChicken says, the open question is whether the Hamming distance is a valid kernel.

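Unless I am missing something, I think there is a valid kernel here, because it can be written as an inner product in the following space: K("aab", "baa") = [0,1,0,1,1,0] \dot [1,0,0,1,0,1].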

With this "encoding" in mind, you can use any SVM library that supports a linear kernel: just encode your strings as in the previous example.
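As an illustration of that workflow, the sketch below encodes binary strings and writes them in the sparse text format read by LIBSVM-style tools (`label index:value ...`, with 1-based indices and zero features omitted). The function name, the alphabet order "01", and the sample data are assumptions for this example only.

```python
def to_libsvm_line(label: int, s: str, alphabet: str = "01") -> str:
    """Encode a string as one-hot features and format it as a sparse line.

    Each character becomes one block of len(alphabet) indicator features.
    Only non-zero features are written, with 1-based feature indices.
    """
    features = [int(ch == sym) for ch in s for sym in alphabet]
    pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v]
    return f"{label:+d} " + " ".join(pairs)


# Hypothetical training data: (label, binary string) pairs.
samples = [(+1, "0110"), (-1, "1010"), (+1, "0111")]

for label, s in samples:
    print(to_libsvm_line(label, s))
```

The resulting lines can be saved to a file and passed to a trainer configured with a linear kernel. For strings over {0, 1}, feeding the raw bits as features would also work; the one-hot form is used here only to mirror the encoding in the example above.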

+1