Binary numbers instead of one-hot vectors

When performing logistic regression, it is common practice to use a one-hot vector as the desired result, so that number of classes = number of nodes in the output layer. We do not use the index of the word in the dictionary (or the class number in general) as a scalar target, because that may falsely indicate proximity between two classes. But why can't we use binary numbers instead of one-hot vectors?

That is, if there are 4 classes, we can represent each class as 00, 01, 10, 11, which leads to only log2(number of classes) nodes in the output layer.
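For concreteness, here is a minimal NumPy sketch (my own illustration, not from the question) of the two target encodings being contrasted:

    import numpy as np

    n_classes = 4
    labels = np.array([0, 1, 2, 3])

    # One-hot targets: one output node per class -> 4 nodes.
    one_hot = np.eye(n_classes)[labels]

    # Binary targets: ceil(log2(n_classes)) output nodes -> 2 nodes.
    n_bits = int(np.ceil(np.log2(n_classes)))
    binary = (labels[:, None] >> np.arange(n_bits)[::-1]) & 1
    # rows: [0,0], [0,1], [1,0], [1,1] -- i.e. 00, 01, 10, 11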

+7
machine-learning computer-vision nlp neural-network
2 answers

Encoding with binary codes is OK, but you will probably need to add another layer (or a filter), depending on your task and model, because the binary representation now implies spurious shared features between unrelated classes.

For example, take a binary encoding of the input (x = [x1, x2]):

    'apple'  = [0, 0]
    'orange' = [0, 1]
    'table'  = [1, 0]
    'chair'  = [1, 1]

This means that 'orange' and 'chair' share the same feature x2. Now consider predictions for the two classes y:

    'fruit'     = 0
    'furniture' = 1

and a linear model (weights W = [w1, w2] and bias b) optimized over the labeled data samples:

    (argmin W)  Loss = y - (w1 * x1 + w2 * x2 + b)

Whenever you update the weight w2 so that 'chair' is classified as furniture, the update also pushes 'orange' toward that class, because the two share x2. In this particular case, adding another layer with weights U = [u1, u2] can probably solve it:

    (argmin U,W)  Loss = y - (u1 * f(w11 * x1 + w12 * x2 + b1)
                            + u2 * f(w21 * x1 + w22 * x2 + b1) + b2)

(Each hidden unit needs its own first-layer weights and a nonlinearity f; otherwise the two terms collapse back into a single linear model.)
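To see the interference numerically, here is a hedged NumPy sketch of the single-layer model above (the squared-error loss, learning rate, and variable names are my assumptions, not the answer's): a single gradient step on 'chair' also moves the score of 'orange', because the two share x2.

    import numpy as np

    # Binary input encodings from above.
    x = {'apple':  np.array([0., 0.]), 'orange': np.array([0., 1.]),
         'table':  np.array([1., 0.]), 'chair':  np.array([1., 1.])}
    y = {'apple': 0., 'orange': 0., 'table': 1., 'chair': 1.}  # fruit=0, furniture=1

    w, b, lr = np.zeros(2), 0.0, 0.1

    # One gradient step on the 'chair' example under squared error.
    err = (w @ x['chair'] + b) - y['chair']
    w -= lr * err * x['chair']   # updates both w1 and w2
    b -= lr * err

    print(w @ x['orange'] + b)   # 0.2: pushed toward 'furniture', though orange is fruit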

So, why not avoid these entangled representations altogether by using one-hot encoding? :)
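By contrast, with one-hot inputs every word owns its own weight, so the same update cannot leak across classes. A minimal sketch under the same assumptions (bias dropped to isolate the point):

    import numpy as np

    # One-hot input encodings: one feature per word.
    x = {'apple':  np.array([1., 0., 0., 0.]),
         'orange': np.array([0., 1., 0., 0.]),
         'table':  np.array([0., 0., 1., 0.]),
         'chair':  np.array([0., 0., 0., 1.])}
    y = {'apple': 0., 'orange': 0., 'table': 1., 'chair': 1.}

    w, lr = np.zeros(4), 0.1

    # Same gradient step on 'chair' as before.
    err = w @ x['chair'] - y['chair']
    w -= lr * err * x['chair']   # only chair's own weight changes

    print(w @ x['orange'])       # still 0.0 -- orange is unaffected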

+9

https://github.com/scikit-learn-contrib/categorical-encoding

Binary encoding (and, in fact, base-anything encoding) is supported in category_encoders. With it you get one feature per position in the binary string, so a category is not a single feature with the value "011" or "010", but 3 features with the values [0, 1, 1] and [0, 1, 0], respectively.
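A short usage sketch (the exact output column names depend on the library version):

    import pandas as pd
    import category_encoders as ce

    df = pd.DataFrame({'word': ['apple', 'orange', 'table', 'chair']})

    # BinaryEncoder emits one 0/1 column per bit of the binary code.
    encoded = ce.BinaryEncoder(cols=['word']).fit_transform(df)
    print(encoded)

    # Base-anything is BaseNEncoder, e.g. ce.BaseNEncoder(cols=['word'], base=3)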

0
