Why doesn't word2vec use regularization?

ML models with a huge number of parameters are prone to overfitting (since they have high variance), and in my opinion word2vec is one of these models. One way to reduce a model's variance is to apply regularization, which is very common in related models such as matrix factorization. However, the basic version of word2vec does not include any regularization term. Is there a reason for this?
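To make the contrast concrete, here is a minimal sketch (my own illustration, not taken from any particular library or paper) of the two objectives I have in mind: an L2-regularized matrix-factorization loss versus the skip-gram negative-sampling loss that word2vec minimizes, which has no norm penalty anywhere. The function names, the `lam` hyperparameter, and the toy inputs are all made up for this example.

    import numpy as np

    def mf_loss(X, W, H, lam=0.1):
        # L2-regularized matrix factorization:
        # ||X - W H^T||^2 + lam * (||W||^2 + ||H||^2)
        recon = X - W @ H.T
        return np.sum(recon ** 2) + lam * (np.sum(W ** 2) + np.sum(H ** 2))

    def sgns_loss(center_vecs, context_vecs, pairs, neg_pairs):
        # Skip-gram with negative sampling, as in word2vec:
        # minimize -log sigma(u_c . v_w) over observed (word, context) pairs
        # and -log sigma(-u_c . v_w) over sampled negative pairs.
        # Note: no penalty on the embedding norms appears anywhere.
        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))
        loss = 0.0
        for w, c in pairs:
            loss -= np.log(sigmoid(center_vecs[w] @ context_vecs[c]))
        for w, c in neg_pairs:
            loss -= np.log(sigmoid(-center_vecs[w] @ context_vecs[c]))
        return loss

    # Tiny usage example with random data.
    rng = np.random.default_rng(0)
    W, H = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))
    X = rng.normal(size=(5, 5))
    print(mf_loss(X, W, H))
    print(sgns_loss(W, H, pairs=[(0, 1), (2, 3)], neg_pairs=[(0, 4)]))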

1 answer

This is an interesting question.

I would say that overfitting does not make much sense for Word2Vec, because the goal is to embed words so that the distribution of word co-occurrences is matched as accurately as possible. Word2Vec is not intended to learn anything beyond the training vocabulary, i.e. to generalize, but to approximate the single distribution defined by the text corpus. In that sense, Word2Vec is actually trying to fit exactly, so it cannot overfit.

If you had a small vocabulary, you could compute the co-occurrence matrix and find the exact global minimum for the embeddings (of a given size), i.e. obtain a perfect fit and thereby determine the best context-word model for that fixed corpus.
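As a rough sketch of that idea (my own toy example, with an invented corpus and an arbitrary window size and dimensionality): for a tiny vocabulary you can build the co-occurrence matrix explicitly and take a truncated SVD, which gives the best rank-k least-squares approximation, i.e. the exact global optimum for embeddings of that dimensionality in the least-squares sense.

    import numpy as np

    # Toy corpus and window size chosen purely for illustration.
    corpus = "the cat sat on the mat the dog sat on the rug".split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}
    window = 2

    # Build a symmetric word-context co-occurrence matrix.
    C = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if i != j:
                C[idx[w], idx[corpus[j]]] += 1

    # Rank-k truncated SVD gives the best rank-k least-squares fit to C,
    # i.e. the exact optimum for embeddings of this size.
    k = 2
    U, S, Vt = np.linalg.svd(C)
    embeddings = U[:, :k] * S[:k]  # one k-dimensional vector per word
    print({w: embeddings[idx[w]].round(2) for w in vocab})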
