This is an interesting question.
I would say that overfitting does not make much sense for Word2Vec, because the goal of the embedding is to match the empirical co-occurrence distribution of words as closely as possible. Word2Vec is not meant to learn anything beyond the training vocabulary, i.e. to generalize; it is meant to approximate a single, fixed distribution defined by the text corpus. In that sense Word2Vec is actually trying to fit exactly, so it cannot overfit.
If you had a small vocabulary, you could compute the co-occurrence matrix and find the exact global minimum for the embeddings (of a given dimension), i.e. obtain the perfect fit and thereby the best contextual word model for that fixed language.
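To make this concrete, here is a minimal sketch (not part of the original answer; the toy corpus, window size, and embedding dimension `k` are illustrative). It builds a co-occurrence matrix explicitly and factorizes it with a truncated SVD, which by the Eckart-Young theorem is the exact global minimizer of the rank-k least-squares reconstruction, i.e. the "perfect fit" described above:

```python
import numpy as np
from collections import Counter

# Toy corpus; any small fixed "language" works here.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a +/-2 token window.
window = 2
counts = Counter()
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[(idx[w], idx[sent[j]])] += 1

C = np.zeros((len(vocab), len(vocab)))
for (i, j), c in counts.items():
    C[i, j] = c

# Rank-k truncated SVD: the globally optimal rank-k approximation of C
# in Frobenius norm, hence the best k-dimensional embedding of this
# fixed co-occurrence structure.
k = 2
U, S, Vt = np.linalg.svd(C)
embeddings = U[:, :k] * S[:k]  # one k-dimensional vector per word

for w in vocab:
    print(w, embeddings[idx[w]].round(2))
```

Word2Vec with negative sampling optimizes a different objective (a shifted PMI factorization rather than raw counts), but the point stands: for a small, fixed vocabulary the optimum is fully determined by the corpus, leaving nothing to overfit.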