I am testing a collaborative filtering algorithm implemented in Spark, and I am running into the following problem:
Suppose I train a model with the following data:
u1|p1|3
u1|p2|3
u2|p1|2
u2|p2|3
Now, if I test it with the following data:
u1|p1|1
u3|p1|2
u3|p2|3
I never see predicted ratings for user "u3", presumably because this user does not appear in the training data. Is this a cold start problem? I was under the impression that cold start applies only to new products. In this case, I expected a prediction for "u3", since "u1" and "u2" in the training data have rating information similar to that of "u3". Is this the difference between model-based and memory-based collaborative filtering?
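To make the setup concrete, here is a minimal sketch in plain Python (not the Spark job itself) that parses the two data sets above and separates the test rows by whether their user occurs in the training data. The helper names `parse_ratings` and `split_by_known_users` are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: plain Python, not the Spark job itself.
# parse_ratings and split_by_known_users are hypothetical helper names.

def parse_ratings(text):
    """Parse whitespace-separated 'user|product|rating' records."""
    rows = []
    for record in text.split():
        user, product, rating = record.split("|")
        rows.append((user, product, float(rating)))
    return rows


def split_by_known_users(train_rows, test_rows):
    """Separate test rows by whether the user occurs in the training rows."""
    known_users = {user for user, _, _ in train_rows}
    seen = [row for row in test_rows if row[0] in known_users]
    unseen = [row for row in test_rows if row[0] not in known_users]
    return seen, unseen


train = parse_ratings("u1|p1|3 u1|p2|3 u2|p1|2 u2|p2|3")
test = parse_ratings("u1|p1|1 u3|p1|2 u3|p2|3")

seen, unseen = split_by_known_users(train, test)
print("test rows with a user seen in training:", seen)
print("test rows with a user missing from training:", unseen)
```

Only the two "u3" rows end up in the second list, which matches the rows for which I get no prediction.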