I am looking for a library or an algorithm (in which case I would write the code myself) to identify the k nearest neighbors of a web page, where a web page is defined as a set of keywords. I have already done the part where I extract the keywords.
It does not need to be very good, just good enough.
Can anyone suggest a solution or where to start? I have studied Yuri Lifshits's lectures in the past, but I am hoping for something ready-made, if possible.
Java libraries are preferred.
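
To make the setup concrete, here is a minimal brute-force sketch of what I mean. It assumes Jaccard similarity between keyword sets as the measure (that is my assumption, any set similarity would do), and the names `KeywordKnn`, `jaccard`, and `nearest` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KeywordKnn {

    // Jaccard similarity: |A ∩ B| / |A ∪ B|; treated as 0 when both sets are empty.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        // Iterate over the smaller set to count shared keywords.
        Set<String> small = a.size() <= b.size() ? a : b;
        Set<String> large = (small == a) ? b : a;
        int common = 0;
        for (String word : small) {
            if (large.contains(word)) {
                common++;
            }
        }
        return (double) common / (a.size() + b.size() - common);
    }

    // Returns the ids of the k pages most similar to the query, best first.
    // Ties are broken by page id so the result is deterministic.
    static List<String> nearest(Set<String> query, Map<String, Set<String>> pages, int k) {
        List<String> ids = new ArrayList<>(pages.keySet());
        ids.sort(Comparator
                .comparingDouble((String id) -> -jaccard(query, pages.get(id)))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

This linear scan over all pages is the baseline I can write myself. What I am hoping for is something that does better than comparing the query against every page.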
Ankur