Keyword-Based Nearest Neighbor Algorithm or Library

I want to find a library or an algorithm (so I can write the code myself) to identify the k nearest neighbors of a web page, where a web page is defined as a set of keywords. I have already done the part where I extract the keywords.

It does not need to be very good, just good enough.

Can anyone suggest a solution or where to start? I studied Yuri Lifshits's lectures in the past, but I am hoping to find something ready-made, if possible.

Java libraries are preferred.

1 answer

, , . , / . - document term-frequency.

, - . , , , . doc-- WRT ; .. % tage.

, . . , , . (, , ).

, K- .
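
As a "good enough" starting point for the question, here is a minimal sketch of a brute-force k-nearest-neighbor search over keyword sets using Jaccard similarity. The choice of Jaccard similarity is an assumption, not something stated in this thread, and the names KeywordKnn, jaccard and nearest are made up for illustration. It assumes Java 9 or later (for Set.of and Map.of) and compares the query against every page, so it only suits collections small enough to scan linearly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Brute-force k-nearest-neighbor search over keyword sets (illustrative sketch). */
final class KeywordKnn {

    /** Jaccard similarity |A ∩ B| / |A ∪ B|; defined here as 0 when both sets are empty. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /** Returns the ids of the k pages most similar to the query, most similar first. */
    static List<String> nearest(Set<String> query, Map<String, Set<String>> pages, int k) {
        // Score every page once against the query.
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Set<String>> page : pages.entrySet()) {
            score.put(page.getKey(), jaccard(query, page.getValue()));
        }
        // Sort by descending similarity; break ties by id so the result is deterministic.
        List<String> ids = new ArrayList<>(pages.keySet());
        ids.sort(Comparator
                .comparingDouble((String id) -> score.get(id))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    public static void main(String[] args) {
        // Hypothetical pages, each represented by its extracted keywords.
        Map<String, Set<String>> pages = Map.of(
                "a", Set.of("java", "knn", "search"),
                "b", Set.of("java", "cooking"),
                "c", Set.of("gardening"));
        System.out.println(nearest(Set.of("java", "knn"), pages, 2));
    }
}
```

If keyword counts matter and not just presence, the same structure would work with term-frequency vectors and cosine similarity in place of jaccard.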
