Suppose you were able to track news about different people, for example, “Steve Jobs” and “Steve Ballmer”.
How could you determine whether the number of mentions of an entity in a given time period was unusual compared to its normal frequency of occurrence?
I imagine that for a more popular person like Steve Jobs, a 50% increase might be unusual (from 1000 to 1500 mentions), while for a relatively unknown CEO, a hundredfold jump on a given day is plausible (from 2 to 200 mentions). Without some way of scaling, your unusualness index could be dominated by unknowns getting their 15 minutes of fame.
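To make the scaling problem concrete, here is a minimal sketch of the naive approach I want to avoid: ranking entities by raw percentage increase. The helper `percent_increase` and the `mentions` dictionary are hypothetical, and the counts are just the example figures from this question, not real data.

```python
def percent_increase(baseline: float, observed: float) -> float:
    """Relative increase of observed over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) / baseline * 100.0


# Hypothetical daily mention counts: (usual count, today's count).
mentions = {
    "Steve Jobs": (1000, 1500),
    "Unknown CEO": (2, 200),
}

# Naive "unusualness" score: the raw percentage increase per entity.
scores = {
    name: percent_increase(usual, today)
    for name, (usual, today) in mentions.items()
}

# Entities ordered from most to least "unusual" under the naive score.
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.0f}% increase")
```

Under this score the little-known CEO always ranks far above Steve Jobs, simply because the baseline is tiny. That is the effect a properly scaled index would need to correct.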
Update: To clarify, assume that you can already receive a continuous stream of news, identify the entities in each news item, and store all of this in a relational data warehouse.