Badoo.com has 56,000,000 user profiles. Profiles can be searched by gender, age, hair color, zodiac sign, education, etc., as well as by distance from my hometown, online status, and registration date. Up to this point it seems feasible: even though these are fairly large queries against huge tables (56m members ...), the results can generally be cached.
The interesting part is that they also have an individual "exclusion list" (for each profile you look at, you can say that you do not want to meet this person). In addition, your friends do not appear in the results either.
The second interesting part is the OR clauses in the query. You can search for someone who is a woman, 25-35 years old, blonde OR brunette, non-smoker, hetero OR bisexual, Virgo OR Gemini OR Cancer, living within a 50 km radius of Paris, who is not your friend, not on your exclusion list, and online right now. Many ORs, heavy queries, sorting options, and no obvious way to cache or precalculate all of this, yet the search returns 11,298 results in milliseconds.
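To make the shape of that query concrete, here is a minimal sketch of the filter as plain Python over an in-memory list. It only illustrates the boolean structure (OR within an attribute, AND across attributes, minus friends and the exclusion list); it is not a claim about how Badoo implements it. The `Profile` fields, the `search` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # Hypothetical fields, chosen only to mirror the example query.
    user_id: int
    gender: str
    age: int
    hair: str
    smoker: bool
    orientation: str
    zodiac: str
    km_from_paris: float
    online: bool


def search(profiles, friends, excluded):
    """Return ids matching the example query, skipping friends and excluded users."""
    hidden = friends | excluded  # per-user sets, so they cannot be shared across users
    return [
        p.user_id
        for p in profiles
        if p.gender == "female"
        and 25 <= p.age <= 35
        and p.hair in {"blonde", "brunette"}             # OR within one attribute
        and not p.smoker
        and p.orientation in {"hetero", "bisexual"}      # OR within one attribute
        and p.zodiac in {"virgo", "gemini", "cancer"}    # OR within one attribute
        and p.km_from_paris <= 50
        and p.online
        and p.user_id not in hidden
    ]


profiles = [
    Profile(1, "female", 28, "blonde", False, "hetero", "virgo", 12.0, True),
    Profile(2, "female", 30, "brunette", False, "bisexual", "cancer", 40.0, True),  # a friend
    Profile(3, "female", 41, "blonde", False, "hetero", "gemini", 5.0, True),       # too old
    Profile(4, "female", 26, "brunette", False, "hetero", "gemini", 49.5, True),
    Profile(5, "female", 27, "blonde", False, "hetero", "leo", 10.0, True),         # wrong sign
    Profile(6, "female", 29, "blonde", False, "hetero", "cancer", 20.0, True),      # excluded
]

result = search(profiles, friends={2}, excluded={6})
print(result)  # [1, 4]
```

A linear scan like this obviously does not scale to 56 million rows; the point is that the friend and exclusion sets differ per searcher, which is what makes a shared cache of results hard.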
How do they do this with 56 million records and 250K people using the site at the same time? Full-text indices? Relational databases? Major Securities? Does anyone have an idea of the concept or architecture behind it?
performance search search-engine
Jens