Badoo.com user search - how can this be done?

Badoo.com has 56,000,000 user profiles. Profiles can be searched by gender, age, hair color, zodiac sign, education, etc., as well as by distance from my hometown, online status and registration date. So far this seems feasible: even though it is a rather large query against huge tables (56M members ...), it can generally be cached.

The interesting part is that they also have an individual "exclusion list" (for each profile you look at, you can say that you do not want to meet this person). In addition, your friends do not appear in the results either.

The second interesting part is the OR parts of the query. You can search for someone who is a woman, 25-35 years old, blonde OR brunette, non-smoker, hetero OR bisexual, Virgo OR Gemini OR Cancer, living within a radius of 50 km from Paris, who is not your friend, not on your exclusion list, and online now. Many ORs, heavy queries, sorting options, no obvious way to cache or precalculate all of this, and yet the search returns 11,298 results in milliseconds.

How do they do this with 56 million records and 250K people using it at the same time? Full-text indexes? Relational databases? Key-value stores? Does anyone have an idea of the concept or architecture?

+6
performance search search-engine
2 answers

Most likely it is built on an inverted-index technology such as Lucene or Sphinx. If you want to build such a solution yourself, my recommendation would be Apache Solr (a search server built on Lucene). It is very popular, has an active OSS community, and is used by sites such as Netflix, CNET, etc.
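
To make the inverted-index idea concrete, here is a minimal sketch in Python. It is not how Badoo, Lucene or Sphinx actually implement search; the `ProfileIndex` class, the field names and the sample profiles are all made up for illustration. The point is only that each (field, value) pair maps to a set of profile IDs, so ORs become set unions, ANDs become intersections, and the exclusion list (or friends list) is a final set difference.

```python
from collections import defaultdict


class ProfileIndex:
    """Toy inverted index: maps (field, value) to the set of matching profile IDs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, profile_id, **attrs):
        # Register the profile under every (field, value) pair it has.
        for field, value in attrs.items():
            self.postings[(field, value)].add(profile_id)

    def search(self, criteria, excluded=frozenset()):
        """criteria maps each field to a list of accepted values.

        Values within one field are ORed (union), fields are ANDed
        (intersection), and excluded IDs are subtracted at the end.
        """
        result = None
        for field, values in criteria.items():
            matches = set()
            for value in values:
                matches |= self.postings.get((field, value), set())
            result = matches if result is None else result & matches
        if result is None:
            result = set()
        return sorted(result - set(excluded))


# Hypothetical sample data, not real profiles.
index = ProfileIndex()
index.add(1, gender="f", hair="blonde", zodiac="virgo", online=True)
index.add(2, gender="f", hair="brunette", zodiac="cancer", online=True)
index.add(3, gender="f", hair="red", zodiac="gemini", online=True)
index.add(4, gender="m", hair="blonde", zodiac="virgo", online=True)
index.add(5, gender="f", hair="blonde", zodiac="gemini", online=False)

query = {
    "gender": ["f"],
    "hair": ["blonde", "brunette"],
    "zodiac": ["virgo", "gemini", "cancer"],
    "online": [True],
}
hits = index.search(query, excluded={2})  # profile 2 is on the exclusion list
print(hits)  # [1]
```

Range filters such as age and the distance from Paris are left out of this sketch; a real engine would handle them with dedicated numeric and geospatial structures rather than one posting set per value.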

+3

I would recommend taking a look at the Badoo Dev Blog. It is in Russian, but Google Translate helps a lot.

In short, they use oververted MySQL and memcached. Here is a short list covering Badoo's evolution.

+1
