I have a table in a PostgreSQL database with columns c1, c2 ... cn. I want to run a query that compares each row with a tuple of values v1, v2 ... vn. The query should not require an exact match; instead, it should return the rows ranked in descending order of similarity to the vector of v values.
Example:
The table contains sports records:
1,USA,basketball,1956
2,Sweden,basketball,1998
3,Sweden,skating,1998
4,Switzerland,golf,2001
Now, when I run a query on this table using v = (Sweden, basketball, 1998), I want to get all records that have similarities with this vector, sorted by the number of matching columns in descending order:
2,Sweden,basketball,1998
3,Sweden,skating,1998
1,USA,basketball,1956
Row 4 is not returned because it does not match in any column.
Edit: all columns are equally important. On reflection, though, it would be nice to be able to give each column a different weight.
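To make the desired ranking concrete, here is a minimal sketch of one possible way to score rows, not a tuned solution. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The table name `records`, the column names `country`, `sport` and `year`, and the helper `rank_by_matches` are all hypothetical (the question only names the columns c1 ... cn). The score is built from standard `CASE` expressions, which should carry over to PostgreSQL, although the placeholder syntax depends on the driver.

```python
import sqlite3

# Hypothetical schema: the question only calls the columns c1 ... cn.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, country TEXT, sport TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        (1, "USA", "basketball", 1956),
        (2, "Sweden", "basketball", 1998),
        (3, "Sweden", "skating", 1998),
        (4, "Switzerland", "golf", 2001),
    ],
)


def rank_by_matches(conn, v, weights=(1, 1, 1)):
    """Return (id, score) pairs for rows with a positive score, best first.

    Each column contributes its weight when it equals the corresponding
    value in v; with the default weights the score is the match count.
    """
    sql = """
        SELECT id, score FROM (
            SELECT id,
                   CASE WHEN country = :country THEN :w_country ELSE 0 END
                 + CASE WHEN sport   = :sport   THEN :w_sport   ELSE 0 END
                 + CASE WHEN year    = :year    THEN :w_year    ELSE 0 END AS score
            FROM records
        ) AS scored
        WHERE score > 0          -- drop rows that match nothing
        ORDER BY score DESC, id  -- id breaks ties deterministically
    """
    params = dict(zip(("country", "sport", "year"), v))
    params.update(zip(("w_country", "w_sport", "w_year"), weights))
    return conn.execute(sql, params).fetchall()


print(rank_by_matches(conn, ("Sweden", "basketball", 1998)))
# prints [(2, 3), (3, 2), (1, 1)]
```

Passing, for example, `weights=(1, 3, 1)` makes a match on the sport count three times as much as the other columns. Written this way, the query computes the score for every row, so it probably amounts to a full table scan; whether that is fast enough on a million rows is not measured here.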
Is there an SQL query that returns these rows in a reasonable amount of time, even when run against a million rows?
What would such a query look like?