How to implement a ranking algorithm on my website to sort database data?

I want to implement a ranking system on the website I was working on and decided to go with the Hacker News algorithm. My reasoning for choosing this algorithm is simply because it is described here .

I looked at this Python code (the language that I use to create my site), and could not understand how I implement it.

def calculate_score(votes, item_hour_age, gravity=1.8): return (votes - 1) / pow((item_hour_age+2), gravity) 

Given the tables:

 posts: id | title | time_submitted votes: id | postid | userid | score 

How can I extract data from a database? The ideal solution (the most effective) would be to build a MySQL query to get the first 10 posts ranked using the algorithm. But, given that Hacker News has implemented it in Arc, it seems to me that they pull out all the messages and then run them through the algorithm to rank them.

Reddit also comes to mind for this ... They use a non-relational database schema, so I would suggest that they, like Hacker News, perform ranking in their code, not in the database.

How do you implement this?

EDIT: A single post can have many votes, as I would like to register which user votes on which post.

+4
source share
2 answers

You can use the data required in the ORDER BY .

 SELECT p.id, p.title, p.time_submitted, SUM(v.score) as num_votes FROM posts p, votes v WHERE v.postid = p.id GROUP BY p.id ORDER BY (SUM(v.score) - 1) / POW(TIMESTAMPDIFF(HOUR,p.time_submitted,NOW()) + INTERVAL 2 HOUR, 1.8) DESC LIMIT 100 
+4
source

In your case, the number of votes will be returned:

 SELECT count(*) FROM votes WHERE postid=<THE POST ID>; 

If you want to consider the assessment, you can include this in the request, but the formula you provided is not equipped to process it.

The hour age of an item is simply the current time subtracted from the represented time:

 SELECT HOUR(TIMEDIFF(NOW(), time_submitted)) FROM posts WHERE id=<THE POST ID>; 

This can also be done completely in SQL:

 SELECT id FROM posts ORDER BY (((SELECT count(*) FROM votes WHERE postid=posts.id) - 1) / MOD(HOUR(TIMEDIFF(NOW(), time_submitted) + INTERVAL 2 HOURS), <GRAVITY>)) LIMIT 10; 
0
source

All Articles