Popularity: how to make new hits count more than old hits?

Each product has a product_date_added, a Date field containing the date it was added. Each also has product_views, an int field holding the number of times the product has been viewed.

To display products by popularity, I use a query that calculates the average number of hits per day for each product:

    SELECT AVG(product_views / DATEDIFF(NOW(), product_date_added)) AS avg_hits,
           product_table.*
    FROM product_table
    WHERE product_available = "yes"
    GROUP BY product_id
    ORDER BY avg_hits DESC

It works, but my boss has noticed that a lot of old products show up first. So he basically wants newer views to carry more weight than older ones.

His suggestion was that any views older than a year should not be counted. I think I would have to store the date of each view to do that, which I expect would slow things down.

What is the best way to build a popularity algorithm like the one my boss is asking for?

Ideally, I would like to come up with something that does not change the structure of the table. If that is not possible, I would at least like a solution that can use the existing data, so we do not start from 0. If that is not possible either, then anything that works.

+4
2 answers

You don't need to keep the date of each view (as such). Instead, you can store up to 366 rows per item in a table with columns product_id, day_of_year, count. Every day, run a task that zeroes the counts from one year ago. If you don't mind denormalized data, this task can also update a "count" field on the item itself for fast retrieval, so that your query doesn't need to change: product_views just becomes product_views_in_the_last_year. The 1-day granularity is arbitrary. I doubt you care about popularity over a window of exactly one year, so I expect an hour, a week, or a fortnight would do just as well, depending on how many buckets you are prepared to handle.
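The bucket idea above can be sketched in a few lines. This is only an illustration under assumptions: a Python dict stands in for the (product_id, day_of_year, count) table, and the function names are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical in-memory stand-in for a (product_id, day_of_year, count) table.
buckets = defaultdict(int)  # (product_id, day_of_year) -> views on that day


def day_of_year(d: date) -> int:
    """Return 1..366, the bucket index for a calendar date."""
    return d.timetuple().tm_yday


def record_view(product_id: int, today: date) -> None:
    """Count one view in today's bucket."""
    buckets[(product_id, day_of_year(today))] += 1


def start_new_day(today: date) -> None:
    """Daily task: clear today's bucket, which still holds last year's counts."""
    doy = day_of_year(today)
    for key in [k for k in buckets if k[1] == doy]:
        del buckets[key]


def views_in_last_year(product_id: int) -> int:
    """Sum all buckets; this is the value to denormalize onto the product row."""
    return sum(n for (pid, _), n in buckets.items() if pid == product_id)
```

One caveat of this sketch: bucket 366 is only visited in leap years, so a real daily task would need to clear it explicitly in the other years.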

An alternative scheme is to use exponential decay. Change the count field to a decimal type. Once a day, reduce the count of every item by a fixed percentage (less than 1%, more than 0.1%), so that the more recent a hit is, the more "weight" it carries. That way old popularity never dies completely, but hits from a year ago will not contribute much. Incidentally, an equivalent to this scheme is to leave the code as it is, but make sure your site as a whole becomes exponentially more popular over time ;-)
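A minimal sketch of the decay scheme, assuming a rate of 0.5% per day (an arbitrary pick inside the range mentioned above) and a plain dict in place of the decimal count column:

```python
DAILY_DECAY = 0.005  # assumed: 0.5% per day, inside the 0.1%..1% range above


def record_view(scores: dict, product_id: int) -> None:
    """Each new view adds one full point to the product's score."""
    scores[product_id] = scores.get(product_id, 0.0) + 1.0


def decay_all(scores: dict, rate: float = DAILY_DECAY) -> None:
    """Daily task: shrink every score by the same fixed percentage."""
    for product_id in scores:
        scores[product_id] *= 1.0 - rate


def weight_after(days: int, rate: float = DAILY_DECAY) -> float:
    """Weight a single view still carries after `days` daily decays."""
    return (1.0 - rate) ** days
```

The rate decides how quickly old hits fade: after 365 daily decays a view keeps roughly 0.69 of its weight at 0.1% per day, roughly 0.16 at 0.5%, and roughly 0.03 at 1%.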

As for avoiding starting from scratch: perhaps reduce the count of each item immediately, as a one-time action, by a fraction that depends on the age of the item. In general, you'd expect older items to have older views, and therefore to be overrated by the current scheme. This isn't perfect, since some old items may have received a lot of hits recently. You might be able to identify those items by looking at recent web server logs, or by collecting a week or a month of counts before doing the one-time reduction. Even without doing this, if there is a fundamental reason for their popularity (not just that they currently rank highly and therefore get traffic from people looking at your "most popular" chart), then hopefully they will recover in time.
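One way to pick that age-dependent fraction, sketched here under a strong assumption that is not in the answer itself: each item's existing views arrived at an even rate over its lifetime. Both function names and the default rate are hypothetical.

```python
def seed_window_count(total_views: int, age_days: int, window_days: int = 365) -> float:
    """Estimate views inside the last `window_days`, assuming an even view rate."""
    if age_days <= window_days:
        return float(total_views)
    return total_views * window_days / age_days


def seed_decayed_score(total_views: int, age_days: int, rate: float = 0.005) -> float:
    """Estimate the decayed score an item would have, assuming an even view rate."""
    if age_days <= 0:
        return float(total_views)
    per_day = total_views / age_days
    # Geometric series: per_day * (1 + q + q^2 + ... + q^(age_days - 1)), q = 1 - rate.
    q = 1.0 - rate
    return per_day * (1.0 - q ** age_days) / rate
```

The first function seeds the one-year window scheme, the second the exponential-decay scheme. Items with a recent burst of views will be undervalued by either estimate, which is the imperfection described above.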

+2

You might want to check out this blog post. It targets App Engine, but the technique is general. The basic approach is to have a popularity score that decays exponentially and is incremented on each vote / download / whatever.
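The linked post's exact code is not reproduced here. As a rough sketch of the general idea only, one common variant stores the score together with the time of its last update and applies the decay lazily whenever a hit arrives, so no daily batch job is needed. The half-life value and all names below are assumptions.

```python
import math

HALF_LIFE_DAYS = 90.0  # hypothetical tuning value
DECAY_PER_DAY = math.log(2) / HALF_LIFE_DAYS


def decayed(score: float, last_update_day: float, now_day: float) -> float:
    """Score as it stands at `now_day`, decayed since its last update."""
    return score * math.exp(-DECAY_PER_DAY * (now_day - last_update_day))


def record_hit(score: float, last_update_day: float, now_day: float):
    """Decay the stored score to now, add one hit, return (score, timestamp)."""
    return decayed(score, last_update_day, now_day) + 1.0, now_day
```

Because stored scores refer to different moments, a ranking has to compare decayed(score, last_update_day, now_day) for every item rather than the raw stored values.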

+1
