Popular today, this week, this month - design pattern.

I have a system that displays entries sorted by one of three popularity scores: most popular today, this week, and this month. Each time an entry is viewed, its score is increased by 1, which changes the ordering.

So, if entry 1 is new and is viewed 10 times today, its scores will be:

Today: 10 Week: 10 Month: 10 

Current solution

Currently, I have just 3 fields associated with each entry: one for today, one for this week, and one for this month. Each time an entry is viewed, all three scores are incremented by 1.

At the end of each day, the day score is reset to 0. At the end of each calendar week, the week score is reset to 0, and at the end of each calendar month, the month score is reset to 0.
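For reference, a minimal sketch of this scheme (class and member names are illustrative, not my actual code):

    // Minimal sketch of the current scheme; names are illustrative.
    class EntryScores
    {
        public int Today;
        public int ThisWeek;
        public int ThisMonth;

        // Called on every view.
        public void RecordView()
        {
            Today++;
            ThisWeek++;
            ThisMonth++;
        }

        // Called by scheduled jobs at the respective period boundaries.
        public void ResetDay()   { Today = 0; }
        public void ResetWeek()  { ThisWeek = 0; }
        public void ResetMonth() { ThisMonth = 0; }
    }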

Problem

Although this works and uses little space, it is not ideal for two reasons:

1) Resetting a score to 0 at the end of its period wipes out the ranking all at once: at 00:00:00 every day all the daily scores are reset to 0, and the same happens at the end of the week and the end of the month. At 00:00:00 on the 1st day of every month, every monthly score is set to 0 and all the existing ranking data is lost.

2) Since the end of the month usually falls partway through a week (Mon-Sun), resetting the monthly score mid-week means the weekly score can end up exceeding the monthly score.

Possible Solution

I could keep a rolling counter for each hour of the month and compute the scores for the current day, week, and month from the counters relative to the current hour.

 Array size = 31 * 24 = 744 int16 values 

So, a view at 4 a.m. on the 1st of the month would be recorded in hours[4]:

 hours[4]++ 

The Today score would then be computed as the sum of the last 24 values, the This Week score as the sum of the last 24 * 7 values, and the This Month score as the sum of the last 24 * 31 values.
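Roughly, something like this sketch (names and indexing are illustrative; zeroing a slot when the clock first re-enters it is glossed over here):

    // Rough sketch of the rolling hourly counters; names and indexing are illustrative.
    using System;

    class HourlyScores
    {
        private readonly short[] hours = new short[31 * 24]; // 744 int16 slots

        // Maps a point in time to its slot: (dayOfMonth - 1) * 24 + hourOfDay.
        private static int SlotFor(DateTime t) { return (t.Day - 1) * 24 + t.Hour; }

        // Increments the current hour's slot. (Clearing a slot when the clock
        // first re-enters it, so it no longer holds last month's count, is omitted.)
        public void RecordView(DateTime now) { hours[SlotFor(now)]++; }

        // Sums the last n hourly slots, walking backwards and wrapping around.
        private int SumLastHours(DateTime now, int n)
        {
            int slot = SlotFor(now), total = 0;
            for (int i = 0; i < n; i++)
                total += hours[(slot - i + hours.Length) % hours.Length];
            return total;
        }

        public int Today(DateTime now)     { return SumLastHours(now, 24); }
        public int ThisWeek(DateTime now)  { return SumLastHours(now, 24 * 7); }
        public int ThisMonth(DateTime now) { return SumLastHours(now, 24 * 31); }
    }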

Problem with this solution

The main problem with this solution is the disk/memory requirement. I go from storing 3 32-bit values per entry in my current solution to 744 values. Even if I make them 16-bit, I am still using far more memory per entry:

 Memory per entry = 3 * 4 bytes   = 12 bytes    (existing solution)
 Memory per entry = 744 * 2 bytes = 1,488 bytes (possible solution)

With this solution, my memory usage per entry jumps to 12,400% of what it was!

Can anyone suggest another approach that addresses the problems with my current solution, but without using 1.5 KB per entry?

Many thanks!

+7
c# algorithm design-patterns
3 answers

This is actually a common problem: how to aggregate data efficiently while still keeping all the information you need.

First of all: have you actually tried your proposed approach? Did you really run out of memory? Your solution seems reasonable.

How I would do it

I assume that you are using a database to store data.

I would create two separate tables: one for hourly and one for daily statistics. Each entry would have exactly 24 rows in the hourly table, one for each hour. To update a specific row you only need to know the hour (0-23) and the entry_id, e.g. (with an illustrative table name) UPDATE hourly_stats SET count = count + 1 WHERE hour = 11 AND entry_id = 18164;

 entry_id (foreign key) | hour (integer) | count (integer)
 -----------------------+----------------+----------------
                      1 |              0 |             123
                      1 |              2 |            1712
 ...

The daily statistics would either be calculated around midnight (or whenever the application is least busy) or summed on demand. Either way, once a day the hourly data should be summed up and the total inserted into the daily statistics table.

 entry_id (foreign key) | day (date)  | count (integer)
 -----------------------+-------------+----------------
                      1 | 2013-07-03  |           54197
                      1 | 2013-07-04  |           66123
 ...

Rows older than 31 (30/29/28) days can then be deleted. Or kept, if you also need overall or yearly statistics.
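A rough sketch of that nightly rollup (SQL Server via plain ADO.NET; hourly_stats and daily_stats are illustrative table names, and I assume the hourly counters are zeroed once they have been rolled up):

    // Rough sketch of the nightly rollup job; table and column names are illustrative.
    using System;
    using System.Data.SqlClient;

    static class DailyRollup
    {
        public static void RollUpYesterday(SqlConnection conn)
        {
            DateTime day = DateTime.Today.AddDays(-1);

            // Sum yesterday's 24 hourly rows into one daily row per entry,
            // reset the hourly counters, and trim daily rows older than 31 days.
            using (var cmd = new SqlCommand(
                @"INSERT INTO daily_stats (entry_id, [day], [count])
                  SELECT entry_id, @day, SUM([count]) FROM hourly_stats GROUP BY entry_id;
                  UPDATE hourly_stats SET [count] = 0;
                  DELETE FROM daily_stats WHERE [day] < @cutoff;", conn))
            {
                cmd.Parameters.AddWithValue("@day", day);
                cmd.Parameters.AddWithValue("@cutoff", day.AddDays(-31));
                cmd.ExecuteNonQuery();
            }
        }
    }

The on-demand Today score is then a simple SUM over the entry's 24 hourly rows, and This Week / This Month a SUM over its last 7 / 31 daily rows.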

Benefits

  • you store less data than with full hourly statistics: 24 + 31 rows per entry
  • sums over the hourly table should be fast if it is indexed on entry_id and hour
  • less memory is used than in your proposed solution

Disadvantages

  • additional scripts / triggers / jobs are needed to update the daily statistics
  • more work is needed to implement it than your solution
+5

One simple solution would be:

 Use an array of 31 values.
 Today = the last value.
 This Week = the sum of the last 7 values.
 This Month = the sum of the last 31 values.
 At the end of each day, shift the whole array by 1 to make room for the new day's value.

Regarding your comment,

 Use another array of size 24 (Array2) to store the per-hour visit counts.
 Today = the sum of all elements of Array2.
 This Week = the sum of the last 7 values of Array1.
 This Month = the sum of all elements of Array1.
 At the end of each day, shift Array1 by 1 to make room for the new value:
 last day's visit count = the sum of all elements of Array2.
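A small sketch of this two-array scheme in C# (names are illustrative):

    // Sketch of the two-array scheme; names are illustrative.
    using System;
    using System.Linq;

    class RollingScores
    {
        private readonly int[] array1 = new int[31]; // daily totals, array1[30] = most recent full day
        private readonly int[] array2 = new int[24]; // per-hour counts for the current day

        public void RecordView(DateTime now) { array2[now.Hour]++; }

        public int Today     { get { return array2.Sum(); } }              // all of Array2
        public int ThisWeek  { get { return array1.Skip(31 - 7).Sum(); } } // last 7 values of Array1
        public int ThisMonth { get { return array1.Sum(); } }              // all of Array1

        // Run once at the end of each day.
        public void EndOfDay()
        {
            Array.Copy(array1, 1, array1, 0, 30);   // shift Array1 by one slot
            array1[30] = array2.Sum();              // last day's visit count = sum of Array2
            Array.Clear(array2, 0, array2.Length);  // start a fresh day
        }
    }

That brings storage down to 31 + 24 values per entry.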
+2

Maybe some decay can help. You would need 6 variables: Today, Yesterday, ThisWeek, LastWeek, ThisMonth, LastMonth.

The final rating (the daily one, for example) can then be defined as: Today + Yesterday * attenuation( current_time - start_of_the_day ).

Here attenuation(t) is something like 1 / (1 + k * t)^2, where k can be tuned depending on how quickly you want yesterday's rating to fade away.

UPDATE: suppose a new entry was viewed 123 times during its first day, and let's measure time in seconds to get concrete numbers. At 23:59 the entry's rating would be 123 + 0 * 1 / (1 + k * 86340)^2 = 123.

At midnight, the Today counter becomes Yesterday:

 0 + 123 * 1 / ( 1 + k * 0)^2 = 123 

Suppose that by noon the entry receives 89 more views:

 89 + 123 * 1 / ( 1 + k * 43200 )^2 = ? 

OK, this is the right moment to choose k. If we want old views to fade fourfold in 12 hours, k would be 1/43200; if we want them to fade a hundredfold, 9/43200. In the latter case:

 89 + 123 * 1 / ( 1 + 9 )^2 = 90.23 

And so on until 23:59. Say the entry picks up 60 more views:

 149 + 123 * 1 / ( 1 + (9/43200) * 86340 )^2 ~= 149.34 

So yesterday's views have almost completely lost their influence on the rating after 24 hours. Of course, you can play with k, or with the attenuation formula as a whole, to best suit your needs. This is just an example.
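A small sketch of the daily pair only, using the squared attenuation and the k from the example above (names are illustrative; the weekly and monthly pairs would work the same way with a suitably scaled k):

    // Sketch of the decayed daily rating; names and the value of k are illustrative.
    using System;

    class DecayedDailyScore
    {
        private const double K = 9.0 / 43200.0; // fades yesterday's views ~100x over 12 hours

        public int Today;     // views so far today
        public int Yesterday; // views during the previous day

        public void RecordView() { Today++; }

        // rating = Today + Yesterday * attenuation(seconds since the start of the day)
        public double Rating(DateTime now)
        {
            double t = (now - now.Date).TotalSeconds;
            double attenuation = 1.0 / Math.Pow(1.0 + K * t, 2);
            return Today + Yesterday * attenuation;
        }

        // Run at midnight: the Today counter becomes Yesterday and restarts at 0.
        public void RollOverDay()
        {
            Yesterday = Today;
            Today = 0;
        }
    }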

+2
