How to track your views in the most efficient way?

I have this blog system (LAMP), and I would like to track the number of views of each article. Now is it better to update the column of article views every time an article is viewed, or use some temp table, where I would save only the article identifier, and then (let it run every hour) a query that will receive data from the temp table and update the rows in the table articles? I am open to completely different solutions.

Please note that I cannot use analytics tools, since I need to work with these numbers (the most popular, etc.).

+4
source share
3 answers

Updating the article table every time you read the article means a big lock on this table (or rows, depending on the engine you are using).

Using a temporary table may be the best solution, in my opinion:

  • either make a raw insert each time an article is viewed, and without updating
  • or update the counter on the article in this temporary table
  • or (if you use an engine like InnoDB that supports row locks and doesn't use table locks), use something like 100 rows per article and update one of them randomly every time the article is viewed
    • this way you will have less concurrency on locks (if you have 5 users reading the same article at the same time, there are not so many risks that they will try to update the same line among 100!)
    • just remember that you have to sum the values โ€‹โ€‹of 100 lines in each article to get the total amount when you want to count how many times the article has been viewed.

The last solution, perhaps the best in terms of concurrency, is again if you are using an engine that supports row locks (i.e. not MyISAM).

And, from time to time, run the cron task, which will be calculated from this temp table, and update the article table.

+2
source

Could this be a reason for premature optimization? Before proceeding to the extreme split tables and completing the cron jobs, I would make sure that a simple approach is a problem when configured correctly.

In addition, your problem is a write lock conflict, by writing to another table, you simply transferred this statement to this table and will have the same lock.

I would suggest:

  • make your reads without locks (NOLOCK), and only your records with locks. Thus, you block simultaneous updates to the number of views, and do not read these articles.
  • If this is not so good, and you can live with a certain amount of loss in case of red dots, update the number of views asynchronously and do not wait until he returns to show the page.

(In the red case of losing the number of views, I mean the cases when the asynchronous recording failed after you put the page because your database went down immediately after the article was read, but before the number of views was updated)

+2
source

The โ€œmost effective wayโ€ is highly subjective; You will need to tell us about your specific performance problem.

I would probably look at the pages (on each web server in the farm) in the local log file (of course, of course), then I should have a process that rotates periodically and sums it up in the database (of course, the parallel access descriptor that remains for exercises for exercise).

The adder calculates the number of views of each article in the log file for a certain period (say, run every minute or two), and then do it in one transaction, but many updates are needed, one for each article. This will probably not cause too many problems, since you will only consider one process on each web server that performs one transaction per minute (or 2, 5 or any number), and not one per web request. The database will have less load.

+1
source

All Articles