MySQL performance: views and functions compared to stored procedures

I have a table that stores some statistics collected per hour. Now I want to be able to quickly get statistics per day / week / month / year / total. What is the best way to make this perform reasonably? Creating views? Functions? Stored procedures? Or regular summary tables that I would have to write to at the same time as updating the data (which I would like to avoid)? My idea is to create a view_day that sums the hours, then view_week, view_month and view_year that summarize the data from view_day, and a view_total that summarizes view_year. Is this good or bad?
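For concreteness, a minimal sketch of that view chain, assuming a hypothetical hourly table stats_hourly(stat_time DATETIME, hits INT); all names here are illustrative, not part of my actual schema:

    CREATE VIEW view_day AS
    SELECT DATE(stat_time) AS stat_day, SUM(hits) AS hits
    FROM stats_hourly
    GROUP BY DATE(stat_time);

    CREATE VIEW view_month AS
    SELECT DATE_FORMAT(stat_day, '%Y-%m') AS stat_month, SUM(hits) AS hits
    FROM view_day
    GROUP BY DATE_FORMAT(stat_day, '%Y-%m');

Note that because these views contain GROUP BY, MySQL evaluates them with the TEMPTABLE algorithm, so every query against view_month still re-aggregates the raw hourly rows; the views themselves store nothing.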

+4
5 answers

You essentially have two systems here: one that collects data and one that reports on that data.

Running reports against your frequently updated transactional tables is likely to take read locks that block writes from completing, and can therefore degrade performance.

It is generally recommended to run a periodic "gathering" task that collects information from your (possibly highly normalized) transactional tables and writes it to denormalized reporting tables, forming a "data warehouse". You then point your reporting mechanism / tools at that denormalized "data warehouse", which you can query without affecting the actual transactional database.

This gathering task only needs to run as often as your reports need to be "fresh". If you can live with once a day, great. If you need it once an hour or more often, go ahead, but watch how the task affects write performance while it runs.
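A minimal sketch of such a gathering job, reusing the hypothetical stats_hourly(stat_time, hits) table from the question and an assumed summary table report_daily (neither name comes from this answer):

    CREATE TABLE report_daily (
        stat_day DATE NOT NULL PRIMARY KEY,
        hits     BIGINT NOT NULL DEFAULT 0
    );

    -- Requires the event scheduler to be enabled (event_scheduler = ON).
    CREATE EVENT ev_gather_daily
    ON SCHEDULE EVERY 1 DAY
    DO
        INSERT INTO report_daily (stat_day, hits)
        SELECT DATE(stat_time), SUM(hits)
        FROM stats_hourly
        WHERE stat_time >= CURRENT_DATE - INTERVAL 1 DAY
          AND stat_time <  CURRENT_DATE
        GROUP BY DATE(stat_time)
        ON DUPLICATE KEY UPDATE hits = VALUES(hits);

The same job could just as well be a cron script; the point is that reports read report_daily, not the transactional table.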

Remember: if the performance of your transactional system is important (and it usually is), avoid running reports against it at all costs.

+3

Yes, having tables that store already aggregated data is good practice.

Views, as well as stored procedures and functions, will simply execute the same queries against the large tables every time, which is not as efficient.
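Illustrative only, using the hypothetical tables sketched above: the first query is a primary-key lookup on the pre-aggregated table, the second has to scan and sum the raw hourly rows every time, which is what a view or stored procedure would end up doing:

    SELECT hits
    FROM report_daily
    WHERE stat_day = CURRENT_DATE - INTERVAL 1 DAY;

    SELECT SUM(hits)
    FROM stats_hourly
    WHERE stat_time >= CURRENT_DATE - INTERVAL 1 DAY
      AND stat_time <  CURRENT_DATE;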

+1

The only really fast and scalable solution is the one you mention: "regular tables that you have to write to at the same time as updating the data", with the appropriate indexes. You can automate keeping such a table up to date with triggers.
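For example, a minimal sketch of such a trigger, again assuming the hypothetical stats_hourly and report_daily tables used above (a real setup would also need UPDATE and DELETE counterparts):

    DELIMITER //
    CREATE TRIGGER trg_stats_hourly_ai
    AFTER INSERT ON stats_hourly
    FOR EACH ROW
    BEGIN
        -- Keep the daily summary in step with every new hourly row.
        INSERT INTO report_daily (stat_day, hits)
        VALUES (DATE(NEW.stat_time), NEW.hits)
        ON DUPLICATE KEY UPDATE hits = hits + NEW.hits;
    END//
    DELIMITER ;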

+1

My opinion is that complex calculations should only be done once, because the data does not change between requests. Create aggregated tables and fill them either with triggers (if no lag is acceptable) or with a task that runs once a day, once an hour, or whatever delay is acceptable for reporting. If you go the trigger route, test, test, and test again. Make sure the triggers can handle multi-row inserts / updates / deletes as well as the more usual single-row ones. Make sure they are as fast as possible and free of bugs. Triggers add a little processing to every data action, so you have to make sure they add the smallest bit possible, and that no errors occur that could prevent users from inserting / updating / deleting data.
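A sketch of the UPDATE / DELETE handling this answer warns about, again using the assumed stats_hourly / report_daily tables rather than anything from the original post:

    DELIMITER //
    CREATE TRIGGER trg_stats_hourly_au
    AFTER UPDATE ON stats_hourly
    FOR EACH ROW
    BEGIN
        -- Back out the old value, then apply the new one.
        UPDATE report_daily SET hits = hits - OLD.hits
        WHERE stat_day = DATE(OLD.stat_time);
        INSERT INTO report_daily (stat_day, hits)
        VALUES (DATE(NEW.stat_time), NEW.hits)
        ON DUPLICATE KEY UPDATE hits = hits + NEW.hits;
    END//

    CREATE TRIGGER trg_stats_hourly_ad
    AFTER DELETE ON stats_hourly
    FOR EACH ROW
    BEGIN
        UPDATE report_daily SET hits = hits - OLD.hits
        WHERE stat_day = DATE(OLD.stat_time);
    END//
    DELIMITER ;

MySQL FOR EACH ROW triggers fire once per affected row, so multi-row statements are covered, but each firing adds latency to the write, which is exactly the cost described above.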

0

We have a similar problem, and what we do is use a master / slave relationship. We keep the transactional work (both reads and writes, since in our case some reads must be very fast and cannot wait for the transaction to replicate) on the master. The slave replicates the data quickly, and we run every non-transactional query against it, including reporting.

I highly recommend this approach, as it is easy to set up as a quick and dirty data warehouse if your data is granular enough to be useful to the reporting layers / applications.
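A rough sketch of wiring up such a reporting slave; the host, user and log coordinates are placeholders, and the master must already have binary logging and a unique server_id configured in my.cnf:

    -- Run on the reporting slave.
    CHANGE MASTER TO
        MASTER_HOST = 'master.example.com',
        MASTER_USER = 'repl',
        MASTER_PASSWORD = '...',
        MASTER_LOG_FILE = 'mysql-bin.000001',
        MASTER_LOG_POS = 4;
    START SLAVE;

Reporting tools then connect only to the slave, so heavy aggregate queries never touch the master.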

0
