MySQL execution speed of MAX(), MIN(), SUM() on a relatively large database

I have a relatively large database (130,000+ rows) of weather data that grows quickly (a new row is added every 5 minutes). On my site I publish min/max data for the current day, as well as for the entire period my weather station has been running (about 1 year).

Now I would like to know whether I would benefit from creating additional tables to store this min/max data, rather than having PHP run a MySQL query that searches for the day's min/max and the min/max over the whole lifetime of the station. Would a MAX(), MIN() or SUM() query (SUM() is needed to total the rain accumulation over several months) take much longer than a simple query against a table that already contains those min, max and sum values?
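For reference, here is a minimal sketch of the kind of queries I mean, assuming a hypothetical weather table with recorded_at, temperature and rainfall columns (names are placeholders, not my real schema):

    -- Min/max temperature for the current day
    SELECT MIN(temperature), MAX(temperature)
    FROM weather
    WHERE recorded_at >= CURDATE();

    -- Min/max temperature over the station's whole history
    SELECT MIN(temperature), MAX(temperature)
    FROM weather;

    -- Rain accumulation over several months
    SELECT SUM(rainfall)
    FROM weather
    WHERE recorded_at >= '2011-01-01';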

+6
optimization php mysql
3 answers

It depends on whether your columns are indexed. In the case of MIN() and MAX(), you can read the following in the MySQL manual:

MySQL uses indexes for these operations:

To find the MIN() or MAX() value for a specific indexed column key_col. This is optimized by a preprocessor that checks whether you are using WHERE key_part_N = constant on all key parts that occur before key_col in the index. In this case, MySQL does a single key lookup for each MIN() or MAX() expression and replaces it with a constant.

In other words, if your columns are indexed, you are unlikely to get a big performance benefit from denormalizing. If they are NOT, you definitely will.
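As a concrete illustration (using the hypothetical weather table and column names from the question, so adjust for your schema): once the aggregated column is indexed, MySQL can answer an unqualified MIN()/MAX() from the index ends alone, and EXPLAIN typically reports "Select tables optimized away" for it.

    -- Hypothetical index on the aggregated column
    ALTER TABLE weather ADD INDEX idx_temperature (temperature);

    -- With the index in place, this no longer scans the table
    EXPLAIN SELECT MIN(temperature), MAX(temperature) FROM weather;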

As for SUM(), it will most likely be faster on an indexed column, but I'm not sure how big the gain would be.
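One possibility, sketched here with the same hypothetical table: a composite index that covers both the date filter and the summed column lets MySQL answer the query from the index alone (EXPLAIN shows "Using index"). Whether that beats a plain scan noticeably depends on your row size and date range.

    -- Hypothetical covering index: filter column plus summed column
    ALTER TABLE weather ADD INDEX idx_date_rainfall (recorded_at, rainfall);

    -- Can be answered from the index alone ("Using index" in EXPLAIN)
    SELECT SUM(rainfall)
    FROM weather
    WHERE recorded_at BETWEEN '2011-03-01' AND '2011-06-01';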

Please note that you shouldn't rush out and index all your columns after reading this post. If you add indexes, your UPDATE and INSERT queries will slow down!

Cheerz!

+3

Yes, denormalization in this case should help performance.

There is nothing wrong with storing calculated values for historical data that will never change, if it gets you a performance benefit.
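One possible sketch of that approach (table and column names are placeholders, assuming the weather table from the question): keep a small summary table of per-day aggregates and refill it once a day, e.g. from a cron job, so the site only ever reads a handful of precomputed rows.

    -- Hypothetical summary table holding one row per day
    CREATE TABLE daily_summary (
        day          DATE PRIMARY KEY,
        min_temp     DECIMAL(5,2),
        max_temp     DECIMAL(5,2),
        rainfall_sum DECIMAL(7,2)
    );

    -- Recompute yesterday's row once per day
    REPLACE INTO daily_summary (day, min_temp, max_temp, rainfall_sum)
    SELECT DATE(recorded_at), MIN(temperature), MAX(temperature), SUM(rainfall)
    FROM weather
    WHERE recorded_at >= CURDATE() - INTERVAL 1 DAY
      AND recorded_at <  CURDATE()
    GROUP BY DATE(recorded_at);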

+3

While I agree with RedFilter that there is nothing wrong with storing historical data, I disagree about the performance improvement you will see. Your database is not what I would consider heavily used.

One of the main advantages of databases is indexes. They use sophisticated data structures for fast access to data. Note that every primary key already has an index. You should not be afraid of them. Of course, it would probably be counterproductive to index every one of your fields, but that should never be necessary anyway. I would suggest exploring indexes a bit more to find the right balance, for example with the queries shown below.
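A quick way to do that exploring (again using the hypothetical weather table and column names, so substitute your own): list what indexes the table already has, and ask MySQL how a given query would use them.

    -- List the indexes currently defined on the table
    SHOW INDEX FROM weather;

    -- See whether the daily min/max query can use one of them
    EXPLAIN SELECT MIN(temperature), MAX(temperature)
    FROM weather
    WHERE recorded_at >= CURDATE();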

As for the work done when the data changes, it's not that bad. An index is a tree-like representation of your column's data. It is built so that a search reduces to a small number of simple binary decisions.

For example, think about finding a number between 1 and 100. Normally you would either guess numbers at random, or just start at 1 and count up. That is slow. Instead, it is much faster if you set things up so that after each guess you are told whether the target is higher or lower. Then you start at 50 and ask whether you are over or under. Over, so you pick 75, and so on until you find the number. Instead of possibly going through all 100 numbers, you only need to check about seven to find the right one.

The problem comes when you add 50 more numbers and the range becomes 1 to 150. If you still start at 50, your search is no longer optimal, since there are now far more numbers above your starting point than below it; your binary search is unbalanced. So what you do is rebalance the search by starting from the middle again, namely 75.

So the work the database does on a change is just the adjustment needed to keep the middle of the index balanced. It's really not a lot of work. If you were working with a very large database that needed many changes per second, you would definitely need a solid strategy for your indexes. In a small database that receives very few changes, like yours, it isn't a problem.

+1
