MySQL and a table with 100 million rows

I have several tables with over 100 million rows each, and they gain roughly 20-40 million new rows every month.

At this point everything looks great:

- all inserts are fast
- all selects are fast (they use indexes and avoid complex aggregations)

However, I'm worried about two things I've read somewhere:

- once a table holds several hundred million rows, inserts may slow down, because rebalancing the indexes (B-trees) can take a while;
- if an index no longer fits in memory, reading it from scattered locations on disk can take a while.

Any comments would be much appreciated, as would any suggestions on how I can avoid the problem, or fix / mitigate it if / when it happens.

(I know that at some point we will need to start sharding.)

Thanks in advance.

+8
mysql
2 answers

Today is the day you should start thinking about sharding or partitioning, because if you have 100MM rows today and you are gaining them at ~30MM per month, you will double the table's size in about three months, and possibly double it again before the end of the year.

At some point you will hit an event horizon where your database is too big to migrate: either you no longer have enough working space on disk to switch to an alternative schema, or you don't have enough downtime to complete the migration before the database has to be back in service. Then you're stuck with it forever as it gets slower and slower.

Write performance on a table depends heavily on how hard its indexes are to maintain. The more data you index, the more punishing writes become. The type of index matters as well; some are more compact than others. If your data is lightly indexed, you can usually sustain many more writes before things get too slow, but the degradation factor depends heavily on your system configuration, your hardware, and your I/O capacity.
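As a rough illustration of what each write has to maintain, you can list a table's indexes and compare the data footprint to the index footprint (the table name events below is just a placeholder):

```sql
-- Every secondary index listed here is an extra structure MySQL
-- must update on each INSERT/UPDATE.
SHOW INDEX FROM events;

-- Approximate on-disk size of data vs. indexes for one table.
SELECT data_length  / 1024 / 1024 AS data_mb,
       index_length / 1024 / 1024 AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'events';
```

If index_mb rivals or exceeds data_mb, every row you insert is paying a heavy indexing tax.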

Remember that InnoDB, the engine you should be using, has many tunables, and many people leave them at really terrible defaults. Look at how much memory it has been allocated and make sure that is set correctly.
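For instance, the single most important setting is the buffer pool, which caches data and index pages. A quick sketch of checking it (the value itself is set via innodb_buffer_pool_size in my.cnf, and the shipped default is far too small for a server of this size):

```sql
-- How much memory InnoDB may use to cache data and index pages.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- A broader dump of InnoDB internals, including buffer pool activity.
SHOW ENGINE INNODB STATUS\G
```

On a dedicated database server, a common rule of thumb is to give the buffer pool the majority of available RAM.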

If you have any way of segmenting this data, for example by month, by customer, or by some other factor that does not change based on business logic, i.e. the data segments are essentially unrelated, you will have many simple options; if not, you will have to make some tough decisions.
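If month is the natural dividing line, MySQL's built-in range partitioning is one of those simple options. A minimal sketch, assuming a hypothetical events table with a created_at timestamp:

```sql
-- The partitioning column must be part of every unique key,
-- hence the composite primary key.
CREATE TABLE events (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    created_at DATETIME NOT NULL,
    payload    VARCHAR(64) NOT NULL,
    PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p201201 VALUES LESS THAN (TO_DAYS('2012-02-01')),
    PARTITION p201202 VALUES LESS THAN (TO_DAYS('2012-03-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);

-- Retiring an old month becomes nearly instant, instead of a huge DELETE:
ALTER TABLE events DROP PARTITION p201201;
```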

The one thing you should do right now is simulate what your table will look like at 1G rows. Create a sufficiently large, sufficiently varied set of test data, then see how well it performs under load. You may find it's not a problem, in which case: no worries for a few more years. If it is, start panicking today and work on a solution before the data becomes too large to split.
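A rough sketch of building such a data set, with placeholder names: seed the table once, then double it repeatedly with self-inserts until it is large enough to measure:

```sql
CREATE TABLE events_test (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    created_at DATETIME NOT NULL,
    payload    CHAR(32) NOT NULL,
    KEY idx_created_at (created_at)
) ENGINE=InnoDB;

-- Seed row.
INSERT INTO events_test (created_at, payload)
VALUES (NOW(), MD5(RAND()));

-- Each run doubles the row count with varied dates and payloads;
-- about 30 runs takes one seed row past 1G rows.
INSERT INTO events_test (created_at, payload)
SELECT NOW() - INTERVAL FLOOR(RAND() * 730) DAY, MD5(RAND())
FROM events_test;
```

Then replay your real query patterns against the result and watch insert and select latency as the table grows.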

Database performance usually degrades fairly linearly, and then at some point it falls off a cliff. You need to know where that cliff is so you know how much time you have before you hit it. The dramatic degradation usually sets in when your indexes can no longer fit in memory and your disk buffers are stretched too thin to be useful.
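One way to see the cliff coming, assuming InnoDB: watch the buffer pool miss rate, which climbs as indexes stop fitting in memory:

```sql
-- Innodb_buffer_pool_reads         = logical reads that had to go to disk
-- Innodb_buffer_pool_read_requests = all logical reads
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
```

While reads / read_requests stays tiny, your working set fits in memory; a steadily rising ratio means the cliff is approaching.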

+9

I manage and process MySQL DBs with 200-300 million records per table on a daily basis. I have several databases of 1-2 TB and a large, highly available MySQL infrastructure.

However, I have also seen MySQL infrastructure collapse with record counts in the 1M range - so the question is not "can MySQL handle it?" but "am I building it right?". More than anything else, how you build your application and the queries that interact with MySQL are vital to your experience.
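As one concrete habit that pays off at this scale: never let a hot query ship without checking its plan. A sketch, with placeholder names:

```sql
-- Ask MySQL how it will execute the query before it runs at 100M rows.
EXPLAIN SELECT id
FROM events
WHERE created_at >= '2012-01-01'
ORDER BY created_at
LIMIT 10;
```

A key column naming an index with type ref or range is what you want; type ALL means a full table scan, which is exactly what collapses an installation long before MySQL itself is the limit.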

There is a lot of information available to you on SO and in the many articles at http://www.mysqlperformanceblog.com/ - look into table partitioning, sharding, replication, and indexing for starters.

Some sources for further reading:

http://www.mysqlperformanceblog.com/

How big can a MySQL database get before performance starts to degrade

http://ebiquity.umbc.edu/blogger/2007/12/28/how-youtube-scales-mysql-for-its-large-databases/

https://www.facebook.com/MySQLatFacebook

+3
