I use and work on software that uses MySQL as its back-end database server (it can also work with others such as PostgreSQL, Oracle, or SQLite, but MySQL is the one we mainly use). The software was designed so that the binary data we need to store is kept as BLOBs in separate columns: each table has one BLOB column, the other columns hold integer/float characteristics of the BLOB, and one column holds the BLOB's MD5 hash. Tables usually have 2, 3, or 4 indexes, one of which is always the MD5 column, declared UNIQUE. Some tables already have millions of records and have grown to several gigabytes in size. We keep separate MySQL databases, one per year, on a single server (so far). The hardware is reasonably decent (I think) for general applications (a Dell PowerEdge 2U rack server).
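To give an idea, the layout of one of these tables is roughly along these lines (table and column names are made up for illustration, not the real ones); in the C++ code the DDL is just a plain string handed to mysql_query():

    // Hypothetical schema: numeric characteristics, an MD5 column with a
    // UNIQUE index, and the BLOB itself stored inline in the table.
    static const char *kCreateTable =
        "CREATE TABLE samples ("
        "  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,"
        "  characteristic_a INT NOT NULL,"
        "  characteristic_b DOUBLE NOT NULL,"
        "  md5 CHAR(32) NOT NULL,"               // hex digest of the BLOB
        "  data LONGBLOB NOT NULL,"
        "  UNIQUE KEY uq_md5 (md5),"
        "  KEY idx_characteristic_a (characteristic_a)"
        ")";
    // if (mysql_query(conn, kCreateTable) != 0)
    //     fprintf(stderr, "CREATE TABLE failed: %s\n", mysql_error(conn));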
MySQL SELECT queries are reasonably fast. That is not much of a complaint, since they (most of the time) run in batch mode. INSERT queries, however, take a long time, and the time grows with the size of the table (the number of rows). Presumably this is because the MD5 column is UNIQUE, so each INSERT has to check whether the new row's MD5 already exists in the table. And it is not too surprising (I think) that performance degrades further when there are other (non-unique) indexes. Still, I can't shake the feeling that this software architecture (I suspect storing the BLOBs inside the tables rather than on disk has a significant negative impact) was not the best choice. The inserts are not critical, but it is an annoying feeling.
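For reference, an insert is done with a prepared statement roughly like this (again with made-up names and most error handling trimmed); the duplicate check against the UNIQUE MD5 index is what I suspect dominates as the table grows:

    #include <mysql/mysql.h>
    #include <cstring>
    #include <string>
    #include <vector>

    // Sketch of one insert: bind the MD5 hex string, a numeric characteristic
    // and the BLOB itself, then execute. ER_DUP_ENTRY (1062) means a row with
    // the same MD5 is already present.
    bool insert_row(MYSQL *conn, const std::string &md5_hex,
                    double characteristic, const std::vector<char> &blob)
    {
        const char *sql =
            "INSERT INTO samples (md5, characteristic_b, data) VALUES (?, ?, ?)";

        MYSQL_STMT *stmt = mysql_stmt_init(conn);
        if (!stmt || mysql_stmt_prepare(stmt, sql, std::strlen(sql)) != 0)
            return false;

        MYSQL_BIND bind[3];
        std::memset(bind, 0, sizeof(bind));

        unsigned long md5_len = md5_hex.size();
        bind[0].buffer_type   = MYSQL_TYPE_STRING;
        bind[0].buffer        = const_cast<char *>(md5_hex.c_str());
        bind[0].buffer_length = md5_len;
        bind[0].length        = &md5_len;

        bind[1].buffer_type = MYSQL_TYPE_DOUBLE;
        bind[1].buffer      = &characteristic;

        unsigned long blob_len = blob.size();
        bind[2].buffer_type   = MYSQL_TYPE_LONG_BLOB;
        bind[2].buffer        = const_cast<char *>(&blob[0]);
        bind[2].buffer_length = blob_len;
        bind[2].length        = &blob_len;

        bool ok = mysql_stmt_bind_param(stmt, bind) == 0 &&
                  mysql_stmt_execute(stmt) == 0;

        bool duplicate = (!ok && mysql_stmt_errno(stmt) == 1062 /* ER_DUP_ENTRY */);

        mysql_stmt_close(stmt);
        return ok || duplicate;   // a duplicate MD5 is simply skipped
    }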
Does anyone have experience with situations like this? With MySQL, or even with other (preferably Linux-based) RDBMSes? Any insights you would like to share, perhaps some performance figures?
BTW, the working language is C++ (which wraps calls to the MySQL C API).
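The wrapper is essentially a thin RAII layer over the C handle, something along these lines (simplified, with placeholder connection parameters; the real layer also covers queries and prepared statements):

    #include <mysql/mysql.h>
    #include <stdexcept>
    #include <string>

    // Minimal sketch of the C++ glue over the MySQL C API.
    class MysqlConnection {
    public:
        MysqlConnection(const std::string &host, const std::string &user,
                        const std::string &pass, const std::string &db)
            : handle_(mysql_init(NULL))
        {
            if (!handle_)
                throw std::runtime_error("mysql_init failed");
            if (!mysql_real_connect(handle_, host.c_str(), user.c_str(),
                                    pass.c_str(), db.c_str(), 0, NULL, 0)) {
                mysql_close(handle_);
                throw std::runtime_error("mysql_real_connect failed");
            }
        }
        ~MysqlConnection() { mysql_close(handle_); }

        MYSQL *get() { return handle_; }

    private:
        MYSQL *handle_;
        MysqlConnection(const MysqlConnection &);            // non-copyable
        MysqlConnection &operator=(const MysqlConnection &);
    };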
performance database mysql insert indexing