Best database for a high insert rate (10,000+ inserts/hour) and a low read rate (10 reads per second)?

I am developing a web application and am currently using SQL Server 2008 for it, but I am considering moving to another database (SimpleDB) to improve performance.

I have a background process that inserts up to 10,000 rows every hour into one specific table. This table is also read to display data in the web application. While the background process is running, the web application is unusable because the database connections time out.

As a result, I am considering switching to Amazon SimpleDB to improve performance. Is Amazon SimpleDB suited to this use case? If not, is there another solution I could use?

performance sql database sql-server amazon-simpledb
4 answers

Your problem is the isolation level you are running at. Unless you change it, SQL Server (and many other databases) operate in a mode where reads are blocked by uncommitted writes. You want to switch SQL Server to use MVCC instead (the default in Oracle; MySQL and SQL Server both support it), and your problem will go away.

From SET TRANSACTION ISOLATION LEVEL (Transact-SQL):

READ COMMITTED

Specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in non-repeatable reads or phantom data. This option is the SQL Server default.

The behavior of READ COMMITTED depends on the setting of the READ_COMMITTED_SNAPSHOT database option:

  • If READ_COMMITTED_SNAPSHOT is set to OFF (the default), the Database Engine uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.
  • If READ_COMMITTED_SNAPSHOT is set to ON, the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.

When the READ_COMMITTED_SNAPSHOT database option is ON, you can use the READCOMMITTEDLOCK table hint to request shared locking instead of row versioning for individual statements in transactions running at the READ COMMITTED isolation level.

(Emphasis mine.)

Change your database configuration to turn READ_COMMITTED_SNAPSHOT ON.
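
A minimal sketch of that change in T-SQL, assuming a placeholder database name YourDb (not from the question); open connections can block the ALTER, so briefly forcing single-user mode is a common workaround:

    -- Switch READ COMMITTED to row versioning (MVCC-style reads).
    -- WITH ROLLBACK IMMEDIATE kicks out open connections so the ALTER can proceed.
    ALTER DATABASE YourDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON;
    ALTER DATABASE YourDb SET MULTI_USER;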

In addition, try to minimize the duration of your transactions, and make sure the background process (the one doing 10,000 inserts per hour) actually commits its transaction, because if it never commits, SELECTs will block forever (under the default setting).
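
As an illustration of keeping the writer's transactions short, here is a hedged sketch that drains a hypothetical staging table dbo.ReadingsStage into a hypothetical target table dbo.Readings in small, separately committed batches (table names and batch size are illustrative only):

    -- Commit after every batch so locks are released and readers can get through,
    -- instead of holding one transaction open for the whole hourly load.
    DECLARE @BatchSize int = 1000;
    DECLARE @Moved int = 1;

    WHILE @Moved > 0
    BEGIN
        BEGIN TRANSACTION;

        DELETE TOP (@BatchSize) FROM dbo.ReadingsStage
        OUTPUT deleted.SensorId, deleted.ReadingValue, deleted.ReadAt
            INTO dbo.Readings (SensorId, ReadingValue, ReadAt);

        SET @Moved = @@ROWCOUNT;
        COMMIT TRANSACTION;   -- the point: one short transaction per batch
    END;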


As others have said, the amount of data you are writing to the database is not a problem. SQL Server can easily handle much more than this. Personally, I have tables that take hundreds of thousands to millions of rows per hour with no problems, and people read from them all day without any slowdown.

  • You may want to allow dirty reads, either by changing the isolation level of the statements doing the reading or by using the WITH (NOLOCK) hint (see the sketch after this list).

  • You should look at using the bulk-load class in .NET (SqlBulkCopy) to load your data into the database. Use batches of 1,000-5,000 rows depending on the performance you see during testing; you will need to play with the number to find what works best. Bulk inserting the data will give you dramatically better performance than inserting the records row by row. Make sure you do not do the entire load in a single transaction; use one transaction per batch.

  • What does the disk I/O look like while you are writing to the database?

  • What recovery model have you set for the database? The FULL recovery model requires much more I/O than the SIMPLE recovery model. Only use FULL recovery if you actually need the point-in-time restore that comes with it.
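
As a rough illustration of the first and last bullets above (table, index, and database names are placeholders, not from the question):

    -- Read without taking shared locks; dirty reads become possible.
    SELECT TOP (100) SensorId, ReadingValue, ReadAt
    FROM dbo.Readings WITH (NOLOCK)
    ORDER BY ReadAt DESC;

    -- Or change the isolation level for the reading session instead of a per-table hint:
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    -- Check the current recovery model and, if point-in-time restore is not needed,
    -- switch to SIMPLE to reduce log I/O.
    SELECT name, recovery_model_desc FROM sys.databases WHERE name = 'YourDb';
    ALTER DATABASE YourDb SET RECOVERY SIMPLE;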


Fewer than 3 inserts per second is not going to give any DBMS a workout unless the amount of data in each insert operation is phenomenal. Likewise, 10 reads per second is unlikely to overtax any competent DBMS unless there is some complicating factor you have not mentioned (such as "the reads are aggregates of aggregates over the whole DBMS, which will accumulate billions of records after a period of... well, 100,000 hours for the first billion records at this rate, which is about 4,000 days, or roughly 11 years").


In addition to Joel's answer, you may want to set appropriate values for PAD_INDEX and FILLFACTOR on your indexes. If you have not specified these options, your inserts may be causing a lot of page splits in your indexes, which will slow down your write times significantly.
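
A sketch of setting those options, assuming a hypothetical index IX_Readings_ReadAt on a hypothetical table dbo.Readings; the fill factor of 80 is illustrative, not a recommendation for this particular workload:

    -- Leave 20% free space on leaf pages (FILLFACTOR = 80) and apply the same padding
    -- to the intermediate index pages (PAD_INDEX = ON) to reduce page splits on insert.
    ALTER INDEX IX_Readings_ReadAt ON dbo.Readings
        REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON);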

