How do you do extended inserts using JDBC without building SQL strings?

I have an application that analyzes log files and inserts a huge amount of data into a database. It is written in Java and talks to a MySQL database via JDBC. I experimented with various ways of inserting the data to find the fastest one for my particular use case. The one that currently seems to be the best performer is to issue an extended insert (i.e., a single INSERT with multiple rows), for example:

INSERT INTO the_table (col1, col2, ..., colN) VALUES (v1, v2, v3, ..., vN), (v1, v2, v3, ..., vN), ..., (v1, v2, v3, ..., vN); 

The number of rows can be tens of thousands.

I tried using prepared statements, but it wasn't anywhere close to as fast, probably because each insert is still sent to the database separately, so the table has to be locked each time, and so on. A colleague who worked on the code before me tried using batching, but that didn't perform well either.

The problem is that using extended inserts means that, as far as I can tell, I need to build the SQL string myself (since the number of rows is variable), which opens up all sorts of SQL injection vectors that I'm nowhere near clever enough to find myself. There has to be a better way to do this.

Obviously I escape the strings that I insert, but only with something like str.replace("\"", "\\\""); (repeated for ', ? and \), and I'm sure that is not enough.
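One way to keep the single multi-row statement without concatenating any values is to generate only the placeholder list dynamically and let the driver handle all escaping. A minimal sketch (the helper name, table, and columns are hypothetical; table and column names must come from your own code, never from user input):

```java
public class ExtendedInsertBuilder {

    // Builds "INSERT INTO tbl (c1, c2) VALUES (?, ?), (?, ?), ..."
    // with one placeholder group per row. Only SQL *structure* is
    // concatenated here; every value is bound later via the driver.
    public static String buildSql(String table, String[] cols, int rowCount) {
        String placeholders = "(" + "?, ".repeat(cols.length - 1) + "?)";
        StringBuilder sb = new StringBuilder("INSERT INTO ").append(table)
                .append(" (").append(String.join(", ", cols))
                .append(") VALUES ");
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) sb.append(", ");
            sb.append(placeholders);
        }
        return sb.toString();
    }
}
```

You would then call con.prepareStatement(buildSql(...)), bind each value with setObject() in a loop over a running parameter index, and execute once per chunk of rows.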

4 answers

Prepared statements + batch insert:

 PreparedStatement stmt = con.prepareStatement(
     "INSERT INTO employees VALUES (?, ?)");

 stmt.setInt(1, 101);
 stmt.setString(2, "Paolo Rossi");
 stmt.addBatch();

 stmt.setInt(1, 102);
 stmt.setString(2, "Franco Bianchi");
 stmt.addBatch();

 // as many as you want
 stmt.executeBatch();
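Worth noting for MySQL specifically: a batch like the one above only becomes a single multi-row insert on the wire if Connector/J is told to rewrite it, via its documented rewriteBatchedStatements connection option. A configuration sketch (host, database, and credentials are placeholders):

```java
// Hypothetical connection URL. With rewriteBatchedStatements=true,
// Connector/J collapses addBatch()/executeBatch() into a single
// multi-row INSERT, which is usually what makes JDBC batching
// competitive with hand-built extended inserts.
String url = "jdbc:mysql://localhost:3306/logdb?rewriteBatchedStatements=true";
Connection con = DriverManager.getConnection(url, "user", "password");
```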

I would try batching your inserts and see how that performs.

Read this ( http://www.onjava.com/pub/a/onjava/excerpt/javaentnut_2/index3.html?page=2 ) for more information on batch processing.


If you are loading tens of thousands of records, you are probably better off using a bulk loader.

http://dev.mysql.com/doc/refman/5.0/en/load-data.html
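If the data is already in (or can be written to) a delimited file, the LOAD DATA statement can be issued through a plain JDBC Statement. A sketch that only builds the statement text (the class name, file path, table name, and CSV layout are assumptions; LOCAL also requires enabling allowLoadLocalInfile on the Connector/J connection):

```java
public class LoadDataSketch {

    // Builds a LOAD DATA LOCAL INFILE statement for a comma-separated,
    // double-quoted, newline-terminated file. The path comes from the
    // application itself, not from user input, so concatenating it here
    // does not open an injection vector.
    public static String buildLoadData(String path, String table) {
        return "LOAD DATA LOCAL INFILE '" + path + "' INTO TABLE " + table
                + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
                + " LINES TERMINATED BY '\\n'";
    }
}
```

You would then run it with stmt.execute(LoadDataSketch.buildLoadData("/tmp/rows.csv", "the_table")); writing the rows to a temp file first and letting the server parse them is often faster than any row-by-row insert path.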


Regarding the difference between extended inserts and batched single inserts: the reason I decided to use extended inserts is that I noticed it took my code much longer to insert many rows than it took mysql from the terminal. This was even though I was batching the inserts in batches of 5000. The end result was to use extended inserts.

I quickly put this theory to the test.

I took two dumps of a table with 1.2 million rows, one using the extended insert statements that you get with mysqldump by default, and the other obtained with:

 mysqldump --skip-extended-insert 

Then I simply imported each file into a fresh table and timed it.

The extended insert test finished in 1m35s, the other in 3m49s.

