I have an application that analyzes log files and inserts a huge amount of data into a database. It is written in Java and talks to a MySQL database over JDBC. I have experimented with various ways of inserting the data to find the fastest one for my particular use case. The one that currently seems to perform best is to issue an extended insert (that is, a single insert with multiple rows), for example:
INSERT INTO the_table (col1, col2, ..., colN) VALUES (v1, v2, v3, ..., vN), (v1, v2, v3, ..., vN), ..., (v1, v2, v3, ..., vN);
The number of rows can be tens of thousands.
I tried using prepared statements, but they are nowhere near as fast, probably because each insert is still sent to the database separately and the table has to be locked each time, and so on. My colleague who worked on the code before me tried batching, but that didn't perform well enough either.
The problem is that using extended inserts means that, as far as I can tell, I need to build the SQL string myself (since the number of rows is variable), which means that I open up all kinds of SQL injection vectors that I'm nowhere near smart enough to find myself. There must be a better way to do this.
Obviously, I escape the strings that I insert, but only with something like str.replace("\"", "\\\""); (repeated for ', ? and \), and I'm sure this is not enough.
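To make the alternative concrete, here is a rough sketch of what I imagine a safer version of the same extended insert could look like: the SQL string still has a variable number of row groups, but it contains only ? placeholders, and the values are bound through a PreparedStatement instead of being escaped by hand. The class and method names (MultiRowInsert, buildSql, insert) are made up for illustration, and the sketch assumes the table and column names are constants in my own code and never come from the log files. I have not measured whether this is as fast as the hand-built string.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of JDBC or of the MySQL driver.
final class MultiRowInsert {

    // Builds "INSERT INTO t (a, b) VALUES (?, ?), (?, ?), ..." with one
    // placeholder group per row. Table and column names are concatenated
    // as-is, so they must be trusted constants, never user-supplied data.
    static String buildSql(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        String group = "("
                + String.join(", ", Collections.nCopies(columns.size(), "?"))
                + ")";
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, group));
    }

    // Binds every value positionally and executes the single multi-row insert.
    // Returns the update count reported by the driver.
    static int insert(Connection conn, String table, List<String> columns,
                      List<Object[]> rows) throws SQLException {
        String sql = buildSql(table, columns, rows.size());
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int index = 1; // JDBC parameter indexes start at 1
            for (Object[] row : rows) {
                if (row.length != columns.size()) {
                    throw new IllegalArgumentException("row has wrong number of values");
                }
                for (Object value : row) {
                    ps.setObject(index++, value);
                }
            }
            return ps.executeUpdate();
        }
    }
}
```

With tens of thousands of rows I would presumably have to split the data into chunks and call insert once per chunk, so that a single statement stays within server limits such as max_allowed_packet.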