In my application, I need to significantly improve insert performance. Example: a file containing about 21K records takes more than 100 minutes to insert. There are reasons why this can take some time, say 20 minutes or so, but more than 100 minutes is too long.
Data is inserted into 3 tables (a many-to-many relationship). Identifiers are generated from a sequence; I have already googled around and set hibernate.id.new_generator_mappings = true and the generator's increment size (plus the matching sequence increment) to 1000.
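For reference, a minimal sketch of the kind of mapping this describes (entity and sequence names are invented for illustration; with hibernate.id.new_generator_mappings = true, an allocationSize above 1 makes Hibernate use a pooled optimizer, so it only hits the sequence once per block of ids):

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

// Hypothetical entity mapping matching the description: the allocationSize
// (1000) matches the database sequence's INCREMENT, so Hibernate fetches a
// new id block only once per 1000 inserts instead of once per row.
@Entity
public class ImportedRecord {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "rec_seq")
    @SequenceGenerator(name = "rec_seq", sequenceName = "rec_seq", allocationSize = 1000)
    private Long id;

    // ... other fields omitted
}
```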
Also, the amount of data is not at all unusual: the file is 90 MB.
I checked with VisualVM that most of the time is spent in the JDBC driver (PostgreSQL) and in sleeping/waiting. I think the problem is the unique constraint on the child table. The service layer performs a manual check (= SELECT) before each insert; if the record already exists, it is reused instead of letting the constraint violation be thrown.
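A rough sketch of that check-then-insert pattern, assuming a hypothetical Child entity with a unique natural key (the actual service code is not shown in the question):

```java
import javax.persistence.EntityManager;

// Hypothetical sketch of the service-level check described above.
public class ChildLookupService {

    private final EntityManager em;

    public ChildLookupService(EntityManager em) {
        this.em = em;
    }

    // One SELECT per record before any INSERT: reuse the existing row if the
    // unique key is already present, otherwise persist a new one, so the
    // unique constraint is never actually violated.
    public Child findOrCreate(String uniqueKey) {
        return em.createQuery(
                        "select c from Child c where c.uniqueKey = :key", Child.class)
                .setParameter("key", uniqueKey)
                .getResultStream()
                .findFirst()
                .orElseGet(() -> {
                    Child c = new Child(uniqueKey);
                    em.persist(c);
                    return c;
                });
    }
}
```

This is the per-record SELECT round trip that the profiling seems to point at.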
So, to summarize: for this particular file there will be 1 insert per record per table (it may differ, but not for this file, which is the ideal (fastest) case). That means roughly 60k inserts plus 20k selects in total. More than 100 minutes still seems very long (yes, hardware counts, and this is a plain PC with a 7200 rpm drive, no SSD or RAID). However, this is supposed to be an improved version of a previous application (plain JDBC), in which the same insert on the same hardware took about 15 minutes. Given that in both cases about 4-5 minutes are spent on "pre-processing", the increase is massive.
Any tips on what could be improved? Is there some kind of batch insert feature?
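For context on that last question: Hibernate does expose JDBC batching, but it is disabled by default. A minimal sketch of the relevant properties follows; the values are illustrative, not recommendations, and they would go wherever the application already configures Hibernate (persistence.xml, a properties file, etc.). Note that batching can apply with sequence-generated ids as described above, whereas IDENTITY ids cause Hibernate to disable it.

```java
import java.util.Properties;

// Illustrative Hibernate settings for JDBC batching, which is off by default.
public final class BatchingSettings {

    public static Properties hibernateBatching() {
        Properties props = new Properties();
        props.put("hibernate.jdbc.batch_size", "50"); // send inserts in JDBC batches
        props.put("hibernate.order_inserts", "true"); // group inserts by entity type
        props.put("hibernate.order_updates", "true");
        return props;
    }
}
```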