How to efficiently perform a batch update in Hibernate

I read many articles and found several ways to do batch processing.

One of them is to use flush and clear, as in the following code:

long t1 = System.currentTimeMillis();
Session session = getSession();
Transaction transaction = session.beginTransaction();
try {
    Query query = session.createQuery("FROM PersonEntity WHERE id > " + lastMaxId + " ORDER BY id");
    query.setMaxResults(1000);
    rows = query.list();
    int count = 0;
    if (rows == null || rows.size() == 0) {
        return;
    }
    LOGGER.info("fetched {} rows from db", rows.size());
    for (Object row : rows) {
        PersonEntity personEntity = (PersonEntity) row;
        personEntity.setName(randomAlphaNumeric(30));
        lastMaxId = personEntity.getId();
        session.saveOrUpdate(personEntity);
        if (++count % 50 == 0) {
            session.flush();
            session.clear();
            LOGGER.info("Flushed and Cleared");
        }
    }
} finally {
    if (session != null && session.isOpen()) {
        LOGGER.info("Closing Session and commiting transaction");
        transaction.commit();
        session.close();
    }
}
long t2 = System.currentTimeMillis();
LOGGER.info("time taken {}s", (t2 - t1) / 1000);

In the above code, we process records in batches of 1000 and update each batch in a single transaction.

This is fine when the batch update is the only thing we need to do.

But I have the following question regarding it:

  • There may be a case where some other thread (T2) accesses the same set of rows for an update at runtime. In that case, until the batch of 1000 is committed, T2 remains stuck.

So how should we handle this?

Possible thoughts / solution:

  • I think we can do the updates in different sessions with a smaller batch size, for example 50.
  • Use a separate stateless session to update and commit rows one by one, closing the session once a batch of 1000 is completed.
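
A minimal sketch of the first idea, assuming the 1000 fetched rows are split into chunks of 50 and each chunk is handled in its own short unit of work. `ChunkedBatch` and `processInChunks` are hypothetical names, not part of Hibernate. The session and transaction handling is left to the callback and only described in comments, so the sketch shows the chunking itself and nothing more.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hibernate: splits one large batch into
// small units of work so that each unit can run in its own short transaction.
final class ChunkedBatch {

    private ChunkedBatch() {}

    /**
     * Calls unitOfWork once per chunk of at most chunkSize rows.
     * The caller is expected to open a session, begin a transaction,
     * update the chunk, commit and close inside unitOfWork, so row locks
     * are held only for the duration of one chunk.
     * Returns the number of chunks processed.
     */
    static <T> int processInChunks(List<T> rows, int chunkSize, Consumer<List<T>> unitOfWork) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < rows.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, rows.size());
            // Copy the slice so the unit of work does not depend on the source list.
            unitOfWork.accept(new ArrayList<>(rows.subList(from, to)));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 120; i++) {
            ids.add(i);
        }
        // In real code the lambda body would be: open session, begin transaction,
        // update each entity in the chunk, commit, close.
        int chunks = processInChunks(ids, 50,
                chunk -> System.out.println("would commit " + chunk.size() + " rows"));
        System.out.println(chunks + " transactions");
    }
}
```

With this shape, T2 would wait at most for one chunk of 50 to commit instead of the whole batch of 1000. The cost is that the 1000 rows are no longer updated atomically.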

Please help me get a better solution.

java postgresql jdbc hibernate batch-processing
1 answer

If I understand correctly, you are describing this:

  • a batch update runs inside a transaction

  • meanwhile, another thread starts updating one of the records that is also in the batch

  • because of this, the batch will wait for the update in step 2 to complete, which makes the remaining entries in the batch wait as well

So far, everything looks fine. However, the important point here is that the transaction was introduced in order to make the update of a large set of records "faster". Transactions are typically used to ensure consistency / atomicity. So how should this be designed: a quick update of many records in one pass, where atomicity is not the main criterion, while an update to one of the records in the batch may also be requested by another thread?
