I have read many articles and found several ways to do batch processing.
One of them uses flushing and clearing (`session.flush()` and `session.clear()`), as in the following code:
```java
long t1 = System.currentTimeMillis();
Session session = getSession();
Transaction transaction = session.beginTransaction();
try {
    Query query = session.createQuery("FROM PersonEntity WHERE id > " + lastMaxId + " ORDER BY id");
    query.setMaxResults(1000);
    List<?> rows = query.list();
    int count = 0;
    if (rows == null || rows.size() == 0) {
        return;
    }
    LOGGER.info("fetched {} rows from db", rows.size());
    for (Object row : rows) {
        PersonEntity personEntity = (PersonEntity) row;
        personEntity.setName(randomAlphaNumeric(30));
        lastMaxId = personEntity.getId();
        session.saveOrUpdate(personEntity);
        if (++count % 50 == 0) {
            session.flush();
            session.clear();
            LOGGER.info("Flushed and Cleared");
        }
    }
} finally {
    if (session != null && session.isOpen()) {
        LOGGER.info("Closing Session and committing transaction");
        transaction.commit();
        session.close();
    }
}
long t2 = System.currentTimeMillis();
LOGGER.info("time taken {}s", (t2 - t1) / 1000);
```
In the above code, we process records in batches of 1000 and update them in a single transaction.
This is fine when all we need is a batch update.
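One thing worth checking alongside the flush/clear pattern: unless Hibernate's JDBC batching is enabled, each UPDATE is still sent to PostgreSQL as an individual statement. Below is a minimal sketch of the relevant settings, assuming the `SessionFactory` is built from a plain `Configuration`; the property names are standard Hibernate ones, while the class and method here are illustrative.

```java
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class HibernateBatchConfig {

    // Illustrative helper: builds a SessionFactory with JDBC batching enabled.
    static SessionFactory buildSessionFactory() {
        Configuration configuration = new Configuration().configure(); // reads hibernate.cfg.xml
        // Group up to 50 statements into one JDBC batch; matching the
        // flush interval used in the loop above is a common choice.
        configuration.setProperty("hibernate.jdbc.batch_size", "50");
        // Ordering updates by entity and id helps the driver fill batches.
        configuration.setProperty("hibernate.order_updates", "true");
        return configuration.buildSessionFactory();
    }
}
```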
But I have the following concern about it:
- There may be a case where some other thread (T2) accesses the same set of rows for update operations at runtime; but since the batch of up to 1000 rows is not committed until the very end, T2 remains stuck waiting on those row locks.
So how should we handle this?
Possible thoughts / solutions:
- I think we can do the updates in separate sessions with a smaller batch size, for example 50 (see the sketch after this list).
- Use a Hibernate StatelessSession to update and commit rows one by one, but close the session once a batch of 1000 is completed.
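Here is a minimal sketch of the first idea, assuming a `sessionFactory` plus the same `PersonEntity` and `randomAlphaNumeric(int)` helpers from the code above (I also switched the id filter to a bind parameter). Each chunk of 50 rows gets its own short transaction, so the row locks are released at every commit and a concurrent thread T2 waits for at most one small chunk instead of 1000 rows:

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class ChunkedUpdater {

    static void updateInChunks(SessionFactory sessionFactory) {
        long lastMaxId = 0L;
        while (true) {
            Session session = sessionFactory.openSession();
            Transaction transaction = session.beginTransaction();
            try {
                List<?> rows = session
                        .createQuery("FROM PersonEntity WHERE id > :lastMaxId ORDER BY id")
                        .setParameter("lastMaxId", lastMaxId)
                        .setMaxResults(50) // small chunk = short-lived row locks
                        .list();
                if (rows.isEmpty()) {
                    transaction.commit();
                    return; // nothing left to process
                }
                for (Object row : rows) {
                    PersonEntity personEntity = (PersonEntity) row;
                    personEntity.setName(randomAlphaNumeric(30));
                    lastMaxId = personEntity.getId();
                    session.saveOrUpdate(personEntity);
                }
                // Commit per chunk: locks on these 50 rows are released here.
                transaction.commit();
            } catch (RuntimeException e) {
                transaction.rollback();
                throw e;
            } finally {
                session.close();
            }
        }
    }
}
```

The second idea maps to Hibernate's `StatelessSession`: it bypasses the first-level cache, so there is nothing to `flush()` or `clear()`, and each `update(entity)` goes straight to JDBC; the commit-frequency trade-off for other threads stays the same, though.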
Please help me find a better solution.
java postgresql jdbc hibernate batch-processing
Sahil aggarwal