Lost data during high-frequency inserts in Cassandra using the DataStax Java driver 2.1.7

I am new to Apache Cassandra and plan to use it as the data store for a new project because of its write performance. I set up a Cassandra cluster with three nodes and a replication factor of 3. My program A uses DataStax cassandra-driver-core 2.1.7 to write data to and read data from Cassandra. Each run of A writes about 50 entries to Cassandra using a batch statement. A single run works without problems. However, when I run A more intensively, a problem arises.

Details: another program, B, calls A 40 times within 10 seconds, so there should be 2,000 records in Cassandra after B completes. However, only 25-30% of the 2,000 records were actually stored (the exact share varies randomly between runs of B). I used cqlsh to count the stored records. I have to re-run B several times before all 2,000 entries end up in Cassandra.

I have no clue what is going on: no error was reported while executing either A or B, and A's log shows that it was executed 40 times.

I don't know whether this is related to the cluster setup, the consistency level, etc., or whether there is some setting I need to change to handle a higher write frequency.

The code looks something like this:

    String query = "insert into A (a,b,c,d,e,f) values (?,?,?,?,?,?)";
    PreparedStatement p = session.prepare(query);
    BatchStatement b = new BatchStatement();
    for (int i=0; i<50; i++) {
        BoundStatement b1 = p.bind();
        b1.setInt("a",A);
        ...
        b1.setInt("f",F);
        b.add(b1);
    }
    session.execute(b);

Any help would be greatly appreciated!

Addition:

I changed my code so that it no longer uses a batch statement, as @aaron and others suggested. The problem remains: not all records are written to Cassandra (that is, I cannot see them with a select statement in cqlsh). After some time, I noticed that the problem only affects records that had been inserted before (and then deleted with the cqlsh delete command before being inserted again). For records that had never been inserted before, cqlsh "select * from" shows the correct results. Can someone explain why this is so, and whether there is a way to avoid it? Many thanks.
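
One possible explanation for the delete-then-reinsert symptom, which nothing above confirms, is Cassandra's timestamp-based conflict resolution: a delete is stored as a tombstone carrying a write timestamp, and a later insert whose timestamp is lower than the tombstone's stays hidden. This could happen if the clocks that stamp the writes are not synchronized. The sketch below is a toy in-memory model of that rule, not the driver's or Cassandra's actual code; the class `LastWriteWins`, its methods, and the timestamps are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of last-write-wins reconciliation (hypothetical, not Cassandra code). */
public class LastWriteWins {

    /** One stored version of a row: a null value stands for a tombstone. */
    private static final class Version {
        final String value;
        final long timestampMicros;

        Version(String value, long timestampMicros) {
            this.value = value;
            this.timestampMicros = timestampMicros;
        }
    }

    private final Map<Integer, Version> rows = new HashMap<>();

    /** Keeps the incoming version only if it is strictly newer than the stored one. */
    private void apply(int key, Version incoming) {
        Version current = rows.get(key);
        // Tie-breaking on equal timestamps is deliberately left out of this sketch.
        if (current == null || incoming.timestampMicros > current.timestampMicros) {
            rows.put(key, incoming);
        }
    }

    public void insert(int key, String value, long timestampMicros) {
        apply(key, new Version(value, timestampMicros));
    }

    public void delete(int key, long timestampMicros) {
        apply(key, new Version(null, timestampMicros));
    }

    /** Returns the visible value, or null if the row is absent or shadowed by a tombstone. */
    public String select(int key) {
        Version v = rows.get(key);
        return v == null ? null : v.value;
    }

    public static void main(String[] args) {
        LastWriteWins table = new LastWriteWins();
        table.insert(1, "first", 1000L);
        table.delete(1, 5000L);           // delete stamped by a clock running ahead
        table.insert(1, "second", 3000L); // re-insert stamped by a clock running behind
        System.out.println(table.select(1)); // prints "null": the tombstone still wins

        table.insert(2, "fresh", 3000L);     // never deleted, so nothing shadows it
        System.out.println(table.select(2)); // prints "fresh"
    }
}
```

If this is what is happening, comparing the clocks of the three nodes and of the machine running cqlsh would be a reasonable first check; running `select writetime(f) from A` for a row that is still visible shows the timestamp Cassandra stored for it.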
