Cassandra CSV import error: batch too large

I am trying to import data from a CSV file into Cassandra 3.2.1 using the COPY command. The file contains only 299 lines with 14 columns, but I get the following error message:

Failed to import 299 lines: InvalidRequest - code = 2200 [Invalid query] message = "Batch too large"

I used the following COPY command and tried to increase the batch size:

COPY table (Col1, Col2, ...) FROM 'file.csv' WITH DELIMITER = ';' AND HEADER = TRUE AND MAXBATCHSIZE = 5000;

I would think 299 lines is not too much to import into Cassandra, or am I mistaken?

2 answers

The error you are encountering is a server-side error indicating that the size (in bytes) of your batch insert is too large.

This batch size limit is defined in the cassandra.yaml file:

# Log WARN on any batch size exceeding this value. 5kb per batch by default.
# Caution should be taken on increasing the size of this threshold as it can lead to node instability.
batch_size_warn_threshold_in_kb: 5

# Fail any batch exceeding this value. 50kb (10x warn threshold) by default.
batch_size_fail_threshold_in_kb: 50

If you are inserting many columns that are large in size, you can quickly reach this threshold. Try reducing MAXBATCHSIZE to 200, as sketched below.
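As an illustration only (reusing the placeholder table and column names from the question, not a tuned recommendation), the adjusted command might look like this:

COPY table (Col1, Col2, ...) FROM 'file.csv' WITH DELIMITER = ';' AND HEADER = TRUE AND MAXBATCHSIZE = 200;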

More information on the available COPY options can be found in the cqlsh COPY documentation.


Adding the CHUNKSIZE option solved the problem for me.

For example: COPY event_stats_user FROM '/home/kiren/dumps/event_stats_user.csv' WITH CHUNKSIZE = 1;
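If a single option is not enough, CHUNKSIZE can also be combined with MAXBATCHSIZE and the other options from the question. A minimal sketch, again using the asker's placeholder names and purely illustrative values:

COPY table (Col1, Col2, ...) FROM 'file.csv' WITH DELIMITER = ';' AND HEADER = TRUE AND MAXBATCHSIZE = 200 AND CHUNKSIZE = 100;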

