I want to insert a single row with 50,000 columns into Cassandra 1.2.8. Before inserting, I have all the data for the entire row ready in memory:
+---------+------+------+------+------+-------+
|         |   0  |   1  |   2  | ...  | 49999 |
| row_id  +------+------+------+------+-------+
|         | text | text | text | ...  | text  |
+---------+------+------+------+------+-------+
The column names are integers (which allows slicing for pagination), and the column values are the text at that particular index.
CQL3 table definition:
create table results (
    row_id text,
    index int,
    value text,
    primary key (row_id, index)
) with compact storage;
Since I already have the row_id and all 50,000 name/value pairs in memory, I just want to insert this single row into Cassandra in one request/operation so that it is as fast as possible.
The only way I can find to do this is to execute the following 50,000 times:
INSERT INTO results (row_id, index, value) values (my_row_id, ?, ?);
where the first ? is the index counter (i) and the second ? is the text value to store at index i.
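For reference, a minimal sketch of that loop with the DataStax Java Driver (the contact point, keyspace name, and dummy values array are placeholders of mine):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class WideRowInsert {
    public static void main(String[] args) {
        // "127.0.0.1" and the keyspace name are placeholders.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace");

        PreparedStatement ps = session.prepare(
            "INSERT INTO results (row_id, index, value) VALUES (?, ?, ?)");

        // Stand-in for the 50,000 values I already have in memory.
        String[] values = new String[50000];
        for (int i = 0; i < values.length; i++) {
            values[i] = "text" + i;
        }

        for (int i = 0; i < values.length; i++) {
            // One statement (and one round trip) per column.
            session.execute(ps.bind("my_row_id", i, values[i]));
        }
        cluster.shutdown();
    }
}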
This takes far too long. Even when we group the above INSERTs into a batch, it still takes too long.
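The batched form I tried is the standard CQL3 batch, along these lines (the literal values are placeholders; unlogged is appropriate here since every insert targets the same partition):

BEGIN UNLOGGED BATCH
    INSERT INTO results (row_id, index, value) VALUES ('my_row_id', 0, 'text0');
    INSERT INTO results (row_id, index, value) VALUES ('my_row_id', 1, 'text1');
    -- ... 49,998 more inserts ...
APPLY BATCH;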
Since we have all the data we need (the complete row) in hand, I would assume it would be very easy to say: "here, Cassandra, store this data as a single row in one request", for example:
//EXAMPLE-BUT-INVALID CQL3 SYNTAX:
insert into results (row_id, (index, value)) values
    ((0, text0), (1, text1), (2, text2), ..., (N, textN));
This is not possible with the current CQL3 syntax, but I hope it illustrates the desired effect: the whole row inserted with a single query.
Is this possible in CQL3 and the DataStax Java Driver? If not, am I instead forced to use Hector or Astyanax and the Thrift batch_mutate operation?
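In case it helps frame the question, here is roughly what I believe the Astyanax equivalent would look like, writing the whole row as one mutation batch (a sketch under my assumptions; the keyspace object, row id, and values are placeholders, and I have not benchmarked this):

import com.netflix.astyanax.ColumnListMutation;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.serializers.IntegerSerializer;
import com.netflix.astyanax.serializers.StringSerializer;

public class AstyanaxWideRowInsert {
    // The compact-storage table maps to a classic wide row: String row key,
    // integer column names, text column values.
    static final ColumnFamily<String, Integer> CF_RESULTS =
        new ColumnFamily<String, Integer>(
            "results", StringSerializer.get(), IntegerSerializer.get());

    static void insertRow(Keyspace keyspace, String rowId, String[] values)
            throws ConnectionException {
        MutationBatch m = keyspace.prepareMutationBatch();
        ColumnListMutation<Integer> row = m.withRow(CF_RESULTS, rowId);
        for (int i = 0; i < values.length; i++) {
            row.putColumn(i, values[i], null);  // null = no TTL
        }
        m.execute();  // the whole row goes out in one batch_mutate round trip
    }
}

My understanding is that the entire MutationBatch is sent as a single Thrift round trip, which is the behavior I am after.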