Paging large result sets in Cassandra with CQL3 with varchar keys

I am working on updating old Thrift-based code to CQL3.

One part of the code goes through the entire data set of a table consisting of 20M+ rows. This part originally ran into problems due to memory usage, so I created a RowIterator class that iterated through the column family using TokenRanges (and Hector).

When trying to rewrite this using CQL3, I had problems paging through the data. I found some information at http://www.datastax.com/documentation/cql/3.0/cql/cql_using/paging_c.html , but when I try this code for the first "page":

resultSet = session.execute("select * from " + TABLE + " where token(key) <= token(" + offset + ")"); 

I get an error

com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Integer provided

Granted, the example in the link uses numeric keys. Is there a way to do this using varchar keys (UTF8Type)?

There seems to be built-in functionality for this ( https://issues.apache.org/jira/browse/CASSANDRA-4415 ), but I cannot find any examples to get me going. In addition, I have to solve this for Cassandra 1.2.9.

java cassandra cql3
1 answer

The easy answer is to upgrade to Cassandra 2.0.x and use the new built-in paging feature. But if you have to do this on Cassandra 1.2, you are on the right track. Your syntax should work; if you run the query you are trying to execute in cqlsh, do you get the same error? With paging like this, it is best to use ">", as in the linked example; your "<=" may be part of the problem. You want to start with select * from table limit 100 , then continue with select * from table where token(key) > token('last key') limit 100 , where 'last key' is the key of the last row on the previous page.
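The loop described above can be sketched without any driver code. This is a minimal illustration, not a definitive implementation: PageFetcher, visitAll and inMemory are hypothetical names introduced here, and the in-memory fetcher only stands in for the two queries so the sketch runs without a cluster. It also assumes one row per partition key; with clustering columns, a limit could cut a partition in half.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenPagingSketch {

    /** Hypothetical abstraction over "run one page query"; not part of any driver. */
    public interface PageFetcher {
        // lastKey == null: select * from table limit <limit>
        // otherwise:       select * from table where token(key) > token('<lastKey>') limit <limit>
        List<String> fetch(String lastKey, int limit);
    }

    /** Walks the table page by page and returns how many keys were visited. */
    public static long visitAll(PageFetcher fetcher, int limit) {
        long visited = 0;
        String lastKey = null; // null means "first page"
        while (true) {
            List<String> page = fetcher.fetch(lastKey, limit);
            visited += page.size();
            if (page.size() < limit) {
                return visited; // a short (or empty) page means nothing is left
            }
            // Remember the last key so the next query starts strictly after it.
            lastKey = page.get(page.size() - 1);
        }
    }

    /** In-memory stand-in for a table whose keys are already listed in token order. */
    public static PageFetcher inMemory(final List<String> keysInTokenOrder) {
        return new PageFetcher() {
            @Override
            public List<String> fetch(String lastKey, int limit) {
                int start = (lastKey == null) ? 0 : keysInTokenOrder.indexOf(lastKey) + 1;
                int end = Math.min(start + limit, keysInTokenOrder.size());
                return new ArrayList<String>(keysInTokenOrder.subList(start, end));
            }
        };
    }
}
```

Only one page of keys is held at a time, which is the point of paging a 20M+ row table this way.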

I would also try it with a prepared statement. The string concatenation may be doing something odd to offset.
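To make the difference concrete, here is a small sketch of the query text each approach produces. The class and method names are hypothetical. The driver calls are left out so it runs on its own; with the DataStax Java driver, the parameterized text would be passed to session.prepare(...) and the last key bound as a java.lang.String.

```java
public class PageQuerySketch {

    /** Plain concatenation, as in the question: the offset lands in the query unquoted. */
    public static String concatenated(String table, Object offset) {
        return "select * from " + table
                + " where token(key) > token(" + offset + ") limit 100";
    }

    /** Concatenation with a quoted CQL string literal; single quotes are doubled. */
    public static String quoted(String table, String lastKey) {
        String escaped = lastKey.replace("'", "''");
        return "select * from " + table
                + " where token(key) > token('" + escaped + "') limit 100";
    }

    /** Text for a prepared statement; the key is bound later instead of being spliced in. */
    public static String parameterized(String table) {
        return "select * from " + table
                + " where token(key) > token(?) limit 100";
    }
}
```

With an offset of 0, the concatenated form contains token(0), an integer literal rather than a varchar value. The parameterized form avoids quoting and escaping altogether.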
