I am considering a column family design in Cassandra with very long rows (from hundreds of thousands to millions of columns per row).
Using completely dummy data, I inserted 2 million columns into one row (evenly spaced). When I do a slice operation to fetch 20 columns, I notice a huge degradation in performance the further into the row the slice starts.
For most columns I seem to be able to serve slice results in 10-40 ms, but as you get closer to the end of the row, performance falls off a cliff: response times climb from 43 ms at the 1,800,000 mark, to 214 ms at 1,900,000, to 435 ms at 1,999,900! (All slices have the same width.)
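To make the shape of this benchmark concrete, here is a minimal, self-contained simulation of it: a sorted in-memory map stands in for the row, and a helper pulls a fixed-width slice starting at a given column name. The class and method names (SliceSim, sliceFrom) and the zero-padded column naming are illustrative assumptions, not the actual Hector calls used in the test; the map is scaled down from 2 million columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SliceSim {
    // Build a "row" of n columns with zero-padded names so they sort in
    // lexicographic order, like comparator-ordered column names in Cassandra.
    static TreeMap<String, String> buildRow(int n) {
        TreeMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            row.put(String.format("col%09d", i), "value" + i);
        }
        return row;
    }

    // Return up to `count` column names starting at `start` (inclusive):
    // the in-memory analogue of a forward slice query of fixed width.
    static List<String> sliceFrom(TreeMap<String, String> row, String start, int count) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : row.tailMap(start, true).entrySet()) {
            if (out.size() == count) break;
            out.add(e.getKey());
        }
        return out;
    }

    public static void main(String[] args) {
        // Reduced-scale stand-in for the 2,000,000-column row in the question.
        TreeMap<String, String> row = buildRow(200_000);
        // A 20-column slice near the start of the row...
        List<String> early = sliceFrom(row, String.format("col%09d", 0), 20);
        // ...and one near the end of the row.
        List<String> late = sliceFrom(row, String.format("col%09d", 199_900), 20);
        System.out.println(early.get(0) + " .. " + late.get(0));
    }
}
```

In this in-memory model the slice cost is independent of where in the row it starts, which is what makes the on-disk behaviour in the numbers above surprising.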
I find it difficult to explain why such a huge drop in performance occurs towards the end of the row. Can someone please give some pointers as to what Cassandra is doing internally to cause such a delay? Row caching is disabled, and pretty much everything is left at the Cassandra 1.0 defaults.
Cassandra is supposed to be able to support up to 2 billion columns per row, but at this rate of degradation it would be unusable for very long rows in practice.
Many thanks.
Caveat: I am hitting this with 10 queries in parallel at a time, so they are a bit slower than I would expect anyway, but it is a fair test across all the queries, and even running them all serially there is this strange degradation between the 1,800,000th and 1,900,000th record.
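For completeness, the 10-way parallel test above can be driven by a small harness like the following sketch. The class name ParallelTimer and the Callable placeholder standing in for a real slice query are illustrative assumptions; the point is only that each query's latency is measured individually while 10 run concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelTimer {
    // Run `tasks` with `parallelism` worker threads and return each task's
    // wall-clock latency in nanoseconds, mirroring the 10-way parallel test.
    static List<Long> timeAll(List<Callable<?>> tasks, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Callable<?> task : tasks) {
                futures.add(pool.submit(() -> {
                    long t0 = System.nanoTime();
                    task.call(); // stand-in for one slice query
                    return System.nanoTime() - t0;
                }));
            }
            List<Long> latencies = new ArrayList<>();
            for (Future<Long> f : futures) latencies.add(f.get());
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<?>> tasks = new ArrayList<>();
        // Placeholder "queries" that just sleep briefly.
        for (int i = 0; i < 10; i++) tasks.add(() -> { Thread.sleep(5); return null; });
        System.out.println(timeAll(tasks, 10));
    }
}
```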
I also noticed extremely poor performance when doing reversed slices fetching just one element, with only 200,000 columns per row: query.setRange (end, start, false, 1);
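If that call is Hector's SliceQuery, the arguments to setRange are (start, finish, reversed, count), so a one-element backward lookup asks for the first column at or before a given name. That lookup can be simulated on a sorted map as follows; ReverseSlice and firstBefore are illustrative names, not part of the original test code.

```java
import java.util.Map;
import java.util.TreeMap;

public class ReverseSlice {
    // Return the greatest column name at or before `end`, i.e. the result of
    // a one-element reversed slice, simulated on an in-memory sorted map.
    static String firstBefore(TreeMap<String, String> row, String end) {
        Map.Entry<String, String> e = row.floorEntry(end);
        return e == null ? null : e.getKey();
    }

    public static void main(String[] args) {
        // 200,000 columns, matching the row size mentioned in the question.
        TreeMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 200_000; i++) row.put(String.format("col%09d", i), "v");
        System.out.println(firstBefore(row, String.format("col%09d", 123_456))); // prints col000123456
    }
}
```

In memory this is a single tree lookup, so any large latency difference between forward and reversed one-element slices would again point at how the row is read from disk rather than at the lookup itself.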
cassandra
agentgonzo