Getting data in Cassandra based on statistics

Question

I am testing Cassandra (2.0) as a possible replacement for storing time series data.

I made a simple table and dumped some of our data:

CREATE TABLE DataRaw(
  channelId int,
  sampleTime timestamp,
  value double,
  PRIMARY KEY (channelId, sampleTime)
) WITH CLUSTERING ORDER BY (sampleTime ASC);

I can easily run our most frequent queries, such as the first value, the last (current) value, and statistics via max, min, count, avg, etc.

But I need not only the maximum value in a range, but also the time at which that value occurred.

For this data:

sampleTime          value
2015-08-28 00:00    10
2015-08-28 01:00    15
2015-08-28 02:00    13

I want the query to return 2015-08-28 01:00 and 15.

I tried something like this:

select sampletime, value from dataraw where channelid=865 and sampletime >= '2014-01-01 00:00' and sampleTime < '2014-01-02 00:00' and value = (select max(value) from dataraw where channelid=865 and sampletime >= '2014-01-01 00:00' and sampleTime < '2014-01-02 00:00');

which would work in "normal" SQL, but Cassandra doesn't like it.

Any ideas?

cassandra time-series cql3 cassandra-2.0

Paaland Aug 28 '15 at 8:46

Jim Meyer · Answer 1 · 2015-08-28T10:31:28+0000

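I assume you are actually on Cassandra 2.2. Aggregates such as max are not available in 2.0; they were added in 2.2.

In 2.2, the query behaves like this: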

cqlsh:test> SELECT  * from dataraw ;

 channelid | sampletime               | value
-----------+--------------------------+-------
         1 | 2015-08-28 06:20:38-0400 |    10
         1 | 2015-08-28 06:20:49-0400 |    15
         1 | 2015-08-28 06:20:57-0400 |    13

cqlsh:test> SELECT sampletime, max(value) FROM dataraw 
            WHERE channelid=1 AND sampletime >= '2015-08-28 06:20:38-0400' 
                  AND sampletime <= '2015-08-28 06:20:57-0400';

 sampletime               | system.max(value)
--------------------------+-------------------
 2015-08-28 06:20:38-0400 |                15

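As you can see, CQL (the Cassandra query language) is not SQL, so it does not support subqueries. The query returns the correct maximum, but the sampletime comes from the first row in the range, not from the row that holds the maximum.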
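To make the cqlsh result above easier to read, here is a minimal Python sketch that emulates what the query appears to do with the three rows shown: the aggregate scans every row, while the plain `sampletime` column is taken from the first row of the range. The function name `select_sampletime_and_max` is hypothetical, and the time zone offsets are dropped for brevity.

```python
from datetime import datetime

# The three rows from the cqlsh output, in clustering order (sampletime ASC).
rows = [
    (datetime(2015, 8, 28, 6, 20, 38), 10.0),
    (datetime(2015, 8, 28, 6, 20, 49), 15.0),
    (datetime(2015, 8, 28, 6, 20, 57), 13.0),
]


def select_sampletime_and_max(rows):
    """Emulate `SELECT sampletime, max(value)` as observed above.

    The aggregate is computed over all rows, but the non-aggregated
    column is simply read from the first row of the range.
    """
    first_time = rows[0][0]
    max_value = max(value for _, value in rows)
    return first_time, max_value


mixed = select_sampletime_and_max(rows)
```

The returned pair combines the first row's time with the overall maximum, so the two fields do not describe the same sample.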

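So your options are:

Read the rows with CQL and find the maximum and its time on the client side.
Write a UDF (user-defined function) to compute it.
Pair Cassandra with Apache Spark, which supports more complex queries than CQL.
Wait for Cassandra 3.0, which adds features that may help here. Cassandra 3.0 is currently in beta.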
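The client-side option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `max_with_time` is hypothetical, the samples are the ones from the question, and ties are resolved in favour of the earliest sample, which is a choice made here rather than something the answer specifies.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple

Sample = Tuple[datetime, float]


def max_with_time(samples: Iterable[Sample]) -> Optional[Sample]:
    """Return (sampletime, value) for the largest value, or None if empty.

    Runs in a single pass, so it also works on a lazily paged result set.
    Only a strictly larger value replaces the current best, so on ties
    the earliest sample wins (assumption made for this sketch).
    """
    best: Optional[Sample] = None
    for sample_time, value in samples:
        if best is None or value > best[1]:
            best = (sample_time, value)
    return best


# Data from the question; in practice these would come from a range query
# on DataRaw for one channelId.
samples = [
    (datetime(2015, 8, 28, 0, 0), 10.0),
    (datetime(2015, 8, 28, 1, 0), 15.0),
    (datetime(2015, 8, 28, 2, 0), 13.0),
]
result = max_with_time(samples)
```

With a Cassandra client driver, the rows of the range query (`SELECT sampletime, value FROM dataraw WHERE channelid = ... AND sampletime >= ... AND sampletime < ...`) could be fed to the same function as `(row.sampletime, row.value)` pairs; the exact row access depends on the driver in use.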

Cassandra 2.2 added the CQL aggregate functions min, max, avg and sum, so they are not available in 2.0. Again, CQL is not SQL, so not every SQL construct works.

Sergei Rodionov · Answer 2 · 2015-09-01T16:26:48+0000

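Axibase Time-Series Database supports the MIN_VALUE_TIME and MAX_VALUE_TIME aggregators.

MIN_VALUE_TIME returns the time of the MIN value.
MAX_VALUE_TIME returns the time of the MAX value.

In the API, you can request MAX and MAX_VALUE_TIME in the same query.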

Note, however, that ATSD uses HBase as its storage backend.

Disclosure: I work for Axibase.

1: , . , MIN MAX . , .

2: SQL

SELECT entity, 
  MAX(value), 
  date_format(MAX_VALUE_TIME(value), 'yyyy-MM-dd HH:mm:ss') AS "Max Value Time" 
  FROM cpu_busy 
WHERE time > current_hour GROUP BY entity
