How to get the top 5 entries in Cassandra 2.2

I need help. I need a query that returns the top 5 records grouped by date (the date only, not date + time), together with the sum of the amount for each date.

I wrote the following, but it returns all the records, not only the top 5:

    CREATE OR REPLACE FUNCTION state_groupbyandsum(
        state map<text, double>, datetime text, amount text )
    CALLED ON NULL INPUT
    RETURNS map<text, double>
    LANGUAGE java AS '
        String date = datetime.substring(0,10);
        Double count = (Double) state.get(date);
        if (count == null)
            count = Double.parseDouble(amount);
        else
            count = count + Double.parseDouble(amount);
        state.put(date, count);
        return state;
    ';

    CREATE OR REPLACE AGGREGATE groupbyandsum(text, text)
        SFUNC state_groupbyandsum
        STYPE map<text, double>
        INITCOND {};

    select groupbyandsum(datetime, amout) from warehouse;

Could you help me limit the result to the top 5 entries?

1 answer

Here is one way to do it. Your grouping state function could look like this:

    CREATE FUNCTION state_group_and_total(
        state map<text, double>, type text, amount double )
    CALLED ON NULL INPUT
    RETURNS map<text, double>
    LANGUAGE java AS '
        Double count = (Double) state.get(type);
        if (count == null)
            count = amount;
        else
            count = count + amount;
        state.put(type, count);
        return state;
    ';

This builds a map holding the summed amount for every group among the rows selected by your WHERE clause. The tricky part is keeping only the top N. One way to do this is to use a FINALFUNC, which runs once after all the rows have been added to the map. The function below uses a loop to find the maximum value in the map and move it to a result map. To find the top N, it iterates over the map N times (there are more efficient algorithms than this, but this is just a quick and dirty example).

Here is an example that finds the top two:

    CREATE FUNCTION topFinal (state map<text, double>)
    CALLED ON NULL INPUT
    RETURNS map<text, double>
    LANGUAGE java AS '
        java.util.Map<String, Double> inMap = new java.util.HashMap<String, Double>(),
                                      outMap = new java.util.HashMap<String, Double>();
        // Work on a copy so the aggregate state itself is not modified.
        inMap.putAll(state);
        int topN = 2;
        for (int i = 1; i <= topN; i++) {
            // Starting at -1 assumes every total is greater than -1 (e.g. non-negative amounts).
            double maxVal = -1;
            String moveKey = null;
            for (java.util.Map.Entry<String, Double> entry : inMap.entrySet()) {
                if (entry.getValue() > maxVal) {
                    maxVal = entry.getValue();
                    moveKey = entry.getKey();
                }
            }
            // Move the current maximum from the working copy to the result.
            if (moveKey != null) {
                outMap.put(moveKey, maxVal);
                inMap.remove(moveKey);
            }
        }
        return outMap;
    ';
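As an aside on the "more efficient algorithms" remark: one common alternative to scanning the map N times is a single pass with a bounded min-heap. The following is only a standalone Java sketch of that idea (Java 8+), not something I have run as a UDF body in Cassandra 2.2; the class name `TopNSketch` and the method `topN` are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class TopNSketch {
    // Returns the n entries with the largest values.
    // The result is a HashMap, so its iteration order is unspecified.
    public static Map<String, Double> topN(Map<String, Double> totals, int n) {
        Map<String, Double> out = new HashMap<>();
        if (n <= 0) {
            return out;
        }
        // Min-heap ordered by value: the head is the smallest entry kept so far.
        PriorityQueue<Map.Entry<String, Double>> heap =
            new PriorityQueue<>(n, (a, b) -> Double.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Double> e : totals.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so at most n entries remain
            }
        }
        for (Map.Entry<String, Double> e : heap) {
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> totals = new HashMap<>();
        totals.put("2015", 99.1);
        totals.put("2016", 18.12);
        totals.put("2017", 44.889);
        // Keeps the entries for 2015 and 2017; print order is not guaranteed.
        System.out.println(topN(totals, 2));
    }
}
```

Unlike the loop above, this does not rely on a -1 starting value, so negative totals are handled as well. For a handful of groups the difference in speed is unlikely to matter.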

Finally, you need to define the AGGREGATE that calls the two functions you defined:

    CREATE OR REPLACE AGGREGATE group_and_total(text, double)
        SFUNC state_group_and_total
        STYPE map<text, double>
        FINALFUNC topFinal
        INITCOND {};

So let's see if this works.

    CREATE table test (partition int, clustering text, amount double, PRIMARY KEY (partition, clustering));

    INSERT INTO test (partition , clustering, amount) VALUES ( 1, '2015', 99.1);
    INSERT INTO test (partition , clustering, amount) VALUES ( 1, '2016', 18.12);
    INSERT INTO test (partition , clustering, amount) VALUES ( 1, '2017', 44.889);

    SELECT * from test;

     partition | clustering | amount
    -----------+------------+--------
             1 |       2015 |   99.1
             1 |       2016 |  18.12
             1 |       2017 | 44.889

Now the drum roll...

    SELECT group_and_total(clustering, amount) from test where partition=1;

     agg.group_and_total(clustering, amount)
    -------------------------------------------
              {'2015': 99.1, '2017': 44.889}

So you can see that it keeps the top 2 groups based on the sum.

Please note that the keys will not come back in sorted order, since this is a map, and I don't think we can control the order of the keys in the map, so sorting in the FINALFUNC would be a waste of resources. If you need the map sorted, you can do that on the client.
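For the client-side sorting, here is a minimal Java sketch (Java 8+). It assumes you have already read the aggregate result into a `Map<String, Double>` through whatever driver you use; the class name `SortTotalsSketch` and the method `sortByValueDesc` are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SortTotalsSketch {
    // Copies the entries into a LinkedHashMap ordered by value, largest first.
    public static LinkedHashMap<String, Double> sortByValueDesc(Map<String, Double> totals) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(totals.entrySet());
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        LinkedHashMap<String, Double> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by the aggregate.
        Map<String, Double> result = new HashMap<>();
        result.put("2017", 44.889);
        result.put("2015", 99.1);
        System.out.println(sortByValueDesc(result)); // prints {2015=99.1, 2017=44.889}
    }
}
```

A `LinkedHashMap` keeps insertion order, which is why the sorted order survives once the entries are copied into it.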

I think you could do more work in the state_group_and_total function to remove items from the map as you go. That might be better, to keep the map from growing too large.
