This question is quite relevant, but it is 2 years old: In-memory OLAP engine in Java
Background
I would like to create a pivot-table-like matrix from a given in-memory dataset,
e.g. age by marital status (rows: age, columns: marital status).
Input: a list of people, each with an age and some Boolean property (e.g. married).
Desired result: the number of people by age (row) and isMarried (column).
What I tried (Scala)
    case class Person(val age: Int, val isMarried: Boolean)
    ...
    val people: List[Person] = ...
I managed to do it naively: first grouping by age, then a map that counts each group by marital status, then a foldRight to aggregate the running totals:
    // peopleByAge is the result of the group-by-age step described above
    TreeMap(peopleByAge.toSeq: _*).map(x => {
      val age = x._1
      val rows = x._2
      val numMarried = rows.count(_.isMarried)
      val numNotMarried = rows.length - numMarried
      (age, numMarried, numNotMarried)
    }).foldRight(List[FinalResult]())((row, list) => {
      // foldRight starts at the highest age, so the totals accumulate from the top down
      val cumMarried = row._2 + (if (list.isEmpty) 0 else list.last.cumMarried)
      val cumNotMarried = row._3 + (if (list.isEmpty) 0 else list.last.cumNotMarried)
      list :+ new FinalResult(row._1, row._2, row._3, cumMarried, cumNotMarried)
    }).reverse
I do not like the code above: it is not efficient, it is hard to read, and I am sure there is a better way.
Question(s)
How do I group by both fields, and how do I get a count for each subgroup? For example:
How many people are exactly 30 years old and married?
Another question is how I can compute a running total to answer a question like:
How many people over 30 are married?
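To make the two questions concrete, here is a minimal sketch of what grouping on both fields at once might look like. The sample data and the names PivotCounts, counts, married30 and marriedOver30 are made up for illustration; only Person comes from my code above.

```scala
object PivotCounts {
  case class Person(age: Int, isMarried: Boolean)

  // Hypothetical sample data, just for illustration.
  val people: List[Person] = List(
    Person(25, false), Person(30, true), Person(30, true),
    Person(30, false), Person(35, true), Person(40, false)
  )

  // Group on the (age, isMarried) pair, then count each subgroup.
  val counts: Map[(Int, Boolean), Int] =
    people
      .groupBy(p => (p.age, p.isMarried))
      .map { case (key, group) => key -> group.size }

  // How many people are exactly 30 years old and married?
  val married30: Int = counts.getOrElse((30, true), 0)

  // How many people over 30 are married?
  val marriedOver30: Int = people.count(p => p.age > 30 && p.isMarried)
}
```

With this sample data, married30 is 2 and marriedOver30 is 1. This answers each question separately, though, which is not quite the single report I am after.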
Edit:
Thank you for your great answers.
Just to clarify, I would like the output to be a "table" with the following columns:
- Age (ascending)
- Num married
- Num not married
- Running total married
- Running total not married
The goal is not only to answer these specific queries, but to produce a report that can answer all such questions.
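For reference, the table described above could be built roughly as follows. This is only a sketch under my own assumptions: ReportRow and report are hypothetical names (the fields mirror FinalResult from my code above), and the running totals here accumulate in ascending age order.

```scala
import scala.collection.immutable.TreeMap

object PivotReport {
  case class Person(age: Int, isMarried: Boolean)

  // Hypothetical row type; one row per distinct age.
  case class ReportRow(
    age: Int,
    numMarried: Int,
    numNotMarried: Int,
    cumMarried: Int,
    cumNotMarried: Int
  )

  def report(people: Seq[Person]): List[ReportRow] = {
    // Single pass: count (married, notMarried) per age.
    // TreeMap keeps the ages in ascending order.
    val byAge = people.foldLeft(TreeMap.empty[Int, (Int, Int)]) { (acc, p) =>
      val (m, n) = acc.getOrElse(p.age, (0, 0))
      acc.updated(p.age, if (p.isMarried) (m + 1, n) else (m, n + 1))
    }

    // scanLeft carries the running totals from row to row;
    // the all-zero seed row is dropped with tail.
    byAge.toList
      .scanLeft(ReportRow(0, 0, 0, 0, 0)) { case (prev, (age, (m, n))) =>
        ReportRow(age, m, n, prev.cumMarried + m, prev.cumNotMarried + n)
      }
      .tail
  }
}
```

With rows like these, "how many people over 30 are married?" becomes the last row's cumMarried minus the cumMarried of the row for age 30.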