This post is about a quick-and-dirty way of creating a histogram in MySQL for numeric values.
There are several other ways to create histograms that are better and more flexible, using CASE statements and other types of complex logic. This method wins me over time and time again, because it is so easy to modify for each use case, and so short and concise. Here's how you do it:
SELECT ROUND(numeric_value, -2)    AS bucket,
       COUNT(*)                    AS count,
       RPAD('', LN(COUNT(*)), '*') AS bar
FROM   my_table
GROUP  BY bucket;
Just change numeric_value to whatever your column is, change the rounding increment, and that's it. I put the bars on a logarithmic scale so that they do not grow too long when the counts are large.
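The second argument of ROUND only gives bucket widths that are powers of ten. As a rough sketch of how the same query could be adapted to another width, the variant below divides, rounds, and multiplies back. The width of 25 is purely an illustrative assumption, and my_table and numeric_value are the same placeholder names as in the query above.

```sql
-- Sketch: same histogram, but with an arbitrary bucket width.
-- 25 is an illustrative width, not a value from the original post.
SELECT ROUND(numeric_value / 25) * 25 AS bucket,  -- nearest multiple of 25
       COUNT(*)                       AS count,
       RPAD('', LN(COUNT(*)), '*')    AS bar      -- log-scaled bar, as before
FROM   my_table
GROUP  BY bucket
ORDER  BY bucket;
```

The ORDER BY is added here only so that the buckets always come out in ascending order.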
numeric_value should be offset inside the ROUND operation, based on the rounding increment, to ensure that the first bucket contains as many elements as the following buckets.
E.g. with ROUND(numeric_value, -1), a numeric_value in the range [0,4] (5 elements) will be placed in the first bucket, while [5,14] (10 elements) goes in the second and [15,24] in the third, unless numeric_value is offset appropriately via ROUND(numeric_value - 5, -1).
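To see the effect of the offset on concrete values, a small self-contained query can list which bucket each integer from 0 to 24 lands in. This is only a sketch: it assumes MySQL 8.0 or later (for recursive common table expressions), the seq helper is a made-up name, and the FLOOR column is an alternative that is not part of the original method, shown for comparison.

```sql
-- Sketch, assuming MySQL 8.0+ (recursive CTE support).
-- seq is a hypothetical helper that generates the integers 0..24.
WITH RECURSIVE seq (n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 24
)
SELECT n,
       ROUND(n, -1)       AS plain_bucket,   -- buckets centred on multiples of 10
       ROUND(n - 5, -1)   AS offset_bucket,  -- the offset variant described above
       FLOOR(n / 10) * 10 AS floor_bucket    -- alternative: label = lower edge of the bucket
FROM   seq
ORDER  BY n;
```

With the FLOOR form, the values 0 through 9 all map to 0 and 10 through 19 all map to 10, so each bucket covers exactly one full increment. Comparing the three columns row by row shows where each expression places its bucket boundaries.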
Here is an example of such a query run on some random data, and it looks pretty nice. Good enough for a quick evaluation of the data.
+--------+----------+-----------------+
| bucket | count    | bar             |
+--------+----------+-----------------+
|   -500 |        1 |                 |
|   -400 |        2 | *               |
|   -300 |        2 | *               |
|   -200 |        9 | **              |
|   -100 |       52 | ****            |
|      0 |  5310766 | *************** |
|    100 |    20779 | **********      |
|    200 |     1865 | ********        |
|    300 |      527 | ******          |
|    400 |      170 | *****           |
|    500 |       79 | ****            |
|    600 |       63 | ****            |
|    700 |       35 | ****            |
|    800 |       14 | ***             |
|    900 |       15 | ***             |
|   1000 |        6 | **              |
|   1100 |        7 | **              |
|   1200 |        8 | **              |
|   1300 |        5 | **              |
|   1400 |        2 | *               |
|   1500 |        4 | *               |
+--------+----------+-----------------+
Some notes: ranges that have no match will not appear in the output, so you will never see a zero in the count column. Also, I am using the ROUND function here; you can just as easily replace it with TRUNCATE if you feel it makes more sense to you.
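If the missing rows matter, one possible workaround is to generate the list of buckets separately and LEFT JOIN the data onto it. The following is a minimal sketch, assuming MySQL 8.0 or later; the bucket_series name and the range of -500 to 1500 in steps of 100 are assumptions chosen to match the example output, not part of the original query.

```sql
-- Sketch, assuming MySQL 8.0+ (recursive CTE support).
-- bucket_series is a hypothetical helper; range and step are illustrative.
WITH RECURSIVE bucket_series (bucket) AS (
    SELECT -500
    UNION ALL
    SELECT bucket + 100 FROM bucket_series WHERE bucket < 1500
)
SELECT b.bucket,
       COUNT(t.numeric_value) AS count,  -- 0 when no rows fall in the bucket
       -- GREATEST(..., 1) avoids LN(0), which returns NULL in MySQL
       RPAD('', LN(GREATEST(COUNT(t.numeric_value), 1)), '*') AS bar
FROM   bucket_series AS b
LEFT   JOIN my_table AS t
       ON ROUND(t.numeric_value, -2) = b.bucket
GROUP  BY b.bucket
ORDER  BY b.bucket;
```

This trades the brevity of the one-liner for a complete axis, so it is probably only worth it when gaps in the distribution need to be visible. Rows whose values fall outside the generated range are dropped by the join.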