Count the occurrences of each unique integer in a vector in one line of code?

Is there a slick way to rewrite this Julia function, perhaps in a single line of code, without making it much slower? (I just started using Julia. It's great!) K is a positive integer, and zd is a vector of positive integers not exceeding K. Thanks!

    function tally(zd)
        ret = zeros(Int64, K)
        for k in zd
            ret[k] += 1
        end
        return ret
    end

Example:

    julia> K = 5
    julia> zd = [1,2,2,2,2,3];
    julia> tally(zd)
    5-element Array{Int64,1}:
     1
     4
     1
     0
     0
+7
julia-lang
5 answers

I have not tested the performance, but using the hist function should work:

 hist(zd,0.5:K+0.5)[2] 

gives:

    5-element Array{Int64,1}:
     1
     4
     1
     0
     0

or, if zeros are not significant, just use

    hist(zd)[2]
    3-element Array{Int64,1}:
     1
     4
     1
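As far as I know, hist was deprecated in Julia 0.5 and is no longer part of Base, so the calls above may not run on a current Julia. A minimal Base-only sketch that produces the same zero-padded counts (the variable name counts_by_value is just for illustration):

```julia
K = 5
zd = [1, 2, 2, 2, 2, 3]

# One count per possible value 1..K; values that never occur get 0.
# This scans zd once for every k, so the cost grows as K * length(zd).
counts_by_value = [count(x -> x == k, zd) for k in 1:K]
```

This is a true one-liner, but unlike the loop in the question it makes K passes over the data rather than one.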
+5

Any alternative will probably not be faster. Your loop already makes only one pass through the array. Loops in Julia are fast, and, unlike in some other languages, vectorized code has no speed advantage over them.

Look at the Julia implementation of the hist function. This is taken directly from Julia's standard library:

    function hist(v::AbstractVector, edg::AbstractVector)
        n = length(edg)-1
        h = zeros(Int, n)
        for x in v
            i = searchsortedfirst(edg, x)-1
            if 1 <= i <= n
                h[i] += 1
            end
        end
        edg,h
    end

The "edg" parameter contains the edges of the bins. If we remove that feature, we get exactly the function that you wrote.
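To make that comparison concrete, here is a sketch of hist with the bin-edge search stripped out. Passing K as a second argument is my own change, not part of the question; it avoids reading a non-constant global variable inside the function, which Julia's performance tips advise against.

```julia
# Hypothetical two-argument variant of tally: K is passed in
# instead of being read from a global variable.
function tally(zd::AbstractVector{<:Integer}, K::Integer)
    ret = zeros(Int, K)
    for k in zd
        ret[k] += 1   # assumes 1 <= k <= K, as stated in the question
    end
    return ret
end

result = tally([1, 2, 2, 2, 2, 3], 5)
```

The loop body is the hist loop with `i` replaced by the value itself, so it still makes a single pass over zd.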

+8

See countmap in StatsBase: http://statsbasejl.readthedocs.org/en/latest/counts.html#countmap

    countmap(x[, wv])
        Return a dictionary that maps distinct values in x to their counts (or total weights).
+6

There are tons of counting functions included in StatsBase.jl. Your tally function is equivalent to counts(zd, 1:K).

There are also methods for counting unique elements of types other than integers, for example countmap, which returns a dictionary mapping each unique value to its number of occurrences.
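For readers without StatsBase installed, the dictionary-based idea can be sketched in plain Julia. This is a hand-rolled stand-in, not the StatsBase implementation, and the name simple_countmap is hypothetical:

```julia
# Hand-rolled stand-in for a countmap-style function (hypothetical name).
# Works for any element type that can be used as a Dict key.
function simple_countmap(xs)
    counts = Dict{eltype(xs), Int}()
    for x in xs
        # get returns 0 the first time a value is seen
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

word_counts = simple_countmap(["a", "b", "a"])
int_counts = simple_countmap([1, 2, 2, 2, 2, 3])
```

Unlike the array version, values that never occur have no entry at all, so the zeros for 4 and 5 in the question's example would be missing.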

+3

I know this is old, but what about

[sum(zd .== i) for i in unique(zd)]

In a short test, it performed better than your original function (in both time and memory).

Note: the result is not sorted!
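If the order matters, one possible variation is to sort the unique values first and keep each value next to its count. This is only a sketch; the names vals and value_counts are made up for the example:

```julia
zd = [1, 2, 2, 2, 2, 3]

# Sort the distinct values so the counts come out in a predictable order.
vals = sort(unique(zd))

# Pair each value with its count. Each `zd .== v` scans the whole vector
# and allocates a temporary array, so this makes one pass per distinct value.
value_counts = [(v, sum(zd .== v)) for v in vals]
```

Like the original comprehension, this omits values that never occur, so it does not pad the result out to length K.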

+2