Try the following:
    function gini(wagedistarray)
        nrows = size(wagedistarray,1)
        Swages = zeros(nrows)
        for i in 1:nrows
            for j in 1:i
                Swages[i] += wagedistarray[j,2]*wagedistarray[j,1]
            end
        end
        Gwages=Swages[1]*wagedistarray[1,2]
        for i in 2:nrows
            Gwages+=wagedistarray[i,2]*(Swages[i]+Swages[i-1])
        end
        return 1-(Gwages/Swages[length(Swages)])
    end

    wagedistarray=zeros(10000,2)
    for i in 1:size(wagedistarray,1)
        wagedistarray[i,1]=1
        wagedistarray[i,2]=1/10000
    end

    @time result=gini(wagedistarray)
- Time before:
  5.913907256 seconds (4000481676 bytes allocated, 25.37% gc time)
- Time after:
  0.134799301 seconds (507260 bytes allocated)
- Time after (second run):
  elapsed time: 0.123665107 seconds (80112 bytes allocated)
The main problem is that Swages was a global variable (it did not live inside the function). That is not good coding practice, but more importantly it is a performance killer. Another thing I noticed is length(wagedistarray[:,1]), which makes a copy of that column and then asks for its length; this creates extra "garbage" for the collector. The second run is faster because the first call to a function also includes its compilation time.
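To make the two points above concrete, here is a minimal sketch. The names total_global, sum_global and sum_local are hypothetical and exist only for this illustration; they are not part of the original code. The first function keeps its accumulator in a non-constant global, the second keeps it local, and the last lines contrast the copying length(A[:,1]) with size(A,1).

```julia
# Hypothetical illustration; these names are not from the original code.

# A non-constant global: its type may change at any time, so code that
# touches it cannot be specialised on a concrete type.
total_global = 0.0

function sum_global(x)
    global total_global
    total_global = 0.0
    for v in x
        total_global += v      # every update goes through the global
    end
    return total_global
end

# The same loop with a local accumulator, whose type is known to the compiler.
function sum_local(x)
    total = 0.0
    for v in x
        total += v
    end
    return total
end

A = zeros(10000, 2)
n_copy = length(A[:, 1])   # A[:, 1] copies the column before taking its length
n_size = size(A, 1)        # asks the array for its row count, no copy
```

Both functions return the same value; timing them with @time on a large vector (after a warm-up call, so that compilation is excluded) should show the difference in speed and allocations.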
You can improve performance even further by using @inbounds, i.e.
    function gini(wagedistarray)
        nrows = size(wagedistarray,1)
        Swages = zeros(nrows)
        @inbounds for i in 1:nrows
            for j in 1:i
                Swages[i] += wagedistarray[j,2]*wagedistarray[j,1]
            end
        end
        Gwages=Swages[1]*wagedistarray[1,2]
        @inbounds for i in 2:nrows
            Gwages+=wagedistarray[i,2]*(Swages[i]+Swages[i-1])
        end
        return 1-(Gwages/Swages[length(Swages)])
    end
which gives me elapsed time: 0.042070662 seconds (80112 bytes allocated)
Finally, check out this version, which is actually faster than all of the above and, I think, also the clearest:
    function gini2(wagedistarray)
        Swages = cumsum(wagedistarray[:,1].*wagedistarray[:,2])
        Gwages = Swages[1]*wagedistarray[1,2] +
                    sum(wagedistarray[2:end,2] .* (Swages[2:end]+Swages[1:end-1]))
        return 1 - Gwages/Swages[end]
    end
This gives elapsed time: 0.00041119 seconds (721664 bytes allocated). The main gain comes from replacing the O(n^2) nested loop with the O(n) cumsum.
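The same O(n) idea can also be written as a single explicit loop that carries the running sum along, which avoids the temporary arrays created by the slices. This is only a sketch under the same assumptions as the code above (column 1 holds wages, column 2 holds population shares, Float64 entries); the name gini3 is made up here, and I have not timed it against gini2.

```julia
# Hypothetical single-pass variant; gini3 is not part of the original answer.
function gini3(wagedistarray)
    nrows = size(wagedistarray, 1)
    Sprev = 0.0    # running sum up to row i-1 (Swages[i-1] in the code above)
    Gwages = 0.0
    @inbounds for i in 1:nrows
        # running sum up to row i (Swages[i] in the code above)
        Scur = Sprev + wagedistarray[i,2]*wagedistarray[i,1]
        # for i == 1, Sprev is 0, which reproduces Swages[1]*wagedistarray[1,2]
        Gwages += wagedistarray[i,2]*(Scur + Sprev)
        Sprev = Scur
    end
    return 1 - Gwages/Sprev
end
```

For example, gini3([1.0 0.5; 3.0 0.5]) describes two equally sized groups earning 1 and 3, and works out to 0.25.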