I have a large matrix of numbers, and I would like to apply sortperm to each column of this matrix. The naive thing to do is
order = sortperm(X[:,j])
which makes a copy of the column. This seems like a shame, so I decided to try a SubArray instead:
order = sortperm(sub(X,1:n,j))
but it was even slower. For a laugh I tried
order = sortperm(1:n,by=i->X[i,j])
but of course it was terrible. What is the fastest way to do this?
Here are some benchmarks:
getperm1(X,n,j) = sortperm(X[:,j])
getperm2(X,n,j) = sortperm(sub(X,1:n,j))
getperm3(X,n) = mapslices(sortperm, X, 1)

n = 1000000
X = rand(n, 10)

for f in [getperm1, getperm2]
    println(f)
    for it in 1:5
        gc()
        @time f(X,n,5)
    end
end

for f in [getperm3]
    println(f)
    for it in 1:5
        gc()
        @time getperm3(X,n)
    end
end
results:
getperm1
elapsed time: 0.258576164 seconds (23247944 bytes allocated)
elapsed time: 0.141448346 seconds (16000208 bytes allocated)
elapsed time: 0.137306078 seconds (16000208 bytes allocated)
elapsed time: 0.137385171 seconds (16000208 bytes allocated)
elapsed time: 0.139137529 seconds (16000208 bytes allocated)
getperm2
elapsed time: 0.433251141 seconds (11832620 bytes allocated)
elapsed time: 0.33970986 seconds (8000624 bytes allocated)
elapsed time: 0.339840795 seconds (8000624 bytes allocated)
elapsed time: 0.342436716 seconds (8000624 bytes allocated)
elapsed time: 0.342867431 seconds (8000624 bytes allocated)
getperm3
elapsed time: 1.766020534 seconds (257397404 bytes allocated, 1.55% gc time)
elapsed time: 1.43763525 seconds (240007488 bytes allocated, 1.85% gc time)
elapsed time: 1.41373546 seconds (240007488 bytes allocated, 1.82% gc time)
elapsed time: 1.42215519 seconds (240007488 bytes allocated, 1.83% gc time)
elapsed time: 1.419174037 seconds (240007488 bytes allocated, 1.83% gc time)
The mapslices version takes roughly 10 times as long as getperm1, as you would expect, since it sorts all 10 columns rather than one.
It is worth noting that, on my machine at least, the copy + sortperm option is not much slower than plain sortperm on a vector of the same length. Still, the copy allocates memory that should not be needed, so it would be nice to avoid it.
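For reference, here is a minimal sketch of the kind of allocation-avoiding call I have in mind. It assumes Julia 1.x, where view replaces the older sub, and uses sortperm! to write into a caller-supplied index vector. The helper name column_order! is hypothetical, and I have not measured whether this is faster than the copy; sortperm! may still use scratch memory internally.

```julia
# Sketch for Julia 1.x; `view` and `sortperm!` are Base functions there.
# `column_order!` is a hypothetical helper name, not part of any library.
function column_order!(order::Vector{Int}, X::AbstractMatrix, j::Integer)
    # view(X, :, j) refers to column j in place instead of copying it.
    # sortperm! writes the permutation into the caller-supplied `order`
    # and returns it.
    return sortperm!(order, view(X, :, j))
end

X = [3.0 0.2; 1.0 0.9; 2.0 0.5]
order = Vector{Int}(undef, size(X, 1))  # allocated once, reusable for every column
column_order!(order, X, 1)
```

The same order vector can be passed again for each j, so the per-column output allocation of sortperm(X[:,j]) goes away.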