Julia is much slower than Java

Question


I'm new to Julia, and I wrote a simple function that calculates the RMSE (root-mean-square error). ratings is a rating matrix in which each row is [user, film, rating]. There are 15 million ratings. The rmse() function takes 12.0 s, but the Java implementation is about 188 times faster: 0.064 s. Why is the Julia implementation so slow? In Java I work with an array of Rating objects; if it were a multi-dimensional int array, it would be even faster.

    ratings = readdlm("ratings.dat", Int32)

    function predict(user, film)
        return 3.462
    end

    function rmse()
        total = 0.0
        for i in 1:size(ratings, 1)
            r = ratings[i,:]
            diff = predict(r[1], r[2]) - r[3]
            total += diff * diff
        end
        return sqrt(total / size(ratings)[1])
    end
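For reference, the quantity this loop computes can be written as below. The symbols are notation introduced here, not part of the original post: $N$ is the number of rows, $r_i$ is the rating in row $i$, and $\hat{r}_i$ is the predicted rating, which is the constant 3.462 in this code.

```latex
\[
  \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{r}_i - r_i\right)^2},
  \qquad \hat{r}_i = 3.462 .
\]
```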

EDIT: After eliminating the global variable, it runs in 1.99 s (31 times slower than Java). After also removing r = ratings[i,:], it takes 0.856 s (13x slower).

+8

performance julia-lang

fhucho Jun 22 '13 at 14:27


3 answers

For me, the following code runs in 0.024 seconds (and I doubt that my laptop is much faster than your computer). I initialized ratings with the commented-out line, since I did not have the file that you referenced.

    function predict(user, film)
        return 3.462
    end

    function rmse(r)
        total = 0.0
        for i = 1:size(r,1)
            diff = predict(r[i,1],r[i,2]) - r[i,3]
            total += diff * diff
        end
        return sqrt(total / size(r,1))
    end

    # ratings = rand(1:20, 5000000, 3)

+7

tholy Jun 22 '13 at 23:13


On my system, the problem is that the calls to your constant predict function are not optimized away. Replacing the calls to predict with the constant itself makes the code run in 0.01 seconds.

    # Version 1: calls predict() on every iteration.
    function time()
        ratings = ones(15_000_000, 3)
        predict(user, film) = 3.462
        function rmse(ratings)
            total = 0.0
            for i in 1:size(ratings, 1)
                diff = predict(ratings[i, 1], ratings[i, 2]) - ratings[i, 3]
                total += diff * diff
            end
            return sqrt(total / size(ratings, 1))
        end
        rmse(ratings)            # warm-up call so compilation is not timed
        @elapsed rmse(ratings)
    end

    time()

    # Version 2: the call to predict() is replaced by the constant.
    function time2()
        ratings = ones(15_000_000, 3)
        predict(user, film) = 3.462   # no longer called; kept for parity with time()
        function rmse(ratings)
            total = 0.0
            for i in 1:size(ratings, 1)
                diff = 3.462 - ratings[i, 3]
                total += diff * diff
            end
            return sqrt(total / size(ratings, 1))
        end
        rmse(ratings)            # warm-up call so compilation is not timed
        @elapsed rmse(ratings)
    end

    time2()

+5

John Myles White Jun 23 '13 at 3:11


Harlan · Accepted Answer · 2013-06-22T15:13:34+0000

A few suggestions:

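Do not use global variables. They are slow because a non-constant global can change type at any time, so the compiler cannot specialize the code that uses it. Instead, pass ratings as an argument.
The line r = ratings[i,:] makes a copy of the row on every iteration, which is slow. Instead, index the elements directly: predict(r[i,1], r[i,2]) - r[i,3], where r is the matrix passed as an argument.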
square() may be faster than x*x - try it.
If you are running the latest version of Julia built from source, check out the new NumericExtensions.jl package, which has heavily optimized functions for many common numerical operations (see the julia-dev list).
Julia must compile the code on the first run. The right way to benchmark in Julia is to time the code several times and ignore the first run.
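
The suggestions above can be put together roughly as follows. This is a minimal sketch, assuming a current Julia 1.x release rather than the 2013 version discussed here. The random matrix is a hypothetical stand-in for ratings.dat, and the timings will vary by machine.

```julia
# Constant predictor, as in the question.
predict(user, film) = 3.462

# The matrix is passed as an argument instead of being read from a global,
# and elements are indexed directly, so no per-row copy is made.
function rmse(r)
    total = 0.0
    for i in 1:size(r, 1)
        diff = predict(r[i, 1], r[i, 2]) - r[i, 3]
        total += diff * diff
    end
    return sqrt(total / size(r, 1))
end

# Hypothetical stand-in data: 1,000,000 rows of Int32 values in 1:5.
ratings = rand(Int32(1):Int32(5), 1_000_000, 3)

rmse(ratings)   # first call includes compilation time; discard it

# Time several further runs and report the fastest.
timings = [@elapsed(rmse(ratings)) for _ in 1:5]
println("best of 5: ", minimum(timings), " s")
```

Taking the minimum of several runs after a warm-up call is one simple way to keep compilation time and background noise out of the measurement.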
