Optimally transmitted fixed array sizes in julia

I want to write a function that takes a matrix as input. This is a frequent low-level call in a complex project, so the fastest execution of this function has potentially serious consequences for performance. Since speed is so important to me, I use types in FixedSizeArrays , because I know that it will save on memory usage. But I often know some of the properties of the input matrix, and I'm not sure that I make the best use of it.

Here is a simple example. Imagine the function that I want to do as quickly as possible:

 using FixedSizeArrays function foo( input::Mat ) # NB: Mat is the FixedSizeArrays matrix type return 2 * input end 

Obviously, this is a trivial example, but this is not the point. The fact is that I know something about the size of the input matrix: it always has only two columns, and I can always specify the number of rows at runtime. This is similar to information that can be passed to the compiler to make my code faster. Can I pass it as an argument that somehow determines the size of the input ? Here is an example that does not work, but should give you some idea of โ€‹โ€‹what I'm trying to do.

 function bar( int::N, thismat::Mat{N,2,Float64} ) return 2 * thismat end 

Is there something similar that I can do? Would it work if I could? Perhaps FixedSizeArrays is already doing everything that can be done. Thanks for your thoughts!

+5
source share
1 answer

Arrays of fixed size are already specialized in size. These arrays are not suitable when the number of rows N in your case can change. Any performance issues you notice are likely due to over-specialization.

Let me be more specific.

The Julia compiler is able to achieve zero cost due to aggressive specialization in argument types. So in general (that is, in all cases, except those where specialization will be too expensive or explicitly disabled), if a function is called with two signatures of different types, two versions of this function will be compiled.

Since the size of a Mat is part of its type, this means that a version will be compiled for every possible Mat size. Thus, the specialization you are looking for has already been completed.

Specialization, however, is not free. There are two costs involved:

  • The first time a function is called on a specific signature, memory will be allocated and the compiler will have to start.
  • When a parameter whose type cannot be inferred is passed to the function, there is โ€œtype instabilityโ€ and dynamic dispatch is required. Dynamic submission includes runtime views.

Thus, if your matrices are of size (2, N) , where N changes and is unknown at compile time, the cost of dynamic sending will be incurred. This performance can be limited by using the method of functional barriers: we take on only one cost for each type of unstable call, so limiting the number of such calls increases productivity.

But what would further increase productivity would be to avoid this dynamic dispatch completely. You can build an array type that only encodes the number of columns in the type and has the number of rows as a field at runtime. That is, perhaps your performance problem is due to over-specification, and you need to create your own types to reduce the amount of specialization.

Finding the right balance is central to squeezing as much performance as possible from the application. Specializing in array size is actually quite rarely useful - even, for example, C and C ++ code tends to pass array sizes as run-time parameters, rather than specializing in a specific array size. It is not that expensive. In most cases this is not the case, FixedSizeArrays.jl will not improve performance, but rather it will hurt. Of course, there are situations in which this will help, but yours may not be one of them.


In your case, for maximum performance, I suspect that this type would be the fastest:

 immutable TwoColumnMatrix{T, BaseType} <: AbstractArray{T, 2} height::Int base::BaseType end function TwoColumnMatrix(A::Matrix) size(A, 2) == 2 || throw(ArgumentError("must be two columns")) TwoColumnMatrix{eltype(A), typeof(A)}(size(A, 1), A) end Base.@propagate _inbounds function getindex(M::TwoColumnMatrix, n::Int) M.base[n] end size(M::TwoColumnMatrix) = (M.height, 2) 

You may need to define additional methods for maximum performance and, as always, benchmarks. Perhaps the overhead of the wrapper is not worth the compiler, knowing about the size.

+7
source

All Articles