Install statements not provided by Haskell's Data.Vector, for what reason

Question

Install statements not provided by Haskell's Data.Vector, for what reason

My application includes heavy array operations (e.g. log(1) indexing), so Data.Vector and Data.Vector.Unboxed prefer Data.List . It also includes many set operations (for example, intersectBy), which, however, are not provided by Data.Vector.

Each of these functions can be implemented as in Data.List in 3-4 lines. Is there a reason why they are not all implemented using Data.Vector? I can only guess. Perhaps the given operations in Data.Vector are discouraged for performance reasons, i.e. IntersectBy will first create the intersection through a list comprehension, and then convert the list to Data.Vector?

+4

set vector haskell

Causality Mar 29 '13 at 18:12

source share

1 answer

Niklas B. · Accepted Answer · 2013-03-29T19:05:18+0000

I suppose this is absent because intersecting unsorted, immutable arrays should have the worst possible runtime of Ω(n*m) without the use of extra space, and Data.Vector optimized for performance. If you want, you can write this function yourself:

 import Data.Vector as V intersect :: Eq a => V.Vector a -> V.Vector a -> V.Vector a intersect x = V.filter (`V.elem` x)

Or using a temporary data structure to achieve the expected complexity of O(n + m) :

 import Data.HashSet as HS intersect :: (Hashable a, Eq a) => V.Vector a -> V.Vector a -> V.Vector a intersect x = V.filter (`HS.member` set) where set = HS.fromList $ V.toList x

If you can afford to use additional memory, perhaps you can use some type of aggregate for your data, such as an array for quick random access and a hash Data.HashSet , for example Data.HashSet , for quick membership checks and always save both containers in a timely manner. This way you can reduce the asymptotic complexity for intersecting with something like O(min(n, m))

Install statements not provided by Haskell's Data.Vector, for what reason

More articles: