Should I convert to Sets?

I deal with lists in my program, and I want to be able to quickly check if two lists intersect or not. My implementation attempt is

commonEle :: (Eq a) => [a] -> [a] -> Bool commonEle _ [] = False commonEle [] _ = False commonEle (x:xs) ys |x `elem` ys = True |otherwise = commonEle xs ys 

It works, but I try to be careful with efficiency so that things don't explode. This SO question explains in one of the answers that it is much more efficient to use kits to quickly check intersections. My list will automatically have different elements, so using collections can be a good idea, but the natural way to create my data is to understand the list, so I would have to use the fromList function to turn it into a collection. How do you know which implementation is more efficient?

For the record, I will need to check a lot of fairly small lists (~ 10 ^ 5 sizes <100).

+4
source share
2 answers

Are you going to check all the lists one by one, or do you need to check all possible pairs?

I would check out a lot of very small lists against one.

I asked because one problem is converting too many small lists using fromList . Since one of the lists is fixed, you can avoid most of this cost by converting only a fixed list.

 import qualified Data.Set as S import Data.Set (Set) -- There is probably a better name for this modified version. commonEle :: (Ord a) => Set a -> [a] -> Bool commonEle xs ys = any (`S.member` xs) ys 

If you are writing a puzzle game, you can save this part of the puzzle state forever as Set , so you don’t need to recreate the set every time. (If you need to save additional information related to positions, there are also Map types from Data.Map / Data.Map.Strict / Data.HashMap ).

In any case, follow Karsten’s advice: "The best way to find out is to try it (measure)." Also, do not forget to check the promised performance characteristics for the functions that you plan to use in the documentation of the corresponding modules.

+2
source

You mentioned in the comments that your sets will all be coordinate pairs (Int,Int) , where each coordinate is in the range [1..10] .

In this case, you must use bit-bit so that you can use the bitwise AND and OR operations of the processor to set the intersection and union.

For this purpose, you can use the Data.BitSet.Dynamic module:

 import qualified Data.BitSet.Dynamic as B type Bset = B.BitSet B.FasterInteger -- convert a list of points into a bit set toSet :: [(Int,Int)] -> Bset toSet points = B.fromList [ fromIntegral (r*10+c) | (r,c) <- points ] -- do two bit sets have common elements? commonElements :: Bset -> Bset -> Bool commonElements ab = not $ B.null $ B.intersection ab -- add a point to a bit set addPoint :: Bset -> (Int,Int) -> Bset addPoint a (r,c) = B.insert (fromIntegral $ r*10+c) a -- convert a bit set to a list of points toPoints :: Bset -> [(Int,Int)] toPoints a = [ (q,r) | x <- B.toList a, let (q,r) = divMod (fromIntegral x) 10 ] -- does a set have a point? hasPoint :: Bset -> (Int,Int) -> Bool hasPoint a (r,c) = B.member (fromIntegral $ r*10+c) a 

Detecting if the two sets have any common elements or if the point is in the set very quickly.

+2
source

All Articles