When should I use Haskell Data.Map in favor of a list of tuples?

Recently, I needed to compare two sets of historical data. Since one of them sometimes was missing a day or two, and I wanted to be precise, I decided to create a list of all possible dates and two lists of tuples containing dates and corresponding values โ€‹โ€‹belonging to both sets. Then I changed the last lists to Map to improve date search.

The idea was to try to find each date from the complete list of dates in the Map ped list and create a list of triples (date, value1, value2) containing only the dates when both datasets had a date and a value. Then I could write them to a file and compare them correctly.

DO NOT CONSIDER CODE INCLUDED ONLY FOR GOOD MEASURE

Here is the code (it is not at all optimal, but for this small task he did an excellent job):

 import qualified Data.Map as M import Data.List (transpose) import Data.Maybe (fromJust) main = do dts <- readFile "dates.txt" cts1 <- readFile "eu.txt" cts2 <- readFile "usa.txt" let dates = lines dts cols1 = transpose $ map words $ lines cts1 cols2 = transpose $ map words $ lines cts2 prs1 = zip (head cols1) (last cols1) prs2 = zip (head cols2) (last cols2) map1 = M.fromList prs1 map2 = M.fromList prs2 trips = map fromJust (filter (/=Nothing) (map (\date -> getTrips date map1 map2) dates)) cols3 = map (\(a,b,c) -> [a,b,c]) trips result = unlines $ map unwords $ cols3 writeFile "trips.txt" result getTrips :: String -> M.Map String String -> M.Map String String -> Maybe (String, String, String) getTrips date map1 map2 | is1 /= Nothing && is2 /= Nothing = Just (date, fromJust is1, fromJust is2) | otherwise = Nothing where is1 = M.lookup date map1 is2 = M.lookup date map2 

TL DR: The code worked (although I would love to hear some opinions / tips), but I have some questions:

  • there were only about 2000 dates, so I didnโ€™t really like the performance (you can see that I used String everywhere); Did Data.Map excess? When should Data.Map be preferred over tuple lists?
  • Map was created from String tuples - excellent or should the key always be numeric for the balancing and search to work correctly?
+4
source share
1 answer

there were only about 2000 dates, so I didnโ€™t care (you can see that I used Strings everywhere); was using Data.Map overkill? When should Data.Map be used over tuple lists?

You should use data structures that fit your tasks and performance / programming time constraints, so using Map was probably a good idea. Perhaps, in your case, if your data has already been ordered, you could do

 union [] _ = [] union _ [] = [] union xss@ ((dx,vx):xs) yss@ ((dy,vy):ys) = case compare dx dy of EQ -> (dx, vx, vy) : union xs ys GT -> union xss ys LT -> union xs yss 

The map was created from tuples of strings - is this good or if the key must always be numeric so that balancing and searching are performed correctly?

Not if your typechecks code your Map will work correctly (w / r / t, as you defined an Ord instance). But, as CA McCann suggests, trie may be more appropriate if your keys are lists, especially if there are many overlaps between key prefixes (see Ord instance on lists, and imagine the number of operations that must be performed to insert keys "abcdx", "abcdy" and "abcdz" into the Map and trie structure to convince themselves).

+5
source

All Articles