Recently, I needed to compare two sets of historical data. Since one of them sometimes was missing a day or two, and I wanted to be precise, I decided to create a list of all possible dates and two lists of tuples containing dates and corresponding values โโbelonging to both sets. Then I changed the last lists to Map to improve date search.
The idea was to try to find each date from the complete list of dates in the Map ped list and create a list of triples (date, value1, value2) containing only the dates when both datasets had a date and a value. Then I could write them to a file and compare them correctly.
DO NOT CONSIDER CODE INCLUDED ONLY FOR GOOD MEASURE
Here is the code (it is not at all optimal, but for this small task he did an excellent job):
import qualified Data.Map as M import Data.List (transpose) import Data.Maybe (fromJust) main = do dts <- readFile "dates.txt" cts1 <- readFile "eu.txt" cts2 <- readFile "usa.txt" let dates = lines dts cols1 = transpose $ map words $ lines cts1 cols2 = transpose $ map words $ lines cts2 prs1 = zip (head cols1) (last cols1) prs2 = zip (head cols2) (last cols2) map1 = M.fromList prs1 map2 = M.fromList prs2 trips = map fromJust (filter (/=Nothing) (map (\date -> getTrips date map1 map2) dates)) cols3 = map (\(a,b,c) -> [a,b,c]) trips result = unlines $ map unwords $ cols3 writeFile "trips.txt" result getTrips :: String -> M.Map String String -> M.Map String String -> Maybe (String, String, String) getTrips date map1 map2 | is1 /= Nothing && is2 /= Nothing = Just (date, fromJust is1, fromJust is2) | otherwise = Nothing where is1 = M.lookup date map1 is2 = M.lookup date map2
TL DR: The code worked (although I would love to hear some opinions / tips), but I have some questions:
- there were only about 2000 dates, so I didnโt really like the performance (you can see that I used
String everywhere); Did Data.Map excess? When should Data.Map be preferred over tuple lists? Map was created from String tuples - excellent or should the key always be numeric for the balancing and search to work correctly?
source share