Recursively sort an non-contiguous list into a list of contiguous lists

I've been trying to learn a bit of functional programming lately (with Haskell and Erlang), and I'm always surprised at the concise solutions that people can come up with when they can think recursively and know the tools.

I want the function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e.

[1,2,3,6,7,8,10,11] 

in

 [[1,2,3], [6,7,8], [10,11] 

This was the best I could come up with in Haskell (two functions):

 make_ranges :: [[Int]] -> [Int] -> [[Int]] make_ranges ranges [] = ranges make_ranges [] (x:xs) | null xs = [[x]] | otherwise = make_ranges [[x]] xs make_ranges ranges (x:xs) | (last (last ranges)) + 1 == x = make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs | otherwise = make_ranges (ranges ++ [[x]]) xs rangify :: [Int] -> [[Int]] rangify lst = make_ranges [] lst 

This may be a bit subjective, but I would be interested to see a better, more elegant solution for this in Erlang or Haskell (other functional languages ​​too, but I may not understand). Otherwise, points to fix my crappy novice Haskell style!

+6
erlang functional-programming recursion haskell
source share
12 answers

The easiest way in my mind is foldr:

 ranges = foldr step [] where step x [] = [[x]] step x acc@ ((y:ys):zs) | y == x + 1 = (x:y:ys):zs | otherwise = [x]:acc 

Or, more briefly:

 ranges = foldr step [] where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs step x acc = [x]:acc 

But wait, there still!

 abstractRanges f = foldr step [] where step x ((y:ys):zs) | fxy = (x:y:ys):zs step x acc = [x]:acc ranges = abstractRanges (\xy -> y == x + 1) powerRanges = abstractRanges (\xy -> y == x*x) -- mighty morphin 

By turning the security function into a parameter, you can group more interesting things than just +1 sequences.

 *Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25] [[1,1,1],[2,4,16],[3,9,81],[5,25]] 

The usefulness of this particular function is questionable ... but fun!

+10
source share

I can't believe I got the shortest solution. I know that this is not code golf, but I think it is still quite readable:

 import GHC.Exts range xs = map (map fst) $ groupWith snd $ zipWith (\ab -> (a, ab)) xs [0..] 

or pointfree

 range = map (map snd) . groupWith fst . zipWith (\ab -> (ba, b)) [0..] 

BTW, groupWith snd can be replaced with groupBy (\ab -> snd a == snd b) if you prefer Data.List over GHC.Exts

[change]

By the way: is there a nicer way to get rid of lambda (\ab -> (ba, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?

[Change 2]

Yes, I forgot (,) is a functor. So here is the confusing version:

 range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..] 

Suggestions are welcome ...

+9
source share
 import Data.List (groupBy) ranges xs = (map.map) snd . groupBy (const fst) . zip (True : zipWith ((==) . succ) xs (tail xs)) $ xs 

As for how to come up with such a thing: I started with zipWith f xs (tail xs) , which is a common idiom when you want to do something in sequential list items. Similarly, zip pings a list with list information and then act ( groupBy ) on it. The rest is plumbing.

Then, of course, you can feed it via @pl and get:

 import Data.List (groupBy) import Control.Monad (ap) import Control.Monad.Instances() ranges = (((map.map) snd) . groupBy (const fst)) .) =<< zip . (True:) . ((zipWith ((==) . succ)) `ap` tail) 

which, in my authoritative definition, is evil because of Mondad ((->) a) . Twice, even. The data stream is too tortuous to be laid out in any reasonable way. zipaptail is the god of the Aztecs, and the gods of the Aztecs should not be confused.

+4
source share

Another version in Erlang:

 part(List) -> part(List,[]). part([H1,H2|T],Acc) when H1 =:= H2 - 1 -> part([H2|T],[H1|Acc]); part([H1|T],Acc) -> [lists:reverse([H1|Acc]) | part(T,[])]; part([],Acc) -> Acc. 
+4
source share

Try reusing standard features.

 import Data.List (groupBy) rangeify :: (Num a) => [a] -> [[a]] rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l) 

Or, following the (mixed) advice to use unfoldr , stop abusing groupBy and be happy using partial functions when it doesn't matter:

 import Control.Arrow ((***)) import Data.List (unfoldr) spanContig :: (Num a) => [a] -> [[a]] spanContig l = map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l) rangeify :: (Num a) => [a] -> [[a]] rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l 
+3
source share
 kz = map (fst <$>) . groupBy (const snd) . zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z 
+3
source share

Erlang using foldr:

 ranges(List) -> lists:foldr(fun (X, [[Y | Ys], Acc]) when Y == X + 1 -> [[X, Y | Ys], Acc]; (X, Acc) -> [[X] | Acc] end, [], List). 
+3
source share

This is my v0.1, and I might do it better:

 makeCont :: [Int] -> [[Int]] makeCont [] = [] makeCont [a] = [[a]] makeCont (a:b:xs) = if b - a == 1 then (a : head next) : tail next else [a] : next where next :: [[Int]] next = makeCont (b:xs) 

And I will try to make it better. I think the changes are coming.

+2
source share

For comparison, here is the Erlang implementation:

 partition(L) -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))]. partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]}); partition([E|L], {R, []}) -> partition(L, {R, [E]}); partition([E|L], {R, T}) -> partition(L, {[T|R], [E]}); partition([], {R, []}) -> R; partition([], {R, T}) -> [T|R]. 
+2
source share

In Erlang, it can be pretty clear and simple:

 partition([]) -> []; partition([A|T]) -> partition(T, [A]). partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]); partition(L, P) -> [lists:reverse(P)|partition(L)]. 

Edit : just for the sake of curiosity, I compared my version and Lukas version and mine seems to be about 10% faster in either the native or the bytecode version when testing, established by the fact that I created lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on the R14B01 64b version on my laptop. (The test kit is 669462 and is divided into 232451 subscriptions).

Edit2 . Other test data lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]) , length 999963 and 38 sections make a larger spread in the native code. Finishing work in a mine is less than half the time. The bytecode version is about 20% faster.

Edit3 : some micro-optimizations that provide extra performance, but lead to uglier and less supported code:

 part4([]) -> []; part4([A|T]) -> part4(T, A, []). part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]); part4([A|T], B, []) -> [[B]|part4(T, A, [])]; part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])]; part4([], B, R) -> [lists:reverse(R,[B])]. 
+2
source share

The standard paramorphism recursion scheme is missing in the Haskell Data.List module, although I think it should be. Here's a solution using paramorphism, because you are creating a list of lists from a list, cons-ing is a bit more complicated:

 contig :: (Eq a, Num a) => [a] -> [[a]] contig = para phi [] where phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc phi x (_, acc) = [x]:acc 

Paramorphism is a general recursion or fold with a view:

 para :: (a -> ([a], b) -> b) -> b -> [a] -> b para phi b [] = b para phi b (x:xs) = phi x (xs, para phi b xs) 
+2
source share

Here's a try from haskell noob

 ranges ls = let (a, r) = foldl (\(r, a@ (h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls) in reverse . map reverse $ r : a 
+1
source share

All Articles