F # array_chunk for sequence

I am having problems with consistency. Basically I need to slice a sequence into a sequence of arrays. Seq.windowed almost does this, but I don't want to duplicate elements.

I can get what I want by first reading everything into an array, but I would rather use a sequence.

let array_chunk s (a:int[]) = Array.init (a.Length / s) (fun i -> Array.sub a (i * s) s) someSequence |> Seq.to_array |> array_chunk 5 
+6
arrays f # sequence chunking
source share
12 answers

Here's a good imperative that will work with seq and create arrays of any size. The latter will be less if the sequence is not equal to n.

 let chunk n xs = seq { let i = ref 0 let arr = ref <| Array.create n (Unchecked.defaultof<'a>) for x in xs do if !i = n then yield !arr arr := Array.create n (Unchecked.defaultof<'a>) i := 0 (!arr).[!i] <- x i := !i + 1 if !i <> 0 then yield (!arr).[0..!i-1] } 
+5
source share

I like the solution of Seq.take and Seq.skip . It is beautiful, simple and very readable, but I would use something like this:

 let chunks n (sequence: seq<_>) = let fold_fce (i, s) value = if i < n then (i+1, Seq.append s (Seq.singleton value)) else ( 1, Seq.singleton value) in sequence |> Seq.scan (fold_fce) (0, Seq.empty) |> Seq.filter (fun (i,_) -> i = n) |> Seq.map (Seq.to_array << snd ) 

This is not imperative code, and it should be more efficient than a solution using Seq.skip. On the other hand, it orders the input sequence to a length divisible by n. If this behavior is unacceptable, it can be eliminated with a simple modification:

 let chunks n (sequence: seq<_>) = let fold_fce (i, s) value = if i < n then (i+1, Seq.append s (Seq.singleton value)) else ( 1, Seq.singleton value) in sequence |> Seq.map (Some) |> fun s -> Seq.init_finite (n-1) (fun _ -> None) |> Seq.append s |> Seq.scan (fold_fce) (0, Seq.empty) |> Seq.filter (fun (i,_) -> i = n) |> Seq.map (Seq.to_array << (Seq.choose (id)) << snd ) 
+5
source share

This answer is likely to be buried, but here I take on this problem:

 let chunk n xs = xs |> Seq.mapi(fun ix -> i/n, x) |> Seq.groupBy fst |> Seq.map (fun (_, g) -> Seq.map snd g) 

Pros:

  • Uses only seq, no arrays
  • O (n) runtime. Not O (n ^ 2) like Seq.skip / make decisions
  • Seq.length should not be a multiple of n
  • small and understandable?

Minuses:

  • probably not as efficient as mandatory / mutable loops
+4
source share

What about:

 let rec chunks n sq = if not (Seq.is_empty sq) then seq { yield Seq.take n sq |> Seq.to_array yield! chunks n (Seq.skip n sq) } else Seq.empty 

Note that this requires sq to contain several elements that are evenly divisible by n (since Seq.take and Seq.skip, unlike LINQ Take and Skip extension methods, require a sequence to contain at least n elements). In addition, it is not as effective as explicitly using an enumerator, but it is more elegant.

+3
source share

The version of the answer "take / skip" as an extension function is fixed. Should work unevenly. There are no performance guarantees, though ...

  module Seq = let rec chunks n (s:#seq<_>) = seq { if Seq.length s <= n then yield s else yield Seq.take ns yield! chunks n (Seq.skip ns) } 

(Code taken from my answer here )

+3
source share

It is beautiful and concise:

 let chunk size (arr : 'a array) = [| for a in 0 .. size .. arr.Length - size -> arr.[a..a + size - 1] |] 

However, this cuts off the last elements (arr.Length% size) in the array. You can fix this by capturing the missing elements and using Array.append:

 let chunk2 size (arr : 'a array) = let first = [| for a in 0 .. size .. arr.Length - size -> arr.[a..a + size - 1] |] let numberOfMissingElements = arr.Length - (first.Length * size) if numberOfMissingElements > 0 then let last = [| arr.[arr.Length - numberOfMissingElements..] |] Array.append first last else first 
+1
source share

Here is another approach with some pattern matching - it looks more like * .iter, and I have it spitting out lists, not arrays, since I usually like my data.

 let sequence_in_lists_of_length_n_with_acc (s: seq<'a>) n acc = seq { use e = s.GetEnumerator() let rec next_with_acc acc = match e.MoveNext(), acc with | true, a when List.length a + 1 = n -> yield (List.rev (e.Current :: a)) next_with_acc [] | true, _ -> next_with_acc (e.Current :: acc) | false, _ -> f(List.rev acc) () next_with_acc [] 

}

0
source share

I like this solution better. It generates a new sequence from the existing sequence (this means that you do not need to go through the entire sequence to get the result, which is critical if you do something like processing logs where you cannot name things like Length).

I ended up writing a blog post with more details on how I got here.

 module Seq = 

let grouped_by_with_leftover_processing f (f2: 'list β†’ list <' a>) (s: seq <'a>) = let rec grouped_by_with_acc (f:' a β†’ 'list β†’' list of options * 'list) acc (i.e. : IEnumerator <'a>) = seq {if ie.MoveNext () then let nextValue, leftovers = f ie.Current acc if nextValue.IsSome then gives nextValue.Value Give way! grouped_by_with_acc f residues, i.e. still let rems = f2 acc if rems.IsSome then give rems.Value} seq {Give way! grouped_by_with_acc f [] (s.GetEnumerator ())}

let YieldReversedLeftovers (f: 'list) = if f.IsEmpty then No else Some (List.rev f)

let grouped_by fs = grouped_by_with_leftover_processing f YieldReversedLeftovers s

let group_by_length_n ns = let grouping_function newValue acc = let newList = newValue :: acc // If we have the correct length, return // a Some as the first value. This will be // Get the sequence. if List.length acc = n - 1 then Some (List.rev newList), [] // If we don’t have the required length, // use None (so that it won’t work) else None, newList
grouped_by grouping_function s

Large sequences are not a problem:

seq {for i in 1..1000000000 β†’ i} | > Seq.group_by_length_n 3 ;; val it: seq <int list> = seq [[1; 2; 3]; [4; 5; 6]; [7; 8; nine]; [10; eleven; 12]; ...]>
0
source share

Princess nice version has been fixed to get tail and converted to seq

 let array_chunk size (arr : 'a array) = let maxl = arr.Length - 1 seq { for a in 0 .. size .. maxl -> arr.[a .. min (a + size - 1) maxl ] } 
0
source share

How about this:

 let grouped n = Seq.unfold(fun s -> if not (Seq.isEmpty s) then Some (Seq.take ns, Seq.skip ns) else None) 

This is in the same vein as the kvb answer.

I somehow remember (link?) That the sequence does not remember the position, so the sequential take / skip will not be optimal.

0
source share

Here the @kvb solution with Seq.skip / take restrictions is limited. He is small, elegant and O (n).

 let eSkip ns = System.Linq.Enumerable.Skip(s, n) let rec seq_chunks n sq = if (Seq.isEmpty sq) then Seq.empty else seq { yield Seq.truncate n sq yield! seq_chunks n (eSkip n sq) } 
0
source share

Here my version accepts an array as input and output:

  let chunk chunkNumber (array : _ array) = let chunkSize = array.Length/chunkNumber let mutable startIndex = 0 [| let n1 = array.Length % chunkNumber for i = 1 to n1 do yield Array.sub array startIndex (chunkSize+1) startIndex <- startIndex + chunkSize+1 let n2 = chunkNumber - n1 for i = 1 to n2 do yield Array.sub array startIndex chunkSize startIndex <- startIndex + chunkSize |] 

The function tries to make pieces of the same size (instead of getting a very small last fragment) and builds the output in the same way you would build a sequence (making it easy to rewrite it to get the sequence as output)

0
source share

All Articles