Split ICollection<T> with a separator sequence

This is for C# 3.5.

I have an ICollection which I am trying to split into separate ICollections, where the separator is a sequence.

For instance:

    ICollection<byte> input = new byte[] { 234, 12, 12, 23, 11, 32, 23, 11, 123, 32 };
    ICollection<byte> delimiter = new byte[] { 23, 11 };
    List<ICollection<byte>> result = input.splitBy(delimiter);

will result in

    result.item(0) = { 234, 12, 12 };
    result.item(1) = { 32 };
    result.item(2) = { 123, 32 };
+4
5 answers
    private static IEnumerable<IEnumerable<T>> Split<T>(IEnumerable<T> source, ICollection<T> delimiter)
    {
        // window holds the last [delimiter length] elements of the sequence;
        // buffer holds the elements waiting to be output when the delimiter is hit
        var window = new Queue<T>();
        var buffer = new List<T>();

        foreach (T element in source)
        {
            buffer.Add(element);
            window.Enqueue(element);
            if (window.Count > delimiter.Count)
                window.Dequeue();

            if (window.SequenceEqual(delimiter))
            {
                // number of non-delimiter elements in the buffer
                int nElements = buffer.Count - window.Count;
                if (nElements > 0)
                    yield return buffer.Take(nElements).ToArray();

                window.Clear();
                buffer.Clear();
            }
        }

        if (buffer.Any())
            yield return buffer;
    }
+3

An optimal solution would not use SequenceEqual() to check each subrange. Doing so can mean iterating over the full length of the separator for every element of the sequence, which can hurt performance, especially for long separator sequences. Instead, the match can be checked incrementally while the source sequence is being enumerated.

Here is what I would write, though there is always room for improvement. I wanted semantics similar to String.Split().

    public enum SequenceSplitOptions { None, RemoveEmptyEntries }

    public static IEnumerable<IList<T>> SequenceSplit<T>(
        this IEnumerable<T> source,
        IEnumerable<T> separator)
    {
        return SequenceSplit(source, separator, SequenceSplitOptions.None);
    }

    public static IEnumerable<IList<T>> SequenceSplit<T>(
        this IEnumerable<T> source,
        IEnumerable<T> separator,
        SequenceSplitOptions options)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (options != SequenceSplitOptions.None
         && options != SequenceSplitOptions.RemoveEmptyEntries)
            throw new ArgumentException("Illegal option: " + (int)options);

        if (separator == null)
        {
            yield return source.ToList();
            yield break;
        }

        var sep = separator as IList<T> ?? separator.ToList();
        if (sep.Count == 0)
        {
            yield return source.ToList();
            yield break;
        }

        var comparer = EqualityComparer<T>.Default; // null-safe element comparison
        var buffer = new List<T>();
        var candidate = new List<T>(sep.Count);
        var sindex = 0;

        foreach (var item in source)
        {
            candidate.Add(item);
            if (!comparer.Equals(item, sep[sindex]))
            {
                // item is not part of the delimiter
                buffer.AddRange(candidate);
                candidate.Clear();
                sindex = 0;
            }
            else if (++sindex >= sep.Count)
            {
                // candidate is the delimiter
                if (options == SequenceSplitOptions.None || buffer.Count > 0)
                    yield return buffer.ToList();

                buffer.Clear();
                candidate.Clear();
                sindex = 0;
            }
        }

        // a partial delimiter match at the end belongs to the last entry
        if (candidate.Count > 0)
            buffer.AddRange(candidate);
        if (options == SequenceSplitOptions.None || buffer.Count > 0)
            yield return buffer;
    }
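
The idea of checking the match while the source is enumerated can be taken one step further with a Knuth-Morris-Pratt failure table. A matcher that simply restarts on a mismatch can miss a separator that begins inside a failed partial match, for example the separator {1, 2} in the input {1, 1, 2}. What follows is only a sketch under stated assumptions: the names SequenceSplitSketch and SplitKmp are made up for illustration, the separator must be non-empty, empty entries are kept, and the result is built eagerly instead of with yield return.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceSplitSketch
{
    // Hypothetical helper (not part of any answer here): splits source on every
    // non-overlapping occurrence of separator, visiting each source element once.
    public static List<List<T>> SplitKmp<T>(IEnumerable<T> source, IList<T> separator)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (separator == null || separator.Count == 0)
            throw new ArgumentException("separator must be non-empty", "separator");

        var cmp = EqualityComparer<T>.Default;
        int m = separator.Count;

        // fail[i] = length of the longest proper prefix of separator[0..i]
        // that is also a suffix of separator[0..i].
        var fail = new int[m];
        for (int i = 1, k = 0; i < m; i++)
        {
            while (k > 0 && !cmp.Equals(separator[i], separator[k])) k = fail[k - 1];
            if (cmp.Equals(separator[i], separator[k])) k++;
            fail[i] = k;
        }

        var result = new List<List<T>>();
        var buffer = new List<T>(); // current entry, including any partial match at its tail
        int matched = 0;            // number of separator elements matched by the tail

        foreach (T item in source)
        {
            buffer.Add(item);

            // On a mismatch, fall back to the longest shorter prefix that still matches.
            while (matched > 0 && !cmp.Equals(item, separator[matched])) matched = fail[matched - 1];
            if (cmp.Equals(item, separator[matched])) matched++;

            if (matched == m)
            {
                // The last m buffered elements are the separator: drop them, emit the rest.
                buffer.RemoveRange(buffer.Count - m, m);
                result.Add(buffer);
                buffer = new List<T>();
                matched = 0;
            }
        }

        result.Add(buffer); // trailing entry, possibly empty
        return result;
    }
}
```

Called as SequenceSplitSketch.SplitKmp(input, new byte[] { 23, 11 }) with the input from the question, this should produce the three entries { 234, 12, 12 }, { 32 } and { 123, 32 }. Because empty entries are kept, a caller wanting RemoveEmptyEntries-style behaviour would need to filter them out afterwards.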
+2
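    public IEnumerable<IEnumerable<T>> SplitByCollection<T>(IEnumerable<T> source, IEnumerable<T> delimiter)
    {
        // materialize both sequences once so they are not re-enumerated
        var sourceArray = source.ToArray();
        var delimiterArray = delimiter.ToArray();
        var delimiterCount = delimiterArray.Length;

        // an empty delimiter never splits anything
        if (delimiterCount == 0)
        {
            yield return sourceArray;
            yield break;
        }

        int lastIndex = 0;
        for (int i = 0; i < sourceArray.Length; i++)
        {
            if (delimiterArray.SequenceEqual(sourceArray.Skip(i).Take(delimiterCount)))
            {
                yield return sourceArray.Skip(lastIndex).Take(i - lastIndex);

                // resume right after the delimiter; the loop's i++ lands on lastIndex
                lastIndex = i + delimiterCount;
                i = lastIndex - 1;
            }
        }

        if (lastIndex < sourceArray.Length)
            yield return sourceArray.Skip(lastIndex);
    }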

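Calling it:

    var result = SplitByCollection(input, delimiter);
    foreach (var element in result)
    {
        // string.Join only accepts string[] on .NET 3.5, so convert each element first
        Console.WriteLine(string.Join(", ", element.Select(b => b.ToString()).ToArray()));
    }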

prints

    234, 12, 12
    32
    123, 32
+1

Here is my example:

    public static IEnumerable<IList<byte>> Split(IEnumerable<byte> input, IEnumerable<byte> delimiter)
    {
        var l = new List<byte>();
        // The delimiter bytes are treated as a set: any single delimiter byte
        // ends the current chunk, and the order of the delimiter is not checked.
        var set = new HashSet<byte>(delimiter);
        foreach (var item in input)
        {
            if (!set.Contains(item))
                l.Add(item);
            else if (l.Count > 0)
            {
                yield return l;
                l = new List<byte>();
            }
        }
        if (l.Count > 0)
            yield return l;
    }
0

There are probably better methods, but I have used this approach before, and it works well for relatively small collections:

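    byte startDelimit = 23;
    byte endDelimit = 11;
    List<ICollection<byte>> result = new List<ICollection<byte>>();
    int lastMatchingPosition = 0;
    var inputAsList = input.ToList();

    // stop one short of the end so that i + 1 is always a valid index
    for (int i = 0; i < inputAsList.Count - 1; i++)
    {
        if (inputAsList[i] == startDelimit && inputAsList[i + 1] == endDelimit)
        {
            // copy the elements between the previous delimiter and this one
            ICollection<byte> temp = new List<byte>();
            for (int j = lastMatchingPosition; j < i; j++)
            {
                temp.Add(inputAsList[j]);
            }
            result.Add(temp);
            lastMatchingPosition = i + 2;
            i++; // skip the second delimiter byte
        }
    }

    // add whatever follows the last delimiter
    if (lastMatchingPosition < inputAsList.Count)
        result.Add(inputAsList.GetRange(lastMatchingPosition, inputAsList.Count - lastMatchingPosition));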

I do not have my IDE open at the moment, so this may not compile as is, or may have some holes that you will need to fill in. But this is where I would start if I came across this problem. Again, as I said, if this is for large collections it will be slow, so better solutions may well exist.

-1