Identification (and deletion) of sequences from a vector in Matlab / Octave

I am trying to prune every run of 3 or more consecutive numbers from a vector in Matlab (or Octave). For example, given the vector

dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31]; 

deleting all sequences of 3 or more should give the reduced dataset:

 prunedDataSet = [7 9 11 13 22 28 30 31 ]; 

I can brute-force a solution, but I suspect there is a more concise (and possibly more efficient) way to do this using vector / matrix operations. I always get confused about whether something returns an index or the value at a given index. Suggestions?

Here's the brute force method I came up with:

    dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

    benign = [];
    for i = 1:size(dataSet,2)-2
        if (dataSet(i) == (dataSet(i+1)-1) && dataSet(i) == dataSet(i+2)-2)
            benign = [benign i];
        end
    end

    remove = [];
    for i = 1:size(benign,2)
        remove = [remove benign(i) benign(i)+1 benign(i)+2];
    end

    remove = unique(remove);
    prunedDataSet = setdiff(dataSet, dataSet(remove));
2 answers

Here's a solution using DIFF and FINDSTR:

    %# define dataset
    dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

    %# take the difference. Whatever is part of a sequence will have difference 1
    dds = diff(dataSet);

    %# sequences of 3 lead to two consecutive ones. Sequences of 4 are like two sequences of 3
    seqIdx = findstr(dds,[1 1]);

    %# remove start, start+1, start+2
    dataSet(bsxfun(@plus,seqIdx,[0;1;2])) = []

    dataSet =
         7     9    11    13    22    28    30    31
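The same idea can also be sketched without a string-search function, by comparing neighbouring entries of the difference vector directly. This is only an illustration under the assumption that the input is a row vector; the names isStep, runStart, removeIdx and pruned are made up for this sketch and are not part of the answer above.

```matlab
dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

% true where an element is followed directly by its successor
isStep = diff(dataSet) == 1;

% a run of 3 starts wherever two such steps occur back to back
runStart = find(isStep(1:end-1) & isStep(2:end));

% collect start, start+1, start+2 for every run start;
% unique() drops the duplicates produced by runs longer than 3
removeIdx = unique([runStart, runStart + 1, runStart + 2]);

% work on a copy so the original vector stays available
pruned = dataSet;
pruned(removeIdx) = [];
disp(pruned)
```

Here runStart plays the role of seqIdx, and the explicit concatenation replaces the bsxfun call.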

Here's an attempt using vector / matrix notation:

    s1 = [(dataSet(1:end-1) == dataSet(2:end)-1), false];
    s2 = [(dataSet(1:end-2) == dataSet(3:end)-2), false, false];
    s3 = s1 & s2;
    s = s3 | [false, s3(1:end-1)] | [false, false, s3(1:end-2)];
    dataSet(~s)

The idea is this: s1 is true at every position where a number a is immediately followed by a+1. s2 is true at every position where a is followed, two positions later, by a+2. s3 is true where both conditions hold, i.e. at the start of a 3-sequence. Finally, we build s so that each true value in s3 extends to its two followers.

dataSet(~s) then returns all values for which s is false, i.e. the numbers that are not part of a 3-sequence.
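If the minimum sequence length needs to be adjustable, one possible variant is to label each run and measure its length instead of shifting masks by hand. This is a different technique from the one above, shown only as a sketch: minLen, breaks, runId, runLen, keep and pruned are illustrative names, and the input is assumed to be a non-empty row vector.

```matlab
dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];
minLen = 3;   % assumed threshold: runs this long or longer are removed

% a new run begins at the first element and wherever the step is not 1
breaks = [true, diff(dataSet) ~= 1];

% label every element with the number of the run it belongs to
runId = cumsum(breaks);

% count the elements in each run (transposed back to a row vector)
runLen = accumarray(runId(:), 1).';

% keep only the elements whose run is shorter than the threshold
keep = runLen(runId) < minLen;
pruned = dataSet(keep);
disp(pruned)
```

With minLen = 3 this keeps the same elements as dataSet(~s); changing minLen adjusts the threshold without adding further shifted terms.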
