To find the longest sequence containing at most threshold NaN values, we need to find where that sequence starts and where it ends.
To generate every possible starting point, we can use hankel:
    H = hankel(X)

    H =
          18   3 NaN NaN   8  10  11 NaN   9  14   6   1   4  23  24
           3 NaN NaN   8  10  11 NaN   9  14   6   1   4  23  24   0
         NaN NaN   8  10  11 NaN   9  14   6   1   4  23  24   0   0
         NaN   8  10  11 NaN   9  14   6   1   4  23  24   0   0   0
           8  10  11 NaN   9  14   6   1   4  23  24   0   0   0   0
          10  11 NaN   9  14   6   1   4  23  24   0   0   0   0   0
          11 NaN   9  14   6   1   4  23  24   0   0   0   0   0   0
         NaN   9  14   6   1   4  23  24   0   0   0   0   0   0   0
           9  14   6   1   4  23  24   0   0   0   0   0   0   0   0
          14   6   1   4  23  24   0   0   0   0   0   0   0   0   0
           6   1   4  23  24   0   0   0   0   0   0   0   0   0   0
           1   4  23  24   0   0   0   0   0   0   0   0   0   0   0
           4  23  24   0   0   0   0   0   0   0   0   0   0   0   0
          23  24   0   0   0   0   0   0   0   0   0   0   0   0   0
          24   0   0   0   0   0   0   0   0   0   0   0   0   0   0
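To see what hankel does on a smaller input, here is a minimal sketch with a three-element vector (v and Hs are made up for illustration and are not part of the question). Row i holds the elements from position i onward, padded with zeros on the right.

```matlab
v = [1 2 3];          % hypothetical short input, used only for illustration
Hs = hankel(v)        % each row is v shifted left by one more element
% Hs =
%      1     2     3
%      2     3     0
%      3     0     0
```

The zero padding matters later: zeros are not NaN, so they look like valid elements until the row lengths are capped.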
Now we need to find the last valid element in each row. For this we can use cumsum to count the NaNs seen so far along each row:
    C = cumsum(isnan(H),2)

    C =
         0 0 1 2 2 2 2 3 3 3 3 3 3 3 3
         0 1 2 2 2 2 3 3 3 3 3 3 3 3 3
         1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
         1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
         0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
         0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
         0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The endpoint of each row is the last position where the corresponding element of C is at most threshold:
    threshold = 1;
    T = C<=threshold

    T =
         1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
         1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
         1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
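The index of the last valid element in each row is found by sorting each row of T. Because sort is stable, the last entry of the sort index is the position of the last 1 in that row: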
    [~,idx]=sort(T,2);
    lastone=idx(:,end)'   % transposed to a row so it lines up with lengths in the next step

    lastone =
         3  2  1  4 15 15 15 15 15 15 15 15 15 15 15
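Since C never decreases along a row, each row of T is a run of ones followed by zeros, so the position of the last one equals the number of ones. A possible alternative, sketched here under that observation, is to count them with sum. Unlike the sort-based version, it gives 0 for a row with no valid element at all, which happens when threshold is 0 and the row starts with NaN (the sort-based index would report 15 for such a row). The names lastone_alt and lastone_zero are only for illustration.

```matlab
X = [18 3 NaN NaN 8 10 11 NaN 9 14 6 1 4 23 24];
C = cumsum(isnan(hankel(X)), 2);         % running NaN count along each row

threshold = 1;
lastone_alt = sum(C <= threshold, 2)';   % same values as idx(:,end)' for this threshold

% With threshold = 0, rows that start with NaN contain no valid element.
lastone_zero = sum(C <= 0, 2)';          % those rows yield 0
```

For threshold values of 1 or more the two versions agree, because the first element of every row always passes the test.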
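hankel pads each row with zeros, and those zeros count as valid elements in T. We therefore cap each row at its actual length, which is the number of elements of X from that starting point onward: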
    lengths = length(X):-1:1;
    real_length = min(lastone,lengths);
    [max_length,max_idx] = max(real_length)

    max_length =
        11

    max_idx =
         5
If several sequences share the maximum length, max returns only the index of the first one, so we simply take that one and display it:
    selected_max_idx = max_idx(1);
    H(selected_max_idx, 1:max_length)

    ans =
         8 10 11 NaN 9 14 6 1 4 23 24
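If you want every sequence of maximum length rather than just the first, one option is to compare real_length against the maximum with find. The sketch below uses a made-up vector X2 that has two equally long candidates; all names ending in 2 and all_starts are hypothetical.

```matlab
X2 = [1 NaN 2 NaN 3];     % hypothetical input with two equally long sequences
threshold = 1;

H2 = hankel(X2);
T2 = cumsum(isnan(H2), 2) <= threshold;
[~, idx2] = sort(T2, 2);
real_length2 = min(idx2(:, end)', numel(X2):-1:1);

max_length2 = max(real_length2);
all_starts = find(real_length2 == max_length2);   % every start that reaches the maximum

for s = all_starts
    disp(H2(s, 1:max_length2))                    % show each tied sequence
end
```

Here the tied sequences start at positions 1 and 3, each with length 3.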
Full script:

    X = [18 3 nan nan 8 10 11 nan 9 14 6 1 4 23 24];
    H = hankel(X);
    C = cumsum(isnan(H),2);
    threshold = 1;
    T = C<=threshold;
    [~,idx]=sort(T,2);
    lastone=idx(:,end)';
    lengths = length(X):-1:1;
    real_length = min(lastone,lengths);
    [max_length,max_idx] = max(real_length);
    selected_max_idx = max_idx(1);
    H(selected_max_idx, 1:max_length)
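Note that hankel builds an n-by-n matrix, so memory grows quadratically with the length of X. For long vectors, a sliding-window loop may be preferable. This is a different technique from the vectorized approach above, sketched here with hypothetical variable names (left, right, nan_count, best_len, best_start):

```matlab
X = [18 3 NaN NaN 8 10 11 NaN 9 14 6 1 4 23 24];
threshold = 1;

isn = isnan(X);
best_len = 0;  best_start = 1;
left = 1;      nan_count = 0;

for right = 1:numel(X)
    nan_count = nan_count + isn(right);   % extend the window to the right
    while nan_count > threshold           % too many NaNs: shrink from the left
        nan_count = nan_count - isn(left);
        left = left + 1;
    end
    if right - left + 1 > best_len        % strict > keeps the first of any ties
        best_len = right - left + 1;
        best_start = left;
    end
end

X(best_start:best_start + best_len - 1)
```

For the example vector this selects the same 11 elements starting at position 5, and it visits each element at most twice.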