The longest non-repeating substring in C ++

I am trying to find the longest substring without duplicate characters. I have a boolean vector to track 256 ascii characters.

#include <iostream> #include <cstdio> #include <string> #include <algorithm> using namespace std; int main() { string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg"; vector<bool> v(256, false); int j = 0, len = 0, index = 0; for(int i = 0; i < s.length(); i++) { if(!v[s[i]]) { j++; if(j > len) { len = j; index = i - j; } v[s[i]] = true; } else { j = 0; v.clear(); } } cout << s.substr(index, len) + " " << len << endl; } 

I can understand why it gives adfrht 6 output if the correct output is sadfrhty 8.

+5
source share
2 answers

The reason you get the wrong result is because the basic approach lacks several moving parts. You do not track all the information necessary for its calculation. You not only need to keep track of which characters you saw, but also in what position in the line in which they were seen (I suppose you want to keep this with O (n) complexity).

Thus, when you see a character that you met before, you need to reset the "consecutive non-repeating characters seen so far" to start after the previous appearance of the same character that you are looking at at the current position. In addition, all the characters that have been seen so far, to this point, are no longer visible, because if you think about it for a second, this should make sense to you.

This is the missing part from your implementation. In addition, it does not track the position of the longest line, absolutely correctly.

Below are the results you were looking for.

Let us know what class you got for your homework :-)

 #include <iostream> #include <cstdio> #include <string> #include <algorithm> #include <vector> using namespace std; int main() { string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg"; vector<bool> v(256,false); vector<int> seen_at(256); int longest_starting_pos=0, longest_length=0; int current_length=0; for (int i=0; i<s.length(); i++) { if (v[s[i]]) { for (int j=i-current_length; j<seen_at[s[i]]; ++j) v[s[j]]=false; current_length=i-seen_at[s[i]]-1; } v[s[i]]=true; seen_at[s[i]]=i; if (++current_length > longest_length) { longest_length=current_length; longest_starting_pos=i-current_length+1; } } cout<<s.substr(longest_starting_pos, longest_length)+" "<<longest_length<<endl; } 
+4
source

Your algorithm is incorrect. What is wrong with your algorithm is that after checking a character, it does not return to that character to check it again if the substring including this character is not the longest. The first s checked in a string of length 2, which is equal to as , but when the next a is found, s lost, although it can make the next substring longer. Try this code:

 #include <iostream> #include <cstdio> #include <string> #include <algorithm> #include <vector> using namespace std; int main() { string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg"; vector<bool> v(256,false); int longStart = 0; int longEnd = 0; int start = 0 for (end = 0; end < s.length(); end++) { if (!v[s[end]]) // if character not already in the substring { v[s[end]] = true; if (end - start > longEnd - longStart) { longEnd = end; longStart = start; } } else //the character is already in the substring, so increment the //start of the substring until that character is no longer in it { //get to the conflicting character, but don't get to the new character while ((s[start] != s[end]) && (start < end)) { start++; v[s[start]] = false; } //remove the conflicting character start++; //don't set v[s[start]] to false because that character is still //encountered, but as the newly appended character, not the first } } longEnd++; //make longEnd the index after the last character for substring purposes cout << s.substr(longStart, longEnd - longStart) + " " << (longEnd - longStart) << endl; } 

Basically what this code does, it supports a running substring, and whenever it encounters a character that is already in a substring, it increments the beginning of the substring until the new character is no longer in the substring, and then continues as usually, He also checks every time the end grows, if this substring is longer than previously considered the longest. This is O (n) I believe that you like.

Also distribute the code. Short code does not mean anything if you cannot read it and easily debug it. In addition, if you have problems with your code, do it all manually to better understand how it all works and what happens.

Hope this helps!

0
source

All Articles