Today at school, the teacher asked us to come up with a duplicate-removal algorithm. This is not very difficult, and everyone arrived at the following solution (pseudocode):
for i from 1 to n - 1
    for j from i + 1 to n
        if v[i] == v[j] then
            remove(v, v[j])   // remove(from, what)
        end if
    next j
next i
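The pseudocode above could be fleshed out like this (a minimal Python sketch, not the author's original code; note that after deleting an element the inner index must not advance, since the next element shifts into its place):

```python
def remove_duplicates_quadratic(v):
    """Naive O(n^2) duplicate removal, keeping first occurrences, in place."""
    i = 0
    while i < len(v):
        j = i + 1
        while j < len(v):
            if v[j] == v[i]:
                del v[j]   # don't advance j: the next element slides into slot j
            else:
                j += 1
        i += 1
    return v
```

The index-management detail is exactly what the `next j` in the pseudocode glosses over: a straightforward `for` loop over a shrinking list would skip elements.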
This algorithm performs n(n-1)/2 comparisons. (We are in high school and have not covered big-O notation, but that seems to be O(n^2).) This solution is ugly and of course slow, so I tried to come up with something faster:
procedure binarySearch(vector, element, *position)
    // searches for element in vector, returning true if found,
    // false otherwise; *position receives the index where the
    // element is (or where it should be inserted)
end procedure

----

vS = new array[n]   // same type as v
for i from 1 to n
    if binarySearch(vS, v[i], &p) = true then
        remove(v, v[i])
    else
        add(vS, v[i], p)   // inserts v[i] at position p of vS
    end if
next i
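This second fragment might look as follows in Python (a sketch, not the author's code, using the standard `bisect` module in place of the hand-written `binarySearch`; one thing worth noticing is that `list.insert` has to shift elements, which costs O(n) per call on a plain array):

```python
import bisect

def remove_duplicates_with_aux(v):
    """Duplicate removal using a sorted auxiliary array and binary search."""
    seen = []   # vS: kept sorted so binary search works
    out = []    # result, preserving the original order of first occurrences
    for x in v:
        p = bisect.bisect_left(seen, x)   # the binarySearch step: O(log n)
        if p < len(seen) and seen[p] == x:
            continue                      # found in vS -> duplicate, drop it
        seen.insert(p, x)                 # the add step: O(n) shift on an array
        out.append(x)
    return out
```

The O(n) shift in `insert` is the hidden cost of using a sorted array as the auxiliary structure; a balanced tree would make that step O(log n) as well.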
This way vS contains all the elements we have already passed. If the element v[i] is in this array, then it is a duplicate and gets deleted. The binary search costs log(n), and the main loop (the second fragment) runs n times, so the total is n*log(n), if I am not mistaken.
Then I had a different idea about using a binary tree, but I have not worked it out yet.
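The tree idea could be sketched like this (again not the author's code; Python has no balanced-tree container in its standard library, so a hash set stands in for the tree here, giving O(1) expected lookup instead of the tree's O(log n)):

```python
def remove_duplicates_set(v):
    """Duplicate removal with a membership structure: O(n) expected with a
    hash set; O(n log n) worst case if a balanced tree were used instead."""
    seen = set()
    out = []
    for x in v:
        if x not in seen:   # membership test replaces the binary search
            seen.add(x)
            out.append(x)
    return out
```

The structure of the loop is identical to the sorted-array version; only the auxiliary container changes, which is what drives the overall complexity.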
My main questions:
- Is my CC calculation correct? (and if not, why?)
- Is there a faster way to do this?
thanks
string algorithm complexity-theory big-o duplicates
Blackbear