Finding the mean and median in constant time

This is a common interview question. You have a stream of numbers coming in (say, over a million). The numbers are in the range [0, 999].

Implement a class which supports three methods in O(1):

* insert(int i);
* getMean();
* getMedian();

This is my code.

    public class FindAverage {
        private int[] store;
        private long size;
        private long total;
        private int highestIndex;
        private int lowestIndex;

        public FindAverage() {
            store = new int[1000];
            size = 0;
            total = 0;
            highestIndex = Integer.MIN_VALUE;
            lowestIndex = Integer.MAX_VALUE;
        }

        public void insert(int item) throws OutOfRangeException {
            if (item < 0 || item > 999) {
                throw new OutOfRangeException();
            }
            store[item]++;
            size++;
            total += item;
            highestIndex = Integer.max(highestIndex, item);
            lowestIndex = Integer.min(lowestIndex, item);
        }

        public float getMean() {
            return (float) total / size;
        }

        public float getMedian() {
        }
    }

I can't seem to think of a way to get the median in O(1) time. Any help is appreciated.

+7
java algorithm data-structures
3 answers

You have already done all the heavy lifting by building the store counters. Together with the size value, the rest is quite simple.

Iterate over store, adding up the counts until you reach half of size. The bucket where that happens is your median value if size is odd. For an even size, take the two middle values and average them.
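The scan described above might look something like the following. This is a minimal sketch, not the asker's class: the name CountingStats is made up for illustration, IllegalArgumentException stands in for the question's OutOfRangeException (which is not shown anywhere), and the counters are widened to long on the assumption that a single value could occur more than Integer.MAX_VALUE times.

```java
// Sketch only: CountingStats and the exception types used here are
// assumptions for illustration, not part of the original question.
final class CountingStats {
    private static final int RANGE = 1000;          // values are 0..999
    private final long[] store = new long[RANGE];   // store[v] = how many times v was inserted
    private long size = 0;
    private long total = 0;

    void insert(int item) {
        if (item < 0 || item >= RANGE) {
            throw new IllegalArgumentException("value out of range: " + item);
        }
        store[item]++;
        size++;
        total += item;
    }

    double getMean() {
        if (size == 0) throw new IllegalStateException("no values inserted");
        return (double) total / size;
    }

    double getMedian() {
        if (size == 0) throw new IllegalStateException("no values inserted");
        long lowRank = (size - 1) / 2;  // 0-based rank of the lower middle element
        long highRank = size / 2;       // 0-based rank of the upper middle element (same when size is odd)
        int low = -1;                   // bucket holding the lower middle element, once found
        long seen = 0;                  // number of elements in buckets 0..v
        for (int v = 0; v < RANGE; v++) {
            seen += store[v];
            if (low < 0 && seen > lowRank) low = v;
            if (seen > highRank) return (low + v) / 2.0;
        }
        throw new AssertionError("unreachable: counts do not add up to size");
    }
}
```

The loop never runs more than 1000 iterations, no matter how many values have been inserted, which is what makes the constant-time argument work.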

The scan touches about 1000/2 buckets on average, which is O(1) since it does not depend on n. In other words, performance does not change even if n reaches billions.

Remember that O(1) does not mean instant or even fast. As Wikipedia says:

An algorithm is said to be constant time (also written as O(1) time) if the value of T(n) is bounded by a value that does not depend on the size of the input.

In your case, this bound is 1000.

+8

The possible values that you can read are very limited: only 1000 of them. Thus, you can come up with something like counting sort: every time a number comes in, you increase the counter for that value.

To get the median in constant time, you will need two numbers: the median index (i.e. the median value) and the number of values read so far that lie to the left (or right) of the median. I will just stop here, hoping that you can figure out how to continue by yourself.

EDIT (as pointed out in the comments): you already have an array with the elements in sorted order (store), and you know the number of elements to the left of the median (size/2). You just need to combine the two. I would like to note that if you use linear additional memory, you will not need to iterate over the entire array on each insert.
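One way to read the hint about tracking the median index and the count to its left is sketched below. It is an interpretation, not the answerer's code: the class name RunningMedian and the fields lowValue, highValue, belowLow and belowHigh are invented for illustration. Two cursors follow the lower and upper middle elements, and each insert nudges them so that getMedian() needs no scan at all.

```java
// Sketch only: RunningMedian and its field names are illustrative
// assumptions, not taken from the answer.
final class RunningMedian {
    private static final int RANGE = 1000;          // values are 0..999
    private final long[] counts = new long[RANGE];  // counts[v] = how many times v was inserted
    private long size = 0;

    private int lowValue = 0;    // value at 0-based rank (size - 1) / 2, once size > 0
    private long belowLow = 0;   // how many stored values are < lowValue
    private int highValue = 0;   // value at 0-based rank size / 2, once size > 0
    private long belowHigh = 0;  // how many stored values are < highValue

    void insert(int x) {
        if (x < 0 || x >= RANGE) {
            throw new IllegalArgumentException("value out of range: " + x);
        }
        counts[x]++;
        size++;
        if (x < lowValue) belowLow++;
        if (x < highValue) belowHigh++;

        // Re-seat the lower cursor on the bucket that contains rank lowRank.
        long lowRank = (size - 1) / 2;
        while (lowRank < belowLow) {
            lowValue--;
            belowLow -= counts[lowValue];
        }
        while (lowRank >= belowLow + counts[lowValue]) {
            belowLow += counts[lowValue];
            lowValue++;
        }

        // Same adjustment for the upper cursor.
        long highRank = size / 2;
        while (highRank < belowHigh) {
            highValue--;
            belowHigh -= counts[highValue];
        }
        while (highRank >= belowHigh + counts[highValue]) {
            belowHigh += counts[highValue];
            highValue++;
        }
    }

    double getMedian() {
        if (size == 0) throw new IllegalStateException("no values inserted");
        return (lowValue + highValue) / 2.0;
    }
}
```

Each insert shifts a cursor by at most one position in sorted order, although that step may cross a run of empty buckets, so the worst case per insert is still bounded by the range (1000) rather than by the number of values seen.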

+2

For the general case, where the range of elements is unbounded, no such data structure can exist in the comparison model, since it would make it possible to sort in O(n).

Proof. Suppose such a data structure exists, and call it D.
Let A be the input array to sort. (Assume A.size() is even for simplicity; this can easily be relaxed by adding a dummy item and discarding it later.)

    sort(A):
      ds = new D()
      for each x in A:
        ds.add(x)
      m1 = min(A) - 1
      m2 = max(A) + 1
      for (i = 0; i < A.size(); i++):
        ds.add(m1)
      # at this point, ds.median() is smallest element in A
      for (i = 0; i < A.size(); i++):
        yield ds.median()
        # Each two insertions advances median by 1
        ds.add(m2)
        ds.add(m2)

Claim 1: This algorithm runs in O(n). Proof. Since add() and median() are constant-time operations, each iteration costs O(1), and the number of iterations is linear, so the overall complexity is linear.

Claim 2: The output is sorted(A).
Proof (sketch): after inserting m1 n times, the median is the smallest element in A. Every two insertions of m2 after that advance the median by one element, and since the median advances in sorted order, the overall output is sorted.

Since the algorithm above sorts in O(n), which is not possible in the comparison model, no such data structure exists.

QED
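To see the reduction at work, the pseudocode can be run against a stand-in for D. The sketch below is only a demonstration of Claim 2: SlowMedian is a hypothetical helper backed by a sorted list, so its add() is not O(1) and the result says nothing about Claim 1. It also assumes that median() returns the upper of the two middle elements when the count is even, which is what the comments in the pseudocode imply, and that min(A) - 1 and max(A) + 1 do not overflow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: MedianSortDemo and SlowMedian are illustrative names.
final class MedianSortDemo {
    /** Stand-in for the hypothetical structure D: correct, but NOT constant time. */
    static final class SlowMedian {
        private final List<Integer> sorted = new ArrayList<>();

        void add(int x) {
            int pos = Collections.binarySearch(sorted, x);
            if (pos < 0) pos = -pos - 1;  // convert "not found" result to insertion point
            sorted.add(pos, x);
        }

        /** Upper middle element, i.e. the element at 0-based rank size / 2. */
        int median() {
            return sorted.get(sorted.size() / 2);
        }
    }

    /** Sorts a using only add() and median(), following the pseudocode above. */
    static List<Integer> sort(List<Integer> a) {
        List<Integer> out = new ArrayList<>();
        if (a.isEmpty()) return out;
        SlowMedian ds = new SlowMedian();
        for (int x : a) ds.add(x);
        int m1 = Collections.min(a) - 1;  // smaller than everything in a
        int m2 = Collections.max(a) + 1;  // larger than everything in a
        for (int i = 0; i < a.size(); i++) ds.add(m1);
        // The median now sits on the smallest element of a.
        for (int i = 0; i < a.size(); i++) {
            out.add(ds.median());
            ds.add(m2);  // two insertions above the median
            ds.add(m2);  // advance it by exactly one element
        }
        return out;
    }
}
```

For example, calling MedianSortDemo.sort on the list 3, 1, 2 pads the structure with three copies of 0, then reads off 1, 2, 3 as copies of 4 are added in pairs.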

+1