If your data is not sorted, then you have no choice but to check every point, since you cannot know whether some unchecked point has a y greater than everything found so far and also has x > x_min. In short: you cannot know whether another item should be included unless you check all of them.
Given that, I would argue that a sublinear-time check is impossible for your query, since you have to look at every point. The best you can do to find them all is linear.
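As a minimal sketch of that unsorted case (Python; the names points, x_min and n are mine, and each point is assumed to be an (x, y) tuple), the key property is simply that every element gets touched:

```python
def top_n_unsorted(points, x_min, n):
    """Hypothetical helper: n highest-y points with x > x_min from unsorted data."""
    # Every point must be inspected: any unchecked point could still qualify.
    candidates = [p for p in points if p[0] > x_min]
    # Sort the survivors by y (descending) and keep the n best.
    return sorted(candidates, key=lambda p: p[1], reverse=True)[:n]
```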
If your data is sorted (by y), then your best case is constant time (the n points with the highest y all satisfy x > x_min), and your worst case is linear (the only qualifying points are the ones with the smallest y). The average case will be closer to constant, I think, if your x values and x_min are roughly random within some range.
If you want this to scale (i.e. n could be large), you will also want to keep your result set in a structure you can update, since each new candidate has to be checked against it, and the current lowest entry has to be evicted when an insert pushes the size past n. With a tree, those operations take logarithmic time.
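Here is a sketch of that bookkeeping (Python, reusing the hypothetical points/x_min/n from above). A min-heap stands in for the tree, since it also gives logarithmic insert and eviction of the current lowest value:

```python
import heapq

def top_n_streaming(points, x_min, n):
    """Maintain a bounded result set while scanning unsorted points.

    `heap` is a min-heap keyed on y, so heap[0] is always the lowest
    qualifying point kept so far; push/replace cost O(log n) each.
    """
    heap = []  # holds at most n (y, x) pairs
    for x, y in points:
        if x <= x_min:
            continue  # fails the filter, skip it
        if len(heap) < n:
            heapq.heappush(heap, (y, x))
        elif y > heap[0][0]:
            # New point beats the current lowest: evict it and insert, O(log n).
            heapq.heapreplace(heap, (y, x))
    # Return results highest-y first, back in (x, y) form.
    return [(x, y) for y, x in sorted(heap, reverse=True)]
```

Each of the scanned points costs at most one logarithmic heap operation, which is where the log factor in the totals below comes from.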
So, putting it all together, the worst case is unsorted points, where you are looking at n log(n) time. Sorted points are better; there you are looking at an average case of log(n) time (again, assuming roughly randomly distributed values for x and x_min), which is sublinear.
If it is not immediately clear why the sorted case gives a constant-time search, I will run through it quickly here.
If the n points with the highest y values all have x > x_min (the best case), then you are simply grabbing exactly what you need off the top, so that case is obvious.
For the average case, assuming a roughly random distribution of x and x_min, the probability that x > x_min is essentially one half. For any two random numbers a and b, a > b is as likely as b > a. The same holds for x and x_min: x > x_min is as likely as x_min > x, which means probability 0.5. So on average every second point you check will meet your requirement x > x_min, which means on average you will check about 2n points to find the n highest points matching your criterion. Thus, if the best case took time c, the average case takes about 2c, which is still constant.
Note, however, that for values of n approaching the size of the whole set, this hides the fact that you are walking the entire set, which brings you essentially back to linear time. So my claim of constant time does not hold if you allow arbitrary n within the range of your set's size.
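Concretely, the sorted-case scan described above is just this (Python sketch; it assumes the input is already sorted by y, highest first, and the names are again hypothetical):

```python
def top_n_sorted(points_desc_by_y, x_min, n):
    """Scan points already sorted by y in descending order.

    Walks from the highest y downward and stops as soon as n points with
    x > x_min have been collected; with x vs. x_min roughly random, that
    is about 2n points examined on average.
    """
    result = []
    for x, y in points_desc_by_y:
        if x > x_min:
            result.append((x, y))
            if len(result) == n:
                break  # done: no lower-y point can improve the answer
    return result
```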
If this is not a purely academic question but comes from some real need, then the right approach depends on the situation.
(edit) I just realized that my claims about constant time assume a data structure that gives you direct access to the highest value and lets you walk successively down to lower values. If the data structure you are given does not match that description, then obviously this does not apply.