Sequential search versus sorting followed by binary search

Question

This is a direct quote from the textbook Invitation to Computer Science by G. Michael Schneider and Judith L. Gersting.

At the end of Section 3.4.2, we talked about the trade-off between using sequential search on an unsorted list and sorting the list and then using binary search. If the list size is n = 100,000, how many worst-case searches must be done before the second alternative is better in terms of the number of comparisons?

I really don’t understand what this is about.

Sequential search has order (n) and binary search has order (lg n), so lg n will always be less than n. And in this case n is already given, so what am I supposed to find?

This is one of my homework problems, but I don’t know what to do. Can someone explain the question to me in plain English?

sorting arrays algorithm time-complexity binary-search

Saitsiri sahi Oct 20 '10 at 10:43

5 answers

Nikita Rybak · Answer 1 · 2010-10-20T10:46:51+0000

"and binary search has order (lg n), which in any case lg n will always be less than n." This is where you are mistaken. The assignment also asks you to take the cost of sorting the array into account.

Obviously, if you need only one search, the first approach is better than sorting the array and doing a binary search: n < n*logn + logn . You are asked how many searches you need before the second approach becomes more efficient.

End of hint.

sharptooth · Answer 2 · 2010-10-20T10:46:13+0000

The question is how to decide which approach to choose: just use linear search, or sort first and then use binary search.

If you only search a couple of times, linear search is better: each search is O(n), while sorting alone is O(n * logn). If you search the same collection often, sorting first is better: repeated linear searches can add up to O(n * n), whereas sorting and then using binary search costs O(n * logn) + NumberOfSearches * O(logn), which may be less or more than the cost of linear search, depending on how NumberOfSearches and n are related.

The task is to determine the value of NumberOfSearches (not an exact number, but a function of n) at which one of the options becomes preferable:

  NumberOfSearches * O(n) <> O(n*logn) + NumberOfSearches * O(logn)

Don't forget that each O() can hide a different constant factor.
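
To see how much those constants matter, here is a minimal sketch that solves the inequality above for NumberOfSearches. The function name and the constants c_linear, c_sort and c_binary are hypothetical stand-ins for the factors hidden inside each O(); they are not part of the textbook question, and the sort is assumed to cost c_sort * n * lg(n) comparisons.

```python
import math


def break_even_searches(n, c_linear=1.0, c_sort=1.0, c_binary=1.0):
    """Smallest whole number of searches k for which
        k * c_linear * n  >  c_sort * n * lg(n) + k * c_binary * lg(n).
    Returns None when a linear search is not more expensive than a
    binary search, because sorting can then never pay off.
    """
    lg = math.log2(n)
    # Comparisons saved on every single search once the list is sorted.
    saving_per_search = c_linear * n - c_binary * lg
    if saving_per_search <= 0:
        return None
    # k must be strictly greater than this threshold.
    threshold = c_sort * n * lg / saving_per_search
    return math.floor(threshold) + 1


# Same list size, different (assumed) constant for the sort.
print(break_even_searches(100_000))
print(break_even_searches(100_000, c_sort=10.0))
```

A sort that is ten times more expensive pushes the break-even point up by roughly the same factor, which is why the inequality cannot be settled from the O() terms alone.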

Ishtar · Answer 3 · 2010-10-20T11:58:34+0000

The order of the methods is not what matters here. It tells you how well the algorithms scale as the problem gets bigger and bigger. You cannot do any exact calculation if you only know that an algorithm is O(n), that is, that its cost grows linearly with the size of the problem. That does not give you any numbers.

It may well be that an algorithm with O(n) complexity is faster than an O(logn) algorithm for some n. Since O(logn) scales better as n grows, we know for sure that there is some n (problem size) beyond which the O(logn) algorithm is faster. We just don’t know when (for what n).

In plain English:

If you want to know how many searches, you need exact equations to solve, and for that you need exact numbers. How many comparisons does a sequential search require? (Remember that n is given, so you can state a number.) How many comparisons (in the worst case!) does a binary search require? Before you can do a binary search, you have to sort. Add the number of comparisons needed for sorting to the cost of the binary searches. Now compare the two numbers: which is smaller?

Binary searches are fast, but sorting is slow. Sequential searches are slower than binary searches, but faster than sorting. However, the sort only has to be done once, no matter how many times you search. So, when does one expensive sort outweigh having to do a slow (sequential) search every time?
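
One way to get those exact numbers is simply to count. The sketch below is an illustration with hypothetical helper names: it counts one comparison per element examined in a sequential search and one per probe in a binary search (treating the three-way comparison at each probe as a single comparison, which is an assumption about how the textbook counts).

```python
def sequential_search_comparisons(items, target):
    """Count the elements examined by a left-to-right sequential search."""
    comparisons = 0
    for value in items:
        comparisons += 1
        if value == target:
            break
    return comparisons


def binary_search_comparisons(sorted_items, target):
    """Count the probes made by an iterative binary search."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes


n = 100_000
data = list(range(n))   # already sorted, so both searches can use it
missing = n             # larger than every element, so neither search finds it

print(sequential_search_comparisons(data, missing))
print(binary_search_comparisons(data, missing))
```

Searching for a value that is not in the list forces the sequential search to look at every element and the binary search to keep halving until the range is empty.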

Good luck

ruslik · Answer 4 · 2010-10-20T10:52:12+0000

The question is to estimate the number NUM_SEARCHES needed to offset the cost of sorting. So, we will have:

  time( NUM_SEARCHES * O(n) ) > time( NUM_SEARCHES * O(log(n)) + O(n* log(n)) )
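
As a sketch of where this leads, the inequality can be solved for NUM_SEARCHES (written k below) under the simplifying assumption that every hidden constant is 1 and that the sort takes n lg n comparisons. Neither assumption comes from the textbook, so the result is only an order-of-magnitude guide.

```latex
\begin{align*}
k \cdot n &> k \cdot \lg n + n \lg n \\
k \,(n - \lg n) &> n \lg n \\
k &> \frac{n \lg n}{n - \lg n}
\end{align*}
```

For large n the right-hand side is only slightly larger than lg n, so with an O(n log n) sort the break-even point is on the order of lg n searches. A slower sort, such as an O(n * n) one, moves it much higher.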

user482092 · Answer 5 · 2010-10-20T18:38:48+0000

Thanks guys. I think I understand now. Could you take a look at my answer and see if I am on the right track?

In the worst case, the number of comparisons for a sequential search is n = 100,000. The number of comparisons for a binary search is lg(n) ≈ 17. The number of comparisons to sort is (n-1)/2 * n = (99,999)(50,000). (I followed my textbook and used the selection sort algorithm that was covered in my class.)

So, let p be the number of worst-case searches; then 100,000p > (99,999)(50,000) + 17p,
or p > 50,008.

In conclusion, I need more than 50,008 worst-case searches (that is, at least 50,009) before sorting and using binary search is better than sequential search on a list of n = 100,000.
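
The arithmetic above can be checked with a few lines of integer math. This is only a sketch under the same assumptions as the answer: selection sort costs n(n-1)/2 comparisons, a worst-case binary search costs 17, and the function names are made up for illustration.

```python
def selection_sort_comparisons(n):
    """Comparisons made by selection sort on n items: n(n-1)/2."""
    return n * (n - 1) // 2


def smallest_break_even(n, binary_worst):
    """Smallest whole p with  n*p > sort_cost + binary_worst*p."""
    sort_cost = selection_sort_comparisons(n)
    # n*p > sort_cost + binary_worst*p  <=>  p > sort_cost / (n - binary_worst)
    # Integer division plus one gives the first whole p above that bound.
    return sort_cost // (n - binary_worst) + 1


n = 100_000
binary_worst = 17  # worst-case comparisons per binary search, as used above
p = smallest_break_even(n, binary_worst)
sort_cost = selection_sort_comparisons(n)

print(p)
print(n * p, sort_cost + binary_worst * p)              # sequential vs sort + binary at p
print(n * (p - 1), sort_cost + binary_worst * (p - 1))  # the same totals one search earlier
```

The bound works out to just over 50,008, so the first whole number of searches at which sorting wins is 50,009.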
