Sort versus linear search for finding min / max

I recently stumbled upon the following piece of Perl code, which returns the minimum numeric value among all the arguments passed:

return 0 + ( sort { $a <=> $b } grep { $_ == $_ } @_ )[0]; 

I usually find the min / max of a list with a simple linear search, which seems simple enough and adequately efficient to me. Is the code above somehow preferable to a plain linear search? Is there anything Perl-specific at play here? Thanks!
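For comparison, a single-pass linear-search version of the same idea might look like the sketch below (my own illustration, not from the original post; the `grep { $_ == $_ }` mirrors the one-liner's trick of discarding NaN, the only value not equal to itself):

```perl
use strict;
use warnings;

# Linear-search minimum: one pass over the arguments, no sorting.
# Like the sort-based one-liner, it skips NaN values, since
# NaN is the only value for which $_ == $_ is false.
sub min_linear {
    my $min;
    for ( grep { $_ == $_ } @_ ) {
        $min = $_ if !defined $min || $_ < $min;
    }
    return $min;
}
```

Called as `min_linear(3, 1, 2)`, this returns 1 after a single pass, where the sort-based version does O(n log n) comparisons to get the same answer.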

+4
3 answers

O() notation says nothing about how long an algorithm actually takes to run. For example, all other things being equal, I would always choose Algorithm 2 of the following two:

  • Algorithm 1: takes 2*N + 1000 days, which is O(N)
  • Algorithm 2: takes 5*N*log(N) + 100 ms, which is O(N log N)
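To make the point concrete, here is a quick sketch (my own illustration, using the made-up constants from the list above, with all times in seconds) comparing the two cost formulas:

```perl
use strict;
use warnings;

# Cost formulas from the example above (times in seconds).
# Algorithm 1: 2*N plus 1000 days of fixed overhead.
# Algorithm 2: 5*N*log(N) plus 100 ms of fixed overhead.
sub cost1 { my $n = shift; 2 * $n + 1000 * 24 * 3600 }
sub cost2 { my $n = shift; 5 * $n * log($n) + 0.1 }

# For small and moderate N, the O(N log N) algorithm wins
# despite its worse O() class, because its constants are tiny.
for my $n (10, 1_000, 1_000_000) {
    printf "N=%-9d  alg1=%.1fs  alg2=%.1fs\n",
        $n, cost1($n), cost2($n);
}
```

Only at very large N does the asymptotic class dominate the constants, which is exactly why O() alone cannot tell you which algorithm is faster for your input size.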

O() indicates how the time an algorithm takes scales as the input size increases. (Well, it can be used for any resource, not just time.) Since the other two answers only talk about O() terms, they cannot tell you which approach is actually faster.

If you want to know which algorithm is faster for inputs of a given size, you will need to benchmark them.

In this case, it looks like List::Util's min is always significantly faster.

 $ perl x.pl 10
            Rate  sort LUmin
 sort  1438165/s    --  -72%
 LUmin 5210584/s  262%    --

 $ perl x.pl 100
            Rate  sort LUmin
 sort   129073/s    --  -91%
 LUmin 1485473/s 1051%    --

 $ perl x.pl 1000
           Rate  sort LUmin
 sort    6382/s    --  -97%
 LUmin 199698/s  3029%   --

The code:

 use strict;
 use warnings;

 use Benchmark qw( cmpthese );
 use List::Util qw( min );

 my %tests = (
     'sort'  => 'my $x = ( sort { $a <=> $b } @n )[0];',
     'LUmin' => 'my $x = min @n;',
 );

 $_ = 'use strict; use warnings; our @n; ' . $_
     for values %tests;

 local our @n = map rand, 1..( $ARGV[0] // 10 );

 cmpthese(-3, \%tests);
+4

You're right. If you don't need the sorted data for any other purpose, a simple linear search is fastest. After all, just to do its job, a sort must examine every data point at least once.

Only if the sorted data would be useful for other purposes, or if I did not care about run time, energy consumption, heat dissipation, and so on, would I sort data to find the minimum and maximum values.

Now, @SimeonVisser is correct: sorting is O(n*log(n)). That is not as much slower than O(n) as many programmers imagine. In cases of practical interest, the overhead of managing a balanced binary tree (or other such structure) probably matters about as much as the log(n) factor does. So there is no need to recoil in horror from the prospect of sorting! Still, the linear search is faster: you are quite right about that.

Additionally, @DavidO adds an insightful comment, which I paraphrase here:

Linear search is also a simpler algorithm to generalize. For example, linear search can easily (and relatively efficiently) be made disk-based for large data sets, whereas sorting on disk becomes comparatively expensive, and even more so if field sizes are not normalized.

+2

Linear search is O(n) for obvious reasons. Sorting is O(n log n) (see sort in the Perl documentation). So yes, linear search really is faster in terms of complexity. This applies not only to Perl but to any programming language that implements these algorithms.

As with many problems, there is more than one way to solve this, and there is more than one way to get the min / max of a list. Conceptually, I would say linear search is better when you only need the min or the max, since the problem does not require sorting.
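In Perl, the idiomatic way to get either value without sorting is List::Util, whose min and max each make a single linear pass (a brief sketch with example data of my own):

```perl
use strict;
use warnings;

use List::Util qw( min max );

my @n = (5, 3, 9, 1, 7);

my $lo = min @n;    # single linear pass, no sort
my $hi = max @n;    # likewise
```

Since List::Util ships with core Perl and its functions are implemented in C, this is usually both the clearest and the fastest option.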

0

Source: https://habr.com/ru/post/1416654/

