Is it possible to write a shell script that is faster than the equivalent script in Perl?

I wrote some scripts in Perl and in shell, and I compared their real run times. In every case, the Perl script was more than 10 times faster than the shell script.

So I was wondering: is it possible to write a shell script that is faster than the same script in Perl? And why is Perl faster than the shell, even though I use the system function in the Perl script?

+6
performance benchmarking shell perl
6 answers

There are several ways to make your shell scripts (e.g. Bash) faster.

  • Try using fewer external commands when Bash builtins can do the job for you; in particular, avoid excessive use of sed, grep, awk, etc. for line/text manipulation (a sketch of this appears after the benchmark below).
  • If you are working with relatively large files, do not process them with a Bash while-read loop; use awk. If you are dealing with really BIG files, use grep to search for the desired patterns first, then pipe the matches to awk for the "editing". grep's search algorithm is very good and fast. If you only need the front or the end of a file, use head and tail.
  • File-processing tools such as sed, cut, grep, wc, etc. can all be replaced by a single awk script, or by Bash builtins when the task is simple. So try to reduce the use of these tools where their functions overlap. Unix pipes/chains are excellent, but chaining too many of them, as in command|grep|grep|cut|sed, makes your code slow: each pipe adds overhead. In that example, a single awk can do all of it: command | awk '{do everything here}' (see the sketch at the end of this answer). The closest tool that can match Perl's speed for specific tasks, like string manipulation or math, is awk. Here is a fun test of this claim; the file contains about 9 million numbers.

Output

    $ head -5 file
    1
    2
    3
    34
    42

    $ wc -l < file
    8999987

    $ time perl -nle '$sum += $_ } END { print $sum' file
    290980117

    real    0m13.532s
    user    0m11.454s
    sys     0m0.624s

    $ time awk '{ sum += $1 } END { print sum }' file
    290980117

    real    0m9.271s
    user    0m7.754s
    sys     0m0.415s

    $ time perl -nle '$sum += $_ } END { print $sum' file
    290980117

    real    0m13.158s
    user    0m11.537s
    sys     0m0.586s

    $ time awk '{ sum += $1 } END { print sum }' file
    290980117

    real    0m9.028s
    user    0m7.627s
    sys     0m0.414s

On every run, awk is faster than Perl.

Finally, try learning what awk can do beyond the one-liner.
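
Here is the sketch promised under the first bullet, a minimal, hedged illustration (the path is made up): replacing external basename/dirname/sed calls with Bash parameter expansion, which avoids a fork/exec per operation:

    path=/var/log/app/server.log

    # External commands: each command substitution forks at least one process.
    base=$(basename "$path")
    dir=$(dirname "$path")
    noext=$(echo "$base" | sed 's/\.log$//')

    # Bash builtins: the same results with no extra processes.
    base=${path##*/}      # -> server.log
    dir=${path%/*}        # -> /var/log/app
    noext=${base%.log}    # -> server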
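
And here is the sketch for the pipe-chaining point, under an assumed input (app.log and its field layout are invented for illustration): a grep | cut | sort | uniq chain collapsed into one awk program.

    # Chained version: four processes and three pipes.
    grep 'ERROR' app.log | cut -d' ' -f1 | sort | uniq -c

    # Single awk process: reads the file once and does the counting itself
    # (output order may differ from the sorted version).
    awk '/ERROR/ { count[$1]++ }
         END { for (d in count) print count[d], d }' app.log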

+6

This may fall dangerously close to armchair optimization, but here are some ideas that might explain your results:

  • Fork/exec: almost anything useful that a shell script does is done by shelling out, that is, spawning a new process to run a command such as sed , awk , cat , etc. More often than not, more than one process is spawned, and the data is moved through pipes (see the timing sketch after this list).

  • Data structures: Perl's data structures are more sophisticated than Bash's or Csh's. This typically forces the programmer to get creative about data storage, which can take the following forms:

    • using suboptimal data structures (arrays instead of hashes);
    • storing data in textual form (for example, integers as strings) that must be re-interpreted every time;
    • saving data to a file and re-parsing it again and again;
    • and so on.
  • Unoptimized implementations: some shell constructs may be designed not for speed but for user convenience. For example, I have reason to believe that the Bash implementation of parameter expansion, in particular ${foo//search/replace} , is suboptimal compared to the same operation in sed (a comparison sketch also follows below). This is typically not a problem for everyday tasks.
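
To make the fork/exec point concrete, here is a minimal timing sketch (the numbers will vary by machine, so treat it as an experiment to run, not a result): the first loop forks and execs the external expr once per iteration, the second uses Bash's builtin arithmetic.

    # Each iteration forks and execs /usr/bin/expr:
    time for ((i = 0; i < 1000; i++)); do
        n=$(expr "$i" + 1)
    done

    # Builtin arithmetic: no extra processes at all.
    time for ((i = 0; i < 1000; i++)); do
        n=$((i + 1))
    done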
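
And here is a small sketch one could use to test the parameter-expansion claim (the file name and its contents are arbitrary): both halves perform the same global replacement, one in-process in Bash, one in sed.

    # Read a test file into a variable.
    foo=$(< big.txt)

    # Global replace with Bash parameter expansion, in-process:
    time result=${foo//error/ERROR}

    # The same replacement in sed: one extra process, but a tuned C loop.
    time sed 's/error/ERROR/g' big.txt > /dev/null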

+4

Well, I know I am asking for it by reopening a can of worms that was closed two years ago, but I am not 100% happy with any of the answers.

The correct answer is YES. But most new coders will still flock to Perl and Python and write code that does little more than WRAP CALLS TO EXTERNAL EXECUTABLES, because they lack the mentoring or experience needed to know when to use which tools.

Korn Shell (ksh) has fast builtin math, and a fully capable and speedy regular-expression engine that (gasp!) can handle Perl-style regular expressions. It also has associative arrays. It can even load external .so libraries. And it was a finished and mature product ten years ago. It is even already installed on your Mac.
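
A minimal ksh93 sketch of those features (the data here is invented): an associative array, builtin floating-point math, and an ERE pattern match, all without forking a single external process.

    #!/usr/bin/env ksh
    typeset -A price                 # associative array (builtin)
    price[apple]=0.50
    price[pear]=0.75

    typeset -F2 total                # builtin floating-point variable
    (( total = ${price[apple]} * 3 + ${price[pear]} * 2 ))
    print "total: $total"            # total: 3.00

    id='abc-12345'
    if [[ $id == ~(E)[a-z]+-([0-9]+) ]]; then   # POSIX ERE, matched in-shell
        print "numeric part: ${.sh.match[1]}"   # 12345
    fi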

+2

No, I do not think this is possible:
Bash is a genuinely interpreted language, whereas a Perl program is compiled to bytecode before execution.

+1

Some shell commands can run faster than Perl in some situations. I once compared a simple sed script with its equivalent in Perl, and sed won. But when the requirements became more complex, the Perl version started to beat the sed version. So the answer is: it depends. But for other reasons (simplicity, maintainability, etc.) I would be inclined to do things in Perl anyway, unless the requirements are very simple and I expect them to stay that way. A trivial example of the kind of comparison I mean is sketched below.
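
(A hedged illustration; the file and pattern are made up.) For a simple one-shot substitution, sed is terse and quick, and the Perl equivalent does the same edit with more headroom for growing requirements:

    # Simple, stable requirement: sed wins on brevity.
    sed 's/ERROR/error/g' app.log

    # The Perl equivalent of the same edit:
    perl -pe 's/ERROR/error/g' app.log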

+1

Yes. C code is faster than Perl code for the same task, so a script that uses a compiled executable to do the heavy lifting will be faster than a Perl program that does the same work itself.

Of course, the Perl program could be rewritten to call that same executable, in which case it would probably be faster again. A small illustration follows.
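
(A hedged sketch; the file and pattern are invented.) Counting matching lines with a compiled tool versus in pure Perl; the grep version typically wins on large inputs because the scanning loop is compiled C:

    # A compiled executable (grep) does the scanning in C:
    grep -c 'ERROR' big.log

    # The same count in pure Perl, with the loop in the interpreter:
    perl -ne '$c++ if /ERROR/; END { print $c + 0, "\n" }' big.log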

-2
