Why Are Perl Scripts Super Fast? Benchmark Perl Scripts

Perl has a reputation for running very fast when it comes to regular expressions and text processing. Programmers often argue over which language is fastest, better, or richer in features, but any such claim needs evidence to back it up. So rather than take Perl’s speed on faith, let’s measure it.
We can use Perl’s Benchmark module, which lets us test the speed of a Perl script.

Calculating differences in script execution time
The simplest way to measure speed is to record the start time (when the script starts) and the end time (when the script finishes) and take the difference between the two values. That difference is the script’s execution time. In Perl, these time values are obtained with the built-in time() function:
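A minimal sketch of this approach is shown here. The summing loop is only a placeholder workload chosen for illustration, not the code from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Record the start time (whole seconds since the epoch).
my $start = time();

# Placeholder workload (an assumption made for this sketch).
my $total = 0;
$total += sqrt($_) for 1 .. 1_000_000;

# Record the end time and report the difference.
my $end     = time();
my $elapsed = $end - $start;
print "Time taken was $elapsed seconds\n";
```

Note that time() counts in whole seconds, so a short run like this one may well report 0.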

While this is fine for basic use, it becomes complicated if what you really want is to compare the times of different scripts, or run arbitrary pieces of code for fixed time intervals. For these uses, the Benchmark module is more appropriate. This module comes bundled with Perl, and can be imported into your Perl script through the “use” command. Take a look at the next example, which rewrites the previous one to use Benchmark instead of time().
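Something along these lines would do the job. As before, the workload is a placeholder assumed for illustration. The 'all' style passed to timestr() is also an assumption, picked because it prints the child-process columns (cusr and csys).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

# Benchmark->new captures the current time.
my $start = Benchmark->new;

# Placeholder workload (an assumption made for this sketch).
my $total = 0;
$total += sqrt($_) for 1 .. 1_000_000;

my $end = Benchmark->new;

# timediff() returns the difference as another Benchmark object;
# timestr() turns it into a readable string.
my $diff = timediff($end, $start);
print "Time taken was ", timestr($diff, 'all'), " seconds\n";
```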

Each call to new() returns a Benchmark object that records the current time. The difference between the start and end times is calculated with the Benchmark module’s timediff() function, and the result is formatted for display with the timestr() function. Here’s the sample output of the script above:

Time taken was 2 wallclock secs ( 2.14 usr 0.00 sys + 0.00 cusr 0.00 csys = 2.14 CPU) seconds

As you can see, Benchmark returns a little more detail than the time() function.

Timing multiple runs of a script

Of course, a sample size of one is not necessarily representative of how fast your script is, especially on Web servers that are subject to varying loads. Therefore, what you really need is a way to run this script many times, and calculate the average time taken after compiling the data from each run. Luckily, Benchmark comes with a function to do this too. It’s called timethis(), and it’s demonstrated in the following example:
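Here is a rough sketch of such a call. The code being timed is a stand-in chosen for illustration, and it is passed as a string, although timethis() also accepts a code reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

# Run the snippet 100,000 times. The snippet itself is a placeholder
# (an assumption made for this sketch); it is passed as a string,
# which Benchmark evaluates internally.
my $t = timethis(100_000, 'my $total = 0; $total += sqrt($_) for 1 .. 100;');
```

timethis() prints its report on its own; the Benchmark object it returns is kept in $t here only so the result can be inspected afterwards.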

The timethis() function takes two required arguments: the number of times to run the code, and the code itself. The code can be passed either as a string in a format suitable for eval() or as a code reference.

Once the benchmark is complete, timethis() displays a report like this:

timethis 100000: 210 wallclock secs (209.37 usr + 0.00 sys = 209.37 CPU) @ 477.62/s (n=100000)

There are two pieces of useful data here: the number of CPU seconds, which tells you how long Perl takes to run the code N times, and the per-second data, which tells you how many runs take place per second. Obviously, the higher the second value, the faster your code is. Instead of a fixed number of iterations, now let’s see how to have timethis() run the code for a fixed period of time.

Counting how often a script runs in a predefined time window

Instead of timing how long a piece of code takes to execute a fixed number of iterations, you can flip things around and have timethis() run the code for a fixed period of time to see how many iterations it completes in that time. You do this by using a negative value as the first argument. Consider the following example, which makes timethis() run the code for a minimum of 10 seconds:
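A hedged sketch of the idea follows. To keep the run short it uses a 1-second window, which is an assumption of this sketch; pass -10 instead to get the 10-second run described above. The timed snippet is again a placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

# A negative count means "run for at least this many CPU seconds".
# -1 keeps this sketch quick; use -10 for a 10-second window.
my $t = timethis(-1, sub {
    # Placeholder workload (an assumption made for this sketch).
    my $total = 0;
    $total += sqrt($_) for 1 .. 100;
});

print "Completed ", $t->iters, " iterations\n";
```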

The output will look something like this:

timethis for 10: 11 wallclock secs (10.93 usr + 0.00 sys = 10.93 CPU) @ 700.82/s (n=7660)

So in 11 seconds (well, 10.93 if you want to be difficult), Perl was able to execute the code 7660 times, or approximately 700 times per second. You can even create an interactive benchmarking tool with timethis(), by having the user enter the code and the number of iterations at the prompt:
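One possible shape for such a tool is sketched below. The helper name read_job() and the terminal check are assumptions of this sketch and not part of the Benchmark module. Because the tool runs whatever is typed in, it is suitable for local experiments only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

# Hypothetical helper: read an iteration count and a block of code
# from the given filehandle. The code block ends at the marker END.
sub read_job {
    my ($in) = @_;

    print "Enter number of iterations:\n";
    my $count = <$in>;
    die "No input received\n" unless defined $count;
    chomp $count;

    print "Enter your Perl code (end with END):\n";
    local $/ = "END";    # read up to END instead of up to a newline
    my $code = <$in>;
    die "No code received\n" unless defined $code;
    chomp $code;         # chomp strips the current separator, i.e. END

    return ($count, $code);
}

# Prompt only when attached to a terminal.
if (-t STDIN) {
    my ($count, $code) = read_job(\*STDIN);
    print "Processing...\n";
    timethis($count, $code);    # string form: evaluated by Benchmark
}
```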

Most of this is pretty simple, and should be clear to you if you understood the previous examples. The only item of note here is that Perl’s input record separator ($/) is changed to the string END, so that the user can enter multi-line code blocks and terminate them with END (the default separator is a newline, which would end the input as soon as the user pressed [Enter]).

Here’s an example of this script in action (lines beginning with a ‘>’ indicate output from the program, the rest are lines input by the user):

> Enter number of iterations:
500
> Enter your Perl code (end with END):
for ($a=1; $a<1001; $a++) { $value = $a ** 10; }
END
> Processing…
> timethis 500: 6 wallclock secs ( 5.72 usr + 0.00 sys = 5.72 CPU) @ 87.41/s (n=500)

Timing and comparing different techniques

If you’re the kind of Perl programmer who likes experimenting with different ways of accomplishing the same thing, you’re going to just love the next tool in Benchmark’s arsenal. The timethese() function allows you to time more than one code fragment at a time:
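A sketch of such a comparison might look like this. The loop bodies are reconstructed for illustration and are assumptions, not the original listing. The snippets are passed as code references here, though strings work too.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

# Time three ways of computing the sine of 5,000 numbers,
# running each snippet 1000 times.
my $results = timethese(1000, {
    huey => sub {                       # while() loop
        my $i = 0;
        while ($i < 5000) { my $s = sin($i); $i++; }
    },
    dewey => sub {                      # C-style for() loop
        for (my $i = 0; $i < 5000; $i++) { my $s = sin($i); }
    },
    louie => sub {                      # foreach() loop over a range
        foreach my $i (0 .. 4999) { my $s = sin($i); }
    },
});
```

timethese() prints one report line per snippet and returns a hash reference of Benchmark objects keyed by snippet name.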

This example tries to calculate the sine of 5,000 numbers, using three different approaches. The first, named “huey”, uses a while() loop; “dewey” uses a for() loop; and “louie” uses a foreach() loop. Each of these code snippets is placed inside a single call to the timethese() function, which accepts two arguments: the number of iterations and a reference to a hash whose keys are unique names for the code fragments and whose values are the code snippets to be tested. The timethese() function then internally calls timethis() for each hash element and reports the time taken for each option. Here’s a sample of the output:

Benchmark: timing 1000 iterations of dewey, huey, louie…
dewey: 92 wallclock secs (91.72 usr +  0.00 sys = 91.72 CPU) @ 10.90/s (n=1000)
huey: 160 wallclock secs (159.56 usr +  0.00 sys = 159.56 CPU) @ 6.27/s (n=1000)
louie: 45 wallclock secs (44.98 usr +  0.00 sys = 44.98 CPU) @ 22.23/s (n=1000)

It is clear from the output that the foreach() loop is the most efficient of the three alternatives, at least for this particular scenario. Another way to run this test is with the cmpthese() function, which internally calls timethese(), and accepts the same arguments as timethese(). The main advantage is that it formats the result better for comparison purposes:
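The same comparison, sketched with cmpthese(), could look like this. The loop bodies are the same illustrative assumptions as before, not the original listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(:all);    # cmpthese() is not exported by default

# Same three snippets, same iteration count; only the report differs.
my $rows = cmpthese(1000, {
    huey => sub {
        my $i = 0;
        while ($i < 5000) { my $s = sin($i); $i++; }
    },
    dewey => sub {
        for (my $i = 0; $i < 5000; $i++) { my $s = sin($i); }
    },
    louie => sub {
        foreach my $i (0 .. 4999) { my $s = sin($i); }
    },
});
```

cmpthese() prints the comparison table itself; the array reference it returns holds the same rows, in case the figures need further processing.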

Note the use of “use Benchmark qw(:all)” instead of just “use Benchmark.” The cmpthese() function is not exported by default, so it has to be imported explicitly, and the :all tag imports every function the module offers. The output of cmpthese() is a table which compares the speed of each option against the speed of its competition. Since this table contains summary percentage values, it is somewhat easier to understand than the output of timethese():

        Rate louie  huey dewey
louie 14.1/s    --  -50%  -54%
huey  28.5/s  102%    --   -8%
