PHP, Perl & system calls

recently handgestrickt tried to pull index information out of our iTunes library XML file, which is 30 MB in size. we did not want to use an XML parser, because on a file this large it tends to disappear into nirvana, and you can brew a pot of tea before it is finished. out of curiosity, we tried several quick-and-dirty ways of grabbing the information:

  1. using system calls to the shell commands echo, grep and sed
  2. using PHP
  3. using Perl

here are the test files:

test.php:

<pre>
<?php
$library = '/Users/stefan/Music/iTunes/iTunes Music Library.xml';

// first system calls: grep -b prefixes each matching line with its byte
// offset, sed keeps only that offset and appends a comma
$start = microtime(true);
// single quotes keep the shell from expanding $dict_indexes to nothing
`echo '\$dict_indexes = array(' > result1.php`;
`grep -b "<dict>" "$library" | sed -n "s/:.*$/,/ p" >> result1.php`;
`echo ");" >> result1.php`;
echo 'echo, grep and sed took: '.(microtime(true)-$start)." seconds\n";

// next real php: remember the offset before each line is read
$start = microtime(true);
$handle1=fopen($library,'rb');
$handle2=fopen('result2.php','wb');
fwrite($handle2,'$dict_indexes = array(');
while(!feof($handle1)) {
        $offset = ftell($handle1);
        $line = fgets($handle1);
        if(preg_match('/<dict>/',$line)) fwrite($handle2,$offset.",\n");
}
fwrite($handle2,');');
fclose($handle1);
fclose($handle2);
echo 'real PHP took: '.(microtime(true)-$start)." seconds\n";

// next Perl (the measured time includes starting the perl interpreter)
$start = microtime(true);
`perl test.pl`;
echo 'Perl took: '.(microtime(true)-$start)." seconds\n";
?>
</pre>
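
to make the first variant less cryptic, here is a minimal sketch of what the grep and sed pipeline does, run on a tiny made-up file instead of the real library. the sample content and the function name dict_offsets are assumptions for illustration only. grep -b is not part of POSIX, but both GNU and BSD grep support it.

```shell
#!/bin/sh
# Hypothetical sample file standing in for the real iTunes library.
sample=$(mktemp)
printf '<plist>\n<dict>\n<key>a</key>\n<dict>\n</dict>\n</dict>\n' > "$sample"

# grep -b prefixes every matching line with the byte offset of the line
# start (e.g. "8:<dict>"); sed replaces everything from the first colon
# onward with a comma and prints only the lines it changed.
dict_offsets() {
    grep -b "<dict>" "$1" | sed -n 's/:.*$/,/p'
}

dict_offsets "$sample"
rm -f "$sample"
```

for this sample the output is 8, and 28, one per line: the first `<dict>` line starts 8 bytes into the file, the second one 28 bytes in, and the closing `</dict>` lines do not match the pattern.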

test.pl:

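<pre>
#!/usr/bin/perl
use strict;
use warnings;

my $library = '/Users/stefan/Music/iTunes/iTunes Music Library.xml';
open(my $in, '<', $library) or die "cannot open $library: $!";
open(my $out, '>', 'result3.php') or die "cannot open result3.php: $!";
print $out '$dict_indexes = array(';
# remember the byte offset before each line is read
my $offset = tell $in;
while (<$in>) {
        if (/<dict>/) {
                print $out "$offset,\n";
        }
        $offset = tell $in;
}
print $out ');';
close($in);
close($out);
</pre>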
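
a speed comparison only means something if all three variants find the same offsets. the following is a rough sketch of such a check, not something we ran for the numbers in this post. the helper names offsets_only and same_offsets are made up for illustration, and the check assumes the result files contain no digits other than the offsets themselves.

```shell
#!/bin/sh
# Print only the numbers found in a result file, one per line.
# tr turns every run of non-digit characters into a single newline,
# sed drops the empty line that a leading non-digit run leaves behind.
offsets_only() {
    tr -cs '0-9' '\n' < "$1" | sed '/^$/d'
}

# Succeed if two result files list the same offsets in the same order,
# regardless of how commas and line breaks are arranged.
same_offsets() {
    a=$(offsets_only "$1")
    b=$(offsets_only "$2")
    [ "$a" = "$b" ]
}

# Only compare when the three files written by test.php and test.pl exist.
if [ -f result1.php ] && [ -f result2.php ] && [ -f result3.php ]; then
    if same_offsets result1.php result2.php &&
       same_offsets result2.php result3.php; then
        echo "all three variants agree"
    else
        echo "the variants differ"
    fi
fi
```

comparing the bare numbers rather than the files themselves is deliberate: the shell variant puts a line break after `array(` while the PHP and Perl variants do not, so a plain diff would report a difference even when the offsets match.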

the results in the browser:

echo, grep and sed took: 2.16565012932 seconds
real PHP took: 14.6712100506 seconds
Perl took: 3.63878393173 seconds

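what a surprise: the pure PHP loop took almost seven times as long as the shell pipeline. one thing is clear: we will not use PHP for such tasks. Perl is not too slow, at less than twice the time of the shell commands, and has the advantage that you do not need cryptic shell one-liners.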

Tuesday, 29. May 2007
