PHP, Perl & system calls
recently handgestrickt tried to get index information from our iTunes library XML file, which is 30 MB in size. we did not want to use an XML parser, because on a file this size it usually disappears into nirvana, and you can brew a pot of tea before it has finished. out of curiosity we tried several quick and dirty ways of grabbing the information:
- using the shell commands echo, grep and sed, called from PHP
- using PHP
- using Perl
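to make the goal concrete: the "index information" is the byte offset of every line that contains a `<dict>` tag. the following is only a small sketch of that idea on a tiny sample file, not the benchmark itself. the function name `dict_offsets` and the sample content are made up for illustration, and `cut` stands in for the sed expression used in the test file.

```shell
#!/bin/sh
# dict_offsets FILE (hypothetical helper): print the byte offset of
# every line in FILE that contains "<dict>".
dict_offsets() {
    # grep -b prefixes each matching line with "offset:",
    # cut keeps only the part before the first colon.
    grep -b '<dict>' "$1" | cut -d: -f1
}

# a tiny made-up sample standing in for the real iTunes library
sample=$(mktemp)
printf '<plist>\n<dict>\n<key>a</key>\n<dict>\n</dict>\n</dict>\n' > "$sample"

dict_offsets "$sample"
# prints:
# 8
# 28
```

the closing `</dict>` lines are not listed, because the pattern only matches the opening tag.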
here are the test files:
test.php:
<pre>
<?php
$library = '/Users/stefan/Music/iTunes/iTunes Music Library.xml';

// first: shell commands (echo, grep and sed)
$now1 = microtime(true);
// single quotes keep the shell from expanding $dict_indexes as a shell variable
`echo '\$dict_indexes = array(' > result1.php`;
// grep -b prefixes each matching line with its byte offset; sed keeps only "offset,"
`grep -b "<dict>" "$library" | sed -n "s/:.*$/,/p" >> result1.php`;
`echo ");" >> result1.php`;
$now2 = microtime(true);
echo 'echo, grep and sed took: '.($now2-$now1)." seconds\n";

// next: plain PHP
$now1 = microtime(true);
$handle1 = fopen($library,'rb');
$handle2 = fopen('result2.php','wb');
fwrite($handle2,'$dict_indexes = array(');
while(!feof($handle1)) {
    // remember where the line starts before reading it
    $offset = ftell($handle1);
    $line = fgets($handle1);
    if(preg_match('/<dict>/',$line)) fwrite($handle2,$offset.",\n");
}
fwrite($handle2,');');
fclose($handle1);
fclose($handle2);
$now2 = microtime(true);
echo 'real PHP took: '.($now2-$now1)." seconds\n";

// next: Perl, started as a separate process
$now1 = microtime(true);
`perl test.pl`;
$now2 = microtime(true);
echo 'Perl took: '.($now2-$now1)." seconds\n";
?>
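</pre>
test.pl:
<pre>
#!/usr/bin/perl
use strict;
use warnings;

my $library = '/Users/stefan/Music/iTunes/iTunes Music Library.xml';
open(my $in,  '<', $library)      or die "cannot open $library: $!";
open(my $out, '>', 'result3.php') or die "cannot open result3.php: $!";
print $out '$dict_indexes = array(';
# byte offset at which the next line starts
my $offset = tell $in;
while (<$in>) {
    if (/<dict>/) {
        print $out "$offset,\n";
    }
    $offset = tell $in;
}
print $out ');';
close($in);
close($out);
</pre>
the results in the browser: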
echo, grep and sed took: 2.16565012932 seconds
real PHP took: 14.6712100506 seconds
Perl took: 3.63878393173 seconds
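what a surprise. one thing is clear: we will not use PHP for such tasks, since it took roughly four times as long as Perl and almost seven times as long as the shell pipeline. Perl is not much slower than the shell pipeline (about 3.6 versus 2.2 seconds) and has the advantage that you do not need cryptic shell one-liners.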

