Developers' Blog
PHP String Replacement Speed Comparison
Recently I redid a test to determine the speed of various string replacement methods in PHP. A little over a year ago, Grudge did a test of the same nature. Based off of his results some decisions were made on how to proceed. My goal was to verify his test using some of the formats we decided to go with.
String Replacement Methods
I looked at the four main methods of string replacement used in SMF:
- sprintf
- preg_replace
- strtr
- str_replace
sprintf
http://www.php.net/manual/en/function.sprintf.php
sprintf operates almost exactly like the *printf functions in C. The main difference is that PHP supports argument numbering. We are slowly moving towards having all of our sprintf calls use argument numbering.
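As a rough illustration of argument numbering, the sketch below formats the same two values once positionally and once with numbered placeholders. The format strings and the values (a name and a post count) are made up for the example and are not taken from SMF.

```php
<?php
// Positional placeholders: arguments are consumed in the order they are passed.
$plain = sprintf('%s has %d posts', 'Thantos', 42);

// Numbered placeholders: %1$s is the first argument and %2$d the second,
// so the format string is free to mention them in any order.
$numbered = sprintf('%2$d posts were written by %1$s', 'Thantos', 42);

echo $plain, "\n";
echo $numbered, "\n";
```

Note the single quotes around the format strings. Inside double quotes PHP would try to expand `$s` and `$d` as variables.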
preg_replace
http://www.php.net/manual/en/function.preg-replace.php
This function uses Perl-compatible regular expressions to find and replace substrings. Regular expressions can be extremely powerful and, at the same time, extremely hard to understand.
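The post does not show the pattern that was benchmarked, so the following is only a sketch of what a very simple preg_replace call can look like. The placeholder names and replacement values are assumptions chosen for illustration.

```php
<?php
$text = 'My name is {NAME} and I am a {JOB}';

// One pattern per placeholder; the braces are escaped because { and }
// have a special meaning (quantifiers) in regular expressions.
$patterns = array('/\{NAME\}/', '/\{JOB\}/');
$replacements = array('Thantos', 'developer');

$result = preg_replace($patterns, $replacements, $text);
echo $result, "\n";
```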
strtr
http://www.php.net/manual/en/function.strtr.php
strtr is used for translating one substring to another. The way SMF has been using it is with the
string strtr ( string $str, array $replace_pairs )
format. In this format, the longest substrings are found first and the function will try to avoid doing a replacement on something it has already worked on.
str_replace
http://www.php.net/manual/en/function.str-replace.php
str_replace finds substrings in the order they are given and replaces them with the corresponding substring.
Understanding the difference between strtr and str_replace
strtr and str_replace both seem to operate the same way. However, due to strtr's unique properties the results can be very different.
Example: PHP: Hypertext Preprocessor
Replacements:
PHP => PHP: Hypertext Preprocessor
PHP: Hypertext Preprocessor => PHP
strtr: PHP
str_replace: PHP: Hypertext Preprocessor
To understand why, let's examine the behavior:
strtr
First, strtr orders the replacement pairs by the length of the search substring, longest first. So we get
Replacements:
PHP: Hypertext Preprocessor => PHP
PHP => PHP: Hypertext Preprocessor
So it finds “PHP: Hypertext Preprocessor” in our example and replaces it with “PHP”, so now our example contains “PHP”. Next it searches for “PHP” in “PHP”, but the only instance it finds is one it has already worked on, so it doesn't do the replacement. So it is done.
str_replace
First, str_replace finds “PHP” in “PHP: Hypertext Preprocessor” and replaces it with “PHP: Hypertext Preprocessor”, which gives us “PHP: Hypertext Preprocessor: Hypertext Preprocessor”. Next it finds “PHP: Hypertext Preprocessor” in “PHP: Hypertext Preprocessor: Hypertext Preprocessor” and replaces it with “PHP”, which gives us “PHP: Hypertext Preprocessor”, since it doesn't care that it already changed that part of the string.
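The walkthrough above can be reproduced with a few lines of PHP. This is a minimal sketch; the variable names are invented for the example, and the same pair array is fed to both functions (split into keys and values for str_replace).

```php
<?php
$subject = 'PHP: Hypertext Preprocessor';

// The two replacement pairs from the example, in the order given.
$pairs = array(
    'PHP' => 'PHP: Hypertext Preprocessor',
    'PHP: Hypertext Preprocessor' => 'PHP',
);

// strtr matches the longest key first and never rescans its own output.
$viaStrtr = strtr($subject, $pairs);

// str_replace applies each pair in turn to the result of the previous one.
$viaStrReplace = str_replace(array_keys($pairs), array_values($pairs), $subject);

echo "strtr: $viaStrtr\n";
echo "str_replace: $viaStrReplace\n";
```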
Testing Procedure
A test script was written (attached to this post) and uploaded to the Simple Machines website. The script was accessed and the results recorded throughout the day. The results will be more realistic on a production server than on a test server, due to the server load on production sites. The times of day were varied in order to account for server load changes.
For each method the time was stored before and after the method was run 100,000 times. The difference in time was then calculated and displayed.
The str_replace method was run in two different ways. For the first, the parameters were put into arrays prior to starting the test. For the second, the parameter arrays were created in every iteration. This type of setup is typical and adds to the execution time.
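The attached script is not reproduced here, so the following is only a hypothetical re-creation of the timing approach described above, limited to the two str_replace variants. The test string, placeholder and variable names are assumptions.

```php
<?php
// Hypothetical re-creation of the procedure; not the original test script.
$iterations = 100000;
$text = 'My name is {NAME}';

// str_replace #1: the parameter arrays are built once, before timing starts.
$from = array('{NAME}');
$to = array('Thantos');
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $out1 = str_replace($from, $to, $text);
}
$elapsed1 = microtime(true) - $start;

// str_replace #2: the parameter arrays are rebuilt in every iteration.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $from = array('{NAME}');
    $to = array('Thantos');
    $out2 = str_replace($from, $to, $text);
}
$elapsed2 = microtime(true) - $start;

printf("str_replace #1: %.4f seconds\n", $elapsed1);
printf("str_replace #2: %.4f seconds\n", $elapsed2);
```

The absolute numbers from a sketch like this depend heavily on the PHP version and the machine, so they should not be expected to match the table below.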
Test Results
The following table gives the results of each test run. The times for each method are in seconds.
| Time | sprintf | preg_replace | strtr | str_replace #1 | str_replace #2 |
| --- | --- | --- | --- | --- | --- |
| 08:00:00 AM | 1.1334 | 2.0955 | 48.1423 | 1.2109 | 1.4819 |
| 08:40:00 AM | 1.0436 | 2.0326 | 64.3492 | 1.7948 | 2.2337 |
| 11:30:00 AM | 1.1841 | 2.5524 | 62.0114 | 1.5931 | 1.9200 |
| 02:00:00 PM | 0.9783 | 2.4832 | 52.6339 | 1.3966 | 1.4845 |
| 03:00:00 PM | 1.0463 | 2.6164 | 52.7829 | 1.1828 | 1.4981 |
| Average | 1.0771 | 2.3560 | 55.9839 | 1.4357 | 1.7237 |
Calculated Results
The following table gives the speed of each method as a multiple of the time it took sprintf.
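| Method | Times Slower |
| --- | --- |
| preg_replace | 2.19 |
| strtr | 51.97 |
| str_replace #1 | 1.33 |
| str_replace #2 | 1.6 |
Another method of comparing speeds is to look at the number of executions each method did each second.
| Method | Speed (executions per second) |
| --- | --- |
| sprintf | 92838 |
| preg_replace | 42444 |
| strtr | 1786 |
| str_replace #1 | 69654 |
| str_replace #2 | 58016 |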
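For anyone who wants to check the arithmetic, the sketch below derives both calculated figures from the rounded averages in the results table. The array layout and variable names are invented for the example. Because it starts from rounded averages, the per-second figures can differ from the published ones in the last digit or two; presumably the originals were computed from unrounded times.

```php
<?php
$iterations = 100000;

// Average times in seconds, copied from the results table.
$averages = array(
    'sprintf'        => 1.0771,
    'preg_replace'   => 2.3560,
    'strtr'          => 55.9839,
    'str_replace #1' => 1.4357,
    'str_replace #2' => 1.7237,
);

$timesSlower = array();
$perSecond = array();
foreach ($averages as $method => $seconds) {
    // How many times longer than sprintf this method took.
    $timesSlower[$method] = $seconds / $averages['sprintf'];
    // How many executions fit into one second.
    $perSecond[$method] = $iterations / $seconds;
    printf("%-15s %6.2f x sprintf, about %.0f per second\n",
        $method, $timesSlower[$method], $perSecond[$method]);
}
```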
Analysis of results
The sprintf method is the fastest at string replacement; however, str_replace method #1 was only slightly slower.
While the tests were done over 100,000 iterations, the actual number of replacements done during script execution shouldn't exceed more than a few hundred. As such, even strtr isn't too slow to be used.
It is also important to note that the preg_replace expression was very simplistic. For more complex expressions the function will take longer to execute as pattern matching is not a simple problem.
My Recommendations
Whenever you are concerned with pure speed you should go with *printf. However, when you are concerned about human readability str_replace provides a good alternative. I recommend that you avoid using strtr unless you need its unique properties. At the same time don't invest a lot of time trying to rip it out.
I hope you all gained some useful information from this.
I think it is important to note the PHP version, the webserver & version, the OS, the hardware, the load, and some specs relating to the software you are running. Since this is a benchmark of a core PHP function, I think the minimum that should be supplied is PHP version and the load.
Never mind that stuff, it's nice to see some more benchmarks.
Okay, I prefer str_replace, but I never thought that sprintf was so fast.
Bye
DIN1031
A good implementation of *printf only needs to scan the string once. For str_replace I wouldn't be surprised if it scanned the string once for each substring.
sprintf: 1.38384985924 seconds
preg_replace: 3.27423620224 seconds
strtr: 40.9357118607 seconds
str_replace #1: 1.49082899094 seconds
str_replace #2: 1.94922995567 seconds
The latter may be faster, but having the replaced subjects being searched and replaced again is often undesirable, especially when dealing with variable searches or replacements (like custom smileys).
Oh, and strtr is sooooo much more readable.
If you're interested:
preg_replace: 2.66261482239 seconds
strtr: 31.150698185 seconds
str_replace #1: 1.06948304176 seconds
str_replace #2: 1.39181494713 seconds
Processor: Intel Xeon CPU 5130 @ 2.00GHz (500.026MHz)
RAM: 512 MB
OS: Linux 2.6.9
PHP version: 4.4.6
Load Averages: 0.83, 0.74, 0.54
sprintf: 0.042463 seconds
preg_replace: 0.573575 seconds
strtr: -0.780915 seconds
str_replace #1: 0.206594 seconds
str_replace #2: 0.520598 seconds
- not sure why I got a negative time
As for readability, it really is minor, as you can still define the replacement array the same way and then use array_keys() and array_values() to separate the from and to. As shown by str_replace #2, it is still faster than strtr.
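A minimal sketch of that idea, assuming a strtr-style pair array (the placeholder names and values here are made up):

```php
<?php
// Defined the same readable way you would for strtr: search => replacement.
$replacements = array(
    '{NAME}' => 'Thantos',
    '{JOB}'  => 'developer',
);
$text = 'My name is {NAME} and I am a {JOB}';

// array_keys() gives the "from" list and array_values() the "to" list.
$result = str_replace(array_keys($replacements), array_values($replacements), $text);
echo $result, "\n";
```

Keep in mind that this trades strtr's behavior for speed: str_replace will still rescan text it has already replaced.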
To make it work properly on PHP 4, add this to the start of the file (below the <?php):
function microtime_float()
{
    // Without arguments, microtime() returns a string of the form "usec sec".
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
Then, in the script, replace all instances of
microtime(true)
with
microtime_float()
Processor: Intel Core2Duo 6600 @ 2.40GHz
RAM: 2048 MB
Operating System: Microsoft Windows XP [Version 5.1.2600]
Current processes: 47 (47 running, 0 zombie)
PHP version: 5.1.4
Server version: Apache/2.2.2 (Win32) DAV/2 mod_ssl/2.2.2 OpenSSL/0.9.8b mod_autoindex_color PHP/5.1.4
sprintf: 0.73141002655 seconds
preg_replace: 1.26732420921 seconds
strtr: 23.7995750904 seconds
str_replace #1: 0.864935159683 seconds
str_replace #2: 1.10788106918 seconds
(Or should i test it on my other server with linux on it :x).
Bye
DIN1031
CPU: Intel CentrinoDuo T7200 @ 2.00GHz
RAM: 1 GB @ 2.00 GHz
Server: Apache 1.3.34 w/ PHP5 mod
PHP: 5.2.2
Results:
sprintf: 0.80260682106018 seconds
preg_replace: 2.3465528488159 seconds
strtr: 26.90654706955 seconds
str_replace #1: 0.83384084701538 seconds
str_replace #2: 1.1157829761505 seconds
although running this test on windows is kinda useless but meh
anyhow, PHP uses only 1 processor and leaves the other so I was running 50% CPU all the time.
Results from my webhosting account:
Operating system: FreeBSD
Kernel version: 6.1-RELEASE-p4
Machine Type: amd64
Apache version: 1.3.37 (Unix)
PHP version 4.4.2 (outdated!?)
RUN 1:
Server Load: 1.03 (2 cpus)
Memory Used: 67.77 %
sprintf: 1.22852993011 seconds
preg_replace: 2.19928884506 seconds
strtr: 41.4552869797 seconds
str_replace #1: 1.34867596626 seconds
str_replace #2: 1.66192102432 seconds
RUN 2:
Server Load: 0.88 (2 cpus)
Memory Used: 62.88 %
sprintf: 1.16300082207 seconds
preg_replace: 1.57117605209 seconds
strtr: 34.872027874 seconds
str_replace #1: 1.37249684334 seconds
str_replace #2: 1.51887798309 seconds
RUN 3:
Server Load: 1.50 (2 cpus)
Memory Used: 65.80 %
sprintf: 1.24492406845 seconds
preg_replace: 1.72587108612 seconds
strtr: 36.0469529629 seconds
str_replace #1: 1.16329097748 seconds
str_replace #2: 1.65367412567 seconds
Note: all of those server stats were taken at the moment the benchmark started.
For anyone with a max_execution_time error, add this to the top of the file just after the <?php:
set_time_limit(120);
System information:
I love using a relatively small web host - The load average is awesome. I actually do maintenance work for the host (InVio Hosting)
Let's take the following statement:
My name is {NAME} and I am a {JOB} here at {COMPANY} where I work for {BOSSTITLE} {BOSS}
Example: My name is Thantos and I am a developer here at Simple Machines where I work for Lead Developer Compuart
Now if I wanted to do that with concatenation (of any sort) it would be something like this ($t is an array of language strings):
$t[1] . $name . $t[2] . $job . $t[3] . $company . $t[4] . $bosstitle . ' ' . $boss
So if I wanted to build that sentence I'd need to use four text strings. But now let's say for my particular setting "Lead Developer Compuart" isn't proper, I need "Compuart the Lead Developer". Not only have I inserted another text string ("the") but I have rearranged the order in which the data appears. No amount of changing just the text strings in
$t[1] . $name . $t[2] . $job . $t[3] . $company . $t[4] . $bosstitle . ' ' . $boss
is going to get me
My name is Thantos and I am a developer here at Simple Machines where I work for Compuart the Lead Developer
Now if I put it all into one text string and use some sequence identifiers I can do this:
My name is %1$s and I am a %2$s here at %3$s where I work for %4$s %5$s
and call it like:
$str = sprintf($text, $name, $job, $company, $bosstitle, $boss);
Now if they want to make the change I did, then it is a simple edit job to the language string:
My name is %1$s and I am a %2$s here at %3$s where I work for %5$s the %4$s
Heck, you could even leave the boss title out and have
My name is %1$s and I am a %2$s here at %3$s where I work for %5$s
One thing to consider is whether you want to use
My name is %1$s and I am a %2$s here at %3$s where I work for %4$s %5$s
and gain the speed, or use
My name is {NAME} and I am a {JOB} here at {COMPANY} where I work for {BOSSTITLE} {BOSS}
and gain the readability. That choice depends mostly on how the string is being used.
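To see the three language strings from this post in action, here is a small sketch that passes the same five arguments to each of them. The variable names for the three strings are invented for the example; the values are the ones used in the post.

```php
<?php
$name = 'Thantos';
$job = 'developer';
$company = 'Simple Machines';
$bosstitle = 'Lead Developer';
$boss = 'Compuart';

// Single quotes matter here: inside double quotes PHP would try to expand $s.
$default   = 'My name is %1$s and I am a %2$s here at %3$s where I work for %4$s %5$s';
$reordered = 'My name is %1$s and I am a %2$s here at %3$s where I work for %5$s the %4$s';
$noTitle   = 'My name is %1$s and I am a %2$s here at %3$s where I work for %5$s';

// The calling code never changes; only the language string does.
$results = array();
foreach (array($default, $reordered, $noTitle) as $text) {
    $results[] = sprintf($text, $name, $job, $company, $bosstitle, $boss);
    echo end($results), "\n";
}
```

Passing an argument that the format string never references, as in the last string, is fine: sprintf simply ignores it.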
$str = "My name is $NAME and I am a $JOB here at $COMPANY where I work for $BOSSTITLE $BOSS";
What happens if you load the file that contains the string and those variables aren't defined yet?
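A short sketch of the problem being pointed out here: a double-quoted string is interpolated at the moment it is defined, not when it is used. The variable names are hypothetical, and an empty string stands in for a value that is not known yet (with a truly undefined variable PHP would also raise a notice or warning).

```php
<?php
// Hypothetical illustration; these names are not from the original post.
$name = '';                      // stands in for a value that is not known yet
$greeting = "My name is $name";  // interpolated right here, at definition time
$name = 'Thantos';               // too late: $greeting has already been built

// A sprintf-style template is just text until the values are supplied.
$template = 'My name is %1$s';
$later = sprintf($template, $name);

echo $greeting, "\n";
echo $later, "\n";
```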
Still isn't very readable. But I choose speed over readability of the language strings myself
And I agree with Thantos' large post above