Strange PHP Code I wrote
I wrote this to deal with the weird, weird way that PHP handles variable variables and arrays:
eval("\$item_to_make = &$item_to_make;");
I think it’s the ugliest thing I’ve ever written.
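For comparison, here is a minimal sketch of one way to take a reference through a variable variable without eval. It assumes the goal was to bind a reference to an array element whose array is named by a string; the names $inventory and $name are hypothetical and not from the original script.

```php
<?php
// Hypothetical data for illustration only.
$inventory = ['sword' => 1];
$name = 'inventory';

// Braces make the order explicit: resolve the variable named by $name,
// then index into it. Taking a reference to a missing key creates it.
$item_to_make = &${$name}['shield'];
$item_to_make = 2;

echo $inventory['shield'], "\n";
```

The braces matter because `$$name['shield']` without them was parsed differently in PHP 5 than in PHP 7 and later.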
I was a bit surprised earlier this week to find out that Omegapaladin (who is lightyears ahead of me in l33t math skillz) hadn’t heard of the concept of the modulo before. There are a lot of unfortunate nitty-gritty problems with mod, but it’s a very useful operation. I wrote a small program in PHP that uses mod in two interesting ways.
The script is attached to this post. First, though, here’s the image it creates:
This is Pi, sort of. The image can be “read” from left to right starting at the top and going to the bottom. Because there are ten possible digits in Pi but there are seven rainbow colors, I needed a way to figure out what to map 7, 8, and 9 to. The “correct” way to do this would be to get a base-7 representation of Pi, but I wanted to talk about mod. So, for each digit of Pi, I picked a color for the rectangle by taking that digit and modding it by 7: “digit % 7”. This means 7, 8, and 9 share colors with 0, 1, and 2.
Now, the digits of Pi can be thought of as a one dimensional sequence, but the image above is two dimensional. Given that we’re considering the nth digit of pi, how do we translate this into an x and y position? Using mod:
for($i = 0; $i < $length_of_pi; $i++) {
$current_pixel_y = floor($i/25);
$current_pixel_x = $i % 25;
…
The first line determines what row we’re in, from the top (there are 25 pixels in each row). The next line determines the column within the row, using mod. This works because the pixels left over after accounting for all the complete previous rows go into the current row.
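Both uses of mod can be sketched together as follows. This is not the attached script: the digit string, the color order, and the helper name place_digit are assumptions for illustration, and intdiv() requires PHP 7 or later.

```php
<?php
// Assumed inputs, not taken from the original script.
$digits = '31415926535897932384626433';   // leading digits of Pi as a string
$pixels_per_row = 25;
$colors = ['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet'];

// Map the i-th digit to a grid position and a color.
function place_digit(int $i, int $digit, int $pixels_per_row, array $colors): array
{
    return [
        'x' => $i % $pixels_per_row,            // column within the row
        'y' => intdiv($i, $pixels_per_row),     // row, counted from the top
        'color' => $colors[$digit % count($colors)],
    ];
}

$cells = [];
for ($i = 0; $i < strlen($digits); $i++) {
    $cells[] = place_digit($i, (int) $digits[$i], $pixels_per_row, $colors);
}

print_r($cells[25]);   // the first cell of the second row
```

The 26th digit (index 25) wraps around to column 0 of row 1, which is the whole trick.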
I wrote the following code today:
if (isset($predictions[$key])) {
$probability = $predictions[$key];
}
else {
$probability = 0.02275;
}
After doing this, I realized that a probability can’t be greater than 1.0, nor less than 0.0; for what I’m working on, I want a probability exclusive of exactly 1.0 or 0.0, but I can’t be certain that $predictions[$key] actually meets this requirement. There are a couple of ways to deal with this, and the most straightforward way is to add more conditions to the if statement, like so:
if (isset($predictions[$key])
&& ($predictions[$key] > 0.0)
&& ($predictions[$key] < 1.0)) {
$probability = $predictions[$key];
}
else {
$probability = 0.02275;
}
This works, but it doesn’t seem like quite the right place to put these conditions as far as human readability goes. What would make more sense is to check for these conditions after $probability has been assigned, like so:
if (isset($predictions[$key])) {
$probability = $predictions[$key];
}
else {
$probability = 0.02275;
}
if (($probability <= 0.0)
|| ($probability >= 1.0)) {
$probability = 0.02275;
}
But this is less than ideal, because it involves repeating the assignment of 2.3% to $probability rather awkwardly. What we want is to attach the conditions to the else block of code. Alas, there’s no way to do this! An elseif statement would be executed only if the original if condition fails, which is not what we want! Using an independent if and eliminating the else block means we have to duplicate the original if condition, to see if we need to assign 2.3% to $probability because $predictions[$key] does not exist! What would be best is a new statement altogether, which would run if a previous if statement failed, or if some other conditions were met. Call it “iforelse”:
if (isset($predictions[$key])) {
$probability = $predictions[$key];
}
iforelse(($predictions[$key] < 0.0)
|| ($predictions[$key] > 1.0)) {
$probability = 0.02275;
}
AFAIK, no computer language has any such statement.
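One way to get the same effect with ordinary statements is to treat “missing” and “out of range” as the same case and fall through to a single default. The sketch below assumes PHP 7 or later (for the ?? operator); the function name probability_for and the constant DEFAULT_PROBABILITY are made up for illustration.

```php
<?php
const DEFAULT_PROBABILITY = 0.02275;

// Return the stored prediction only if it exists and lies strictly
// between 0 and 1; otherwise return the default, assigned in one place.
function probability_for(array $predictions, $key, float $default = DEFAULT_PROBABILITY): float
{
    $candidate = $predictions[$key] ?? null;
    if (is_numeric($candidate) && $candidate > 0.0 && $candidate < 1.0) {
        return (float) $candidate;
    }
    return $default;
}

$predictions = ['rain' => 0.5, 'snow' => 1.0, 'hail' => -0.1];
echo probability_for($predictions, 'rain'), "\n";
echo probability_for($predictions, 'fog'), "\n";
```

This is really the second version from above wrapped in a function, so the readability complaint still stands, but the default is at least written only once.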
I just got an email from Zend telling me about their new move to embrace Eclipse:
Zend is launching a beta of the next generation of the Zend Studio family – Zend Studio for Eclipse.
This beta release (code named “Neon”) is based on proven Zend Studio technology and the Eclipse PHP Developers Tools (PDT) project. Zend Studio for Eclipse is the world’s most powerful PHP IDE – providing professional PHP development capabilities combined with the Eclipse multi-language support and plug-in extension technology.
When I tried Zend Studio, I gave up on it after the trial period. As an IDE, it’s beautiful, complete, and the slowest software ever written. Java doesn’t need to be so slow; Eclipse runs almost as fast as a native application on my machine. I have missed some of the more complex features, so I’ll definitely be checking this new product out when I get a chance.
I’ve been working with the JSON functions in PHP. These functions were incorporated into PHP 5.2, although methods of working with JSON have been around for a while for PHP. Given that PHP’s serialize and unserialize and json_encode and json_decode are interchangeable with one another in many circumstances, I began to wonder which set of functions would be faster. Here’s the wall time in seconds of all four functions processing 100000 arrays, averaged over 10 runs:
             json_encode  json_decode  serialize  unserialize
average (s)  6.60         1.70         6.60       1.20
Basically, in terms of speed, there is little difference. In the test above, json_encode() and serialize() have the same speed. A chart of an earlier test (attached) shows serialize() to be slightly slower, but in any case, the difference in speed has little practical significance.
So, what are the differences between JSON stringification and PHP serialization that are worth considering? JSON is widely supported by many different languages, while PHP’s typed serialize format works best for PHP. It’s possible to work with PHP’s serialization format in JavaScript (see PHPGuru.org for an example I found with a quick Google search), but it’s nowhere near as easy as JSON, which can be either eval’ed or, more appropriately, parsed with a JSON decoder.
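To make the difference between the two formats concrete, here is a small sketch that encodes the same array both ways. The array contents are made up for illustration.

```php
<?php
// Made-up sample data.
$foo = ['bar' => [1, 2], 'baz' => 'qux'];

$as_json = json_encode($foo);
$as_php  = serialize($foo);

echo $as_json, "\n";   // {"bar":[1,2],"baz":"qux"}
echo $as_php, "\n";    // a:2:{s:3:"bar";a:2:{i:0;i:1;i:1;i:2;}s:3:"baz";s:3:"qux";}

// Both round-trip back to the same array. Note that json_decode()
// needs its second argument set to true to return arrays instead of
// stdClass objects.
$from_json = json_decode($as_json, true);
$from_php  = unserialize($as_php);
```

The serialize output carries types and lengths (s:3, i:0), which is what makes it awkward to parse outside of PHP.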
Using PHP serialization allows PHP objects to be serialized, including the use of magic methods like __wakeup(). The comments on this ActiveState page note that this is not always a good thing, and can lead to the execution of arbitrary PHP by an attacker if you’re not careful.
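As a rough illustration of the __wakeup() concern, the sketch below shows the magic method firing during unserialize(), and the allowed_classes option (available from PHP 7.0, so newer than this post) that stops objects from being instantiated. The Job class is hypothetical.

```php
<?php
// Hypothetical class whose __wakeup() has a visible side effect.
class Job
{
    public $woke = false;

    public function __wakeup()
    {
        $this->woke = true;   // runs automatically during unserialize()
    }
}

$payload = serialize(new Job());

// Default behavior: the object is rebuilt and __wakeup() runs.
$trusted = unserialize($payload);

// With allowed_classes set to false, no Job is instantiated; PHP returns
// a __PHP_Incomplete_Class placeholder instead.
$untrusted = unserialize($payload, ['allowed_classes' => false]);

var_dump($trusted->woke);
var_dump($untrusted instanceof Job);
```

For data that only contains arrays and scalars, JSON sidesteps the issue entirely, since json_decode() never calls user code.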
FWIW (not too much, I’ll admit), I personally find JSON to be more elegant and cleaner looking, and I’ll be using it in the future.
Here’s the code I used for the comparisons:
<?php
// Benchmark: json_encode/json_decode versus serialize/unserialize.
// time() has one-second resolution, so each figure is a whole number of seconds.
// The two encode loops also time the construction of $foo, not just the encode call.
set_time_limit(0);
for ($j = 0; $j < 10; $j++) {
    $json_strings = array();
    $phps_strings = array();

    // Build 100000 arrays and encode each one as JSON.
    $start = time();
    for ($i = 0; $i < 100000; $i++) {
        $foo = array();
        $foo['bar'] = array(1,2,3,4,5,6,7,8,9,0);
        $foo['bartwo'] = mt_rand(0,10000);
        $foo['barnone'] = sha1(uniqid());
        $foo['barall'] = md5(uniqid());
        $json_strings[] = json_encode($foo);
    }
    $runtime = time() - $start;
    echo "Wall time for json_encode: " . $runtime . "\n";

    // Decode every JSON string produced above.
    $start = time();
    foreach ($json_strings as $json) {
        $var = json_decode($json);
    }
    $runtime = time() - $start;
    echo "Wall time for json_decode: " . $runtime . "\n";

    // Build 100000 arrays and serialize each one.
    $start = time();
    for ($i = 0; $i < 100000; $i++) {
        $foo = array();
        $foo['bar'] = array(1,2,3,4,5,6,7,8,9,0);
        $foo['bartwo'] = mt_rand(0,10000);
        $foo['barnone'] = sha1(uniqid());
        $foo['barall'] = md5(uniqid());
        $phps_strings[] = serialize($foo);
    }
    $runtime = time() - $start;
    echo "Wall time for serialize: " . $runtime . "\n";

    // Unserialize every string produced above.
    $start = time();
    foreach ($phps_strings as $phps) {
        $var = unserialize($phps);
    }
    $runtime = time() - $start;
    echo "Wall time for unserialize: " . $runtime . "\n\n";
}
?>
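Since time() only counts whole seconds, a finer-grained variant of the same comparison could look like the sketch below. It is not the code behind the numbers above: it uses microtime(true), builds the test arrays outside the timed region, and runs far fewer iterations. The helper name time_calls is made up.

```php
<?php
// Time how long it takes to call $fn once for every element of $inputs.
function time_calls(callable $fn, array $inputs): float
{
    $start = microtime(true);   // seconds as a float, microsecond precision
    foreach ($inputs as $input) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// Build the inputs first so their construction is not part of the timing.
$arrays = [];
for ($i = 0; $i < 1000; $i++) {
    $arrays[] = ['bar' => range(0, 9), 'bartwo' => mt_rand(0, 10000)];
}
$json_strings = array_map('json_encode', $arrays);
$phps_strings = array_map('serialize', $arrays);

$timings = [
    'json_encode' => time_calls('json_encode', $arrays),
    'json_decode' => time_calls('json_decode', $json_strings),
    'serialize'   => time_calls('serialize', $arrays),
    'unserialize' => time_calls('unserialize', $phps_strings),
];

foreach ($timings as $name => $seconds) {
    printf("%-12s %.6f s\n", $name, $seconds);
}
```

Results will vary by machine and PHP version, so treat any single run as a rough indication only.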
I’m currently trying to work with the latest database dump from the English Wikipedia. It’s massive (slightly under 10 GB uncompressed), and a pain to work with, especially since some of the behavior of PHP file functions with large files is not quite right. So, what I’ve been trying to do is break the XML dump down into sections (I’m losing a small handful of articles this way, fewer than 10) and then process those chunks into text files, which are then stored in a three-level directory tree by letters: the “Disgaea” article would be stored as /home/myusername/wiki/d/i/s/disgaea.txt.
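The title-to-path mapping might be sketched like this. The function name article_path is hypothetical, and so is the handling of anything that isn’t a letter a through z: here such characters (and titles shorter than three characters) fall into a catch-all “_” directory, which is only a guess at how the real script treats them.

```php
<?php
// Build the storage path for an article title, one directory per leading
// character. Non-letters and missing characters map to '_' (an assumption).
function article_path(string $title, string $root, int $levels = 3): string
{
    $name = strtolower($title);
    $parts = [];
    for ($i = 0; $i < $levels; $i++) {
        $char = $name[$i] ?? '_';
        $parts[] = preg_match('/^[a-z]$/', $char) ? $char : '_';
    }
    return rtrim($root, '/') . '/' . implode('/', $parts) . '/' . $name . '.txt';
}

echo article_path('Disgaea', '/home/myusername/wiki'), "\n";
```

Titles containing a slash would need extra escaping before being used as a file name; the sketch ignores that.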
In order to create these directories, I used the following function:
function make_directory_tree ($level = 0, $parent = '', $maxlevel = 3) {
    global $CONFIG;
    foreach ($CONFIG['directories'] as $letter) {
        echo ("Creating directory $parent$letter\n");
        $status = mkdir($parent.$letter);
        if ($status === FALSE) {
            die('Could not create directory: ' . $parent . $letter . "\n");
        }
        // Levels 0 through $maxlevel each create a layer of directories,
        // so this builds $maxlevel + 1 layers in total.
        if ($level < $maxlevel) {
            make_directory_tree($level + 1, $parent . $letter.'/', $maxlevel);
        }
    }
}
Truthfully, there are better ways to do this than a recursive function, but I didn't think there'd be a big difference in performance, so I was surprised by how long PHP was taking to create these directories.
And, when it was all said and done, I had the nasty surprise of finding that there were four levels of directories, rather than three - I had forgotten that the foreach loop meant that the final directory layer wasn't created recursively. Oops. I wanted 19683 directories, and accidentally created 531441. Oh well, set $maxlevel = 2, and try again, I suppose.
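A version that counts down the remaining levels makes the off-by-one harder to commit. This is a sketch rather than the fixed original: it takes the alphabet as a parameter instead of reading $CONFIG, throws instead of calling die(), and the demo uses a three-letter alphabet in a temporary directory.

```php
<?php
// Create $levels nested layers of directories under $parent, one
// directory per letter at each layer. Returns how many were created.
function make_directory_tree(string $parent, array $letters, int $levels): int
{
    if ($levels <= 0) {
        return 0;   // nothing left to create below this point
    }
    $created = 0;
    foreach ($letters as $letter) {
        $path = $parent . '/' . $letter;
        if (!is_dir($path) && !mkdir($path)) {
            throw new RuntimeException("Could not create directory: $path");
        }
        $created += 1 + make_directory_tree($path, $letters, $levels - 1);
    }
    return $created;
}

// Demo with a tiny alphabet: 3 + 9 + 27 = 39 directories.
$root = sys_get_temp_dir() . '/wiki_tree_' . uniqid();
mkdir($root);
$created = make_directory_tree($root, ['a', 'b', 'c'], 3);
echo $created, "\n";
```

With a 27-entry alphabet and three levels, the deepest layer has 27 * 27 * 27 = 19683 directories, which matches the number I was aiming for.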
After installing WordPress here on the previously redirected nic.dreamhost.com, I had been considering trying to use it to run the blog at http://fancruft.com, which is really just an ugly way of displaying some entries from my Lavos MySQL database.
I suppose I’m glad I didn’t even have a chance to start. Today, just about everyone with WordPress 2.1.1 was scrambling to upgrade it after it was revealed that intentionally malicious code had been placed in it. I’d encourage people to read the original announcement.
I like the WordPress front-end. It shows that someone understands how people want to interact with software. But, after a problem this severe in the underlying PHP, how can you take seriously Automattic’s (not so) subtle dig at phpBB: “Have you ever been frustrated with forum or bulletin board software that was slow, bloated, and always got your server hacked?”
Why, yes, I think I’d like to NOT have my sites owned by the mob’s spam division very much, thank you.
I recognize that the fact that this was discovered in days, rather than months, is a testament to the “many eyes” theory of Open Source, but an intentional backdoor placed in software by a third party is about as bad as it gets.
I just ported the Quotomatic from my older site, StorySage. It’s in the place where most sane people put a description of their blog. I’ve noticed some bugs, namely that some of the quotes are too long, and hyphens aren’t appearing right in some quotes. It was mostly a matter of copying a few lines of PHP and deleting a variable that didn’t make any sense in the new context. On the other hand, this is not really robust, and, given that WP has a plugin system, not really the way things should be done.
But, it was fast.
image: detail of installation by Bronwyn Lace