Fri, May 22, 2009 11:34 AM
First there was LAMP.
But are you using GLAMMP? You have probably not heard of it
because we just coined the term while chatting at work. You
know LAMP (Linux, Apache, MySQL and PHP or Perl and sometimes
Python). So, what are the extra letters for?
The G is for Gearman - Gearman is a system to farm out work
to other machines, dispatching function calls to machines that are
better suited to do work, to do work in parallel, to load balance
lots of function calls, or to call functions between languages.
The extra M is for Memcached - memcached is a
high-performance, distributed memory object caching system, generic
in nature, but intended for use in speeding up dynamic web
applications by alleviating database load.
More and more these days, you can't run a web site on just
LAMP. You need these extra tools (or ones like them) to do
all the cool things you want to do. What other tools do we
need to work into the acronym? PostgreSQL replaces MySQL in lots
of stacks to form LAPP. I guess Drizzle may replace MySQL in some stacks
soon. For us, it will likely be added to the
stack. Will that make it GLAMMPD? We need more
vowels! If you are starting the next must-use tool for
running web sites on open source software, please use a vowel for
the first letter.
Fri, May 15, 2009 12:38 PM
register_globals is going away in PHP6. That is fine with
me. Super globals are cool and I have taken to using filter_input_array these
days anyhow. However, our code base is now 10+ years old at
dealnews. Most of the forward facing code was completely
rewritten in the last couple of years due to architecture
changes. Many new projects had register_globals turned off
via php_admin_flag in Apache. So, that area is not that big
of a problem. However, our internal admin areas have not all
been rewritten because, well, frankly, they still work. Yeah,
stuff written for PHP4 in 2000 is still working. KISS helps a
lot with that. But, this code, somewhere in there, may still
be relying on register_globals. Now, we could go line by line
and try and fix it. But, it seems like a program could be
written to do this job. I mean, I use jEdit and it can
highlight unset vars using the PHPParserPlugin just fine. I
bet Zend IDE can do the same. Has anyone written such a tool
for the command line? There will be false positives I
know. Things like passing a variable by reference to a
function would look like a use before set. But, I can deal
with those if I don't have to go line by line through tons of old
code. What would the rules look like for such an
animal? This would be a great project to get off the ground
before PHP6 hits. Ideally you could provide a list of
variables for it to ignore. We have some globals we set up in
prepends and includes.
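As a starting point for the rules, here is a minimal sketch built on PHP's tokenizer (token_get_all). The function name find_unset_vars and its two rules are made up for illustration; this is not an existing tool. It treats a file as one flat scope, so function parameters, foreach loop variables, list() assignments and by-reference arguments will all show up as false positives.

```php
<?php
// Hypothetical helper: list variables that are read before this file assigns
// them. Returns an array of variable name => line of first suspicious use.
function find_unset_vars($source, array $ignore = array())
{
    // Superglobals are always defined; $ignore holds project-specific globals.
    $known = array_merge(
        array('$this', '$GLOBALS', '$_GET', '$_POST', '$_COOKIE', '$_FILES',
              '$_SERVER', '$_REQUEST', '$_ENV', '$_SESSION'),
        $ignore
    );
    $suspects = array();
    $tokens = token_get_all($source);
    $count = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_VARIABLE) {
            continue;
        }
        $name = $tokens[$i][1];
        if (in_array($name, $known, true) || isset($suspects[$name])) {
            continue;
        }

        // Skip whitespace to find the token that follows the variable.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }

        if ($j < $count && $tokens[$j] === '=') {
            // Rule 1: a plain assignment means the variable is set from here on.
            $known[] = $name;
        } else {
            // Rule 2: anything else counts as a use before set.
            $suspects[$name] = $tokens[$i][2];
        }
    }
    return $suspects;
}

// Example input: $user_id is never assigned in this file.
$code = '<?php
$total = 1;
echo $total;
echo $user_id;
';
print_r(find_unset_vars($code));                     // flags $user_id on line 4
print_r(find_unset_vars($code, array('$user_id'))); // nothing flagged
```

The second argument is the ignore list mentioned above, for globals set up in prepends and includes.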
Thu, May 7, 2009 11:57 AM
Last year I was surprised to be
going to Velocity. Read the post, it was an
adventure. But, I
really like the conference. It is the perfect conference
for me. While a good majority of my work is done coding
PHP/MySQL apps, I tend to focus on architecture, frameworks,
performance and that kind of stuff. So, a web performance and
operations conference is just perfect.
Last year, I was on a panel with some
great guys. I was able to share just a bit about my
experience dealing with the instant success of a web site.
This year, my proposal was accepted to talk more about dealing with
success of a web site. The talk will be focused on my
experience at dealnews.com and
from working with power users for Phorum. Here is the summary:
Lots of people talk about scaling and performance. But,
are they preparing for all the things that could
happen? There are multiple problems and there is not
one solution to solve them all.
Everything is running fine and BAM! – your site is linked from the front
page of Yahoo! What do you do? How can you handle that
sudden rush of traffic? Requests per second are running
5x normal levels. Servers have CPU spikes. Daemons are hitting the
maximums. You are running out of bandwidth. How could
you have been prepared for this? What are the tools and
techniques for this type of sudden rush?
Or, let's say you have just come out of a meeting where
everyone discovered that your site is growing in
traffic 70% – 80% year over year. That means that 1
million page views this month will be nearly 3 million
this time in 2 years. How can you plan for that? You
don’t want to redesign the whole architecture every 2
years. What methods could be used to deal with this
constant long term growth?
While there is no magic bullet for either of these
scenarios, there are techniques used by many sites out
there to help you get through these situations. This
session will cover some of these techniques and talk
about their pros and cons.
I must admit, this is the first time since 2000 that I am a
little intimidated to speak at a conference. The
people that present and attend Velocity are so
awesome. I just hope I don't disappoint.
Tue, Apr 21, 2009 12:48 PM
I just discovered an incompatibility between Net Gearman and PHP
5.2.9+. json_decode was changed in 5.2.9 to
return NULL on invalid JSON strings. Previously, the bare
string had been returned if it was not valid JSON. This was
nice in a way as you could pass a scalar string to json_decode and
not worry about it. But, in reality, it would make debugging
a nightmare for JSON.
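For code that relied on the old pass-through behavior, a small wrapper can bring it back explicitly. This is only a sketch and not the actual change in my fork: the name decode_payload is made up, and it assumes PHP 5.3+ so that json_last_error() is available.

```php
<?php
// Hypothetical wrapper: decode JSON, but hand back the raw string when the
// input is not valid JSON (the pre-5.2.9 behavior, made explicit).
function decode_payload($raw)
{
    $decoded = json_decode($raw, true);

    // json_decode() also returns NULL for the valid JSON literal "null",
    // so check the error code instead of relying on the return value alone.
    if ($decoded === null && json_last_error() !== JSON_ERROR_NONE) {
        return $raw;
    }
    return $decoded;
}

var_dump(decode_payload('{"job":"resize"}')); // decoded array
var_dump(decode_payload('plain string'));     // falls back to the raw string
```

Whether the fallback is a good idea is a judgment call. It hides malformed JSON, which is exactly the debugging nightmare mentioned above.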
I have updated my github
fork and requested a pull into the main branch.
Once that is done a new PEAR release can be done.
Thu, Apr 16, 2009 11:13 PM
I am calling it. The death of the PHP die function. Now, I have no
actual authority to do so. My PHP CVS karma does not extend
that far. And I doubt it will actually get removed despite it
being nothing more than an alias for exit now.
No, what I would like to call a death to is the usage of die such
as:
$conn = mysql_connect($server, $user, $pass)
    or die("Could not connect to MySQL, but needed to tell the whole world");
I don't know who thought that particular usage was good, but they
need to .... no, that is harsh. I just really wish they had
never done that.
So, what should you use? Well, there are a couple of options
depending on what context you are working in and whether or not the
failure is actually catastrophic.
Exceptions
If you are using OOP in your PHP code, Exceptions are the logical choice
for dealing with errors. I have mixed feelings about
them. But, it has more to do with the catching of exceptions
than the throwing of them. If you are going to live in a
world of exceptions, please catch them and provide useful error
messages. The PHP world is not too bad about that, but I have
read too many Java error logs full of huge, verbose exception dumps
in my life already. Please don't follow that technique in
PHP.
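Here is a minimal sketch of what I mean by catching exceptions and giving a useful message. The function load_config and the file path are made up for illustration.

```php
<?php
// Hypothetical loader that throws instead of calling die().
function load_config($path)
{
    if (!is_readable($path)) {
        throw new RuntimeException("Config file not readable: $path");
    }
    return parse_ini_file($path);
}

try {
    $config = load_config('/no/such/file.ini');
} catch (RuntimeException $e) {
    // One short, useful line in the log instead of a full stack dump.
    error_log('Using default settings: ' . $e->getMessage());
    $config = array();
}
```

The caller decides whether the failure is fatal. Here it is not, so the script logs one line and carries on with defaults.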
trigger_error
The function trigger_error is quite
handy. It allows you, a common PHP coder, to create errors
just like the core system. So, the error messages are
familiar to anyone that is used to seeing PHP errors. So, if
your system is configured to log errors and not display them,
errors from trigger_error will be treated the same as built in
errors.
Also, errors thrown with trigger_error are caught by a custom
error handler
just like built in errors. They can be logged, printed,
whatever you want from that error handler, just like normal PHP
errors. There are even several levels of errors you can raise
like notices, warnings, errors, and even deprecated. Again,
just like the built in PHP errors.
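A minimal sketch of that, assuming PHP 5.3+ for the closure. The $logged array is just a stand-in for whatever logging you actually do.

```php
<?php
// Collect errors in an array; a real handler would log or display them.
$logged = array();

set_error_handler(function ($errno, $errstr, $errfile, $errline) use (&$logged) {
    $logged[] = array($errno, $errstr);
    return true; // tell PHP the error has been handled
});

// These reach the handler exactly like built in notices and warnings would.
trigger_error("Cache is cold", E_USER_NOTICE);
trigger_error("Disk almost full", E_USER_WARNING);

restore_error_handler();
print_r($logged);
```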
FATAL Errors
trigger_error is also the most suitable way, IMO, to end a script
immediately.
$conn = mysql_connect($server, $user, $pass);
if (!$conn) {
    // E_USER_ERROR is fatal: with the default error handler the script stops here.
    trigger_error("Could not connect to MySQL database.", E_USER_ERROR);
}
Now that will not be told to the whole world if you have
display_errors set to Off as you should in any production
environment.
Wed, Apr 8, 2009 08:00 AM
There are several key changes in Wordcraft 0.9.1. The two big
things are:
- Tokens on post forms in the admin to help ward off
CSRF attacks.
- Database schema updates automated.
The first comes as a result of us doing the same work on
Phorum recently. I realized I needed the same protection in
Wordcraft. The second was done out of necessity as I changed
the datetime fields in the database schema into int fields.
Not sure why I ever made them datetime fields. Unix
timestamps are much easier to work with. It saves many
strtotime() calls and will make eventual time zone settings much
easier to implement.
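For anyone who has not seen the token technique, here is a rough sketch of the general idea. This is not the Wordcraft or Phorum code. The function names are made up, the session is passed in as a plain array to keep the example self-contained, and random_bytes() and hash_equals() assume a much newer PHP (7.0+) than Wordcraft itself targets.

```php
<?php
// Hypothetical helpers illustrating a per-session form token.

// Create a token, remember it in the session, and return it so it can be
// written into a hidden form field.
function make_form_token(array &$session)
{
    $session['form_token'] = bin2hex(random_bytes(16));
    return $session['form_token'];
}

// Accept a POST only if the submitted token matches the one in the session.
function check_form_token(array $session, $submitted)
{
    return isset($session['form_token'])
        && is_string($submitted)
        && hash_equals($session['form_token'], $submitted);
}

$session = array();               // stand-in for $_SESSION
$token = make_form_token($session);
var_dump(check_form_token($session, $token));   // matching token
var_dump(check_form_token($session, 'forged')); // forged token
```

A request forged from another site cannot read the token, so it cannot submit a matching value.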
In addition to those two big ones, there were some notable small
ones:
- HTML 4.01 validation fixes
- Ensuring UTF-8 on all encoding function calls
- Protection against hitting the back button when writing a
post (most annoying on Macs as the back button and the
beginning of line keystroke is the same).
And there were a few other bug
fixes.
I will of course need many more testers and users before I can ever
declare this software as stable. If you need a simple blog,
give it a try.
About Wordcraft
Wordcraft aims to be a simple, lightweight blogging
application. Wordcraft is written exclusively for PHP 5+ and
MySQL 5.0+ using only the PHP mysqli extension, UTF-8, and HTML
4.01 to achieve that simplicity.
Tue, Apr 7, 2009 08:03 PM
For our development servers, we have always used output buffering
to replace the URLs (dealnews.com) with the URL for that
development environment. Where we run into problems is with
CSS and JavaScript. If those files contain URLs for images
(CSS) or AJAX (JS), the URLs would not get replaced. Our
solution has been to parse those files as PHP (on the dev boxes
only) and have some output buffering replace the URLs in those
files. That has caused various problems over the years and
even some confusion for new developers. So, I got to looking
for a different solution. Enter mod_substitute
for Apache 2.2.
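For reference, the output buffering approach looks roughly like this. It is a simplified sketch, not our actual prepend file: the callback name is made up and the host names are the same placeholders used in the Apache config further down.

```php
<?php
// Hypothetical output callback: swap production URLs for this dev box's URLs.
function rewrite_urls_for_dev($buffer)
{
    return str_ireplace(
        array('http://dealnews.com', 'http://dealmac.com'),
        array('http://somedevbox.dealnews.com', 'http://somedevbox.dealmac.com'),
        $buffer
    );
}

// Everything echoed after this point passes through the callback.
ob_start('rewrite_urls_for_dev');
echo '<a href="http://dealnews.com/deals">Deals</a>' . "\n";
ob_end_flush();
```

This only sees output that PHP generates, which is why static CSS and JavaScript files slip through unless they are also parsed as PHP.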
mod_substitute
provides a mechanism to perform both regular expression and
fixed string substitutions on response bodies. - Apache
Documentation
Cool! I put in the URL mappings and VOILA! All was
right in the world.
Fast forward a day. Another developer is testing some new
code and finds that his XML is getting munged. At first we
blamed libxml because we had just been through an ordeal with a bad
combination of a libxml compile option and PHP a while back.
Maybe we missed that box when we fixed it. We recompiled
everything on the dev box but there was no change. So I
started to think about what was recently different with the dev
boxes. So, I turned off mod_substitute. Dang, that fixed
it. I looked at my substitution strings and everything looked
fine. After cursing and being depressed that such a cool tool
was not working, I took a break to let it settle in my mind.
I came back to the computer and decided to try a virgin Apache 2.2
build. I downloaded the source from the web site instead of
building from Gentoo's Portage. Sure enough, a simple test
worked fine. No munging. So, I loaded up the dev box
Apache configuration into the newly compiled Apache. Sure
enough, munged XML. ARGH!!
Up until this point, I had configured the substitutions globally
and not in a particular virtual host. So, I moved it all into
one virtual host configuration. Still broken.
A little more background on our config. We use mod_proxy to
emulate some features that we get in production with our F5 BIG-IP
load balancers. So, all requests to a dev box hit a mod_proxy
virtual host and are then directed to the appropriate virtual host
via a proxied request.
So, I got the idea to hit the virtual host directly on its port and
skip mod_proxy. Dang, what do you know. It worked
fine. So, something about the output of the backend request
and mod_proxy was not playing nice. So, hmm. I got the
idea to move the mod_substitute directives into the mod_proxy
virtual host's configuration. Tested and working fine.
So, basically, this ensures that the substitution filtering is done
only after the proxy and all other requests have been
processed. I am no Apache developer, so I have not dug any
deeper. I have a working solution and maybe this blog post
will reach someone that can explain it. As for
mod_substitute, here is the way my config looks.
In the VirtualHost that is our global proxy, I have this:
FilterDeclare DN_REPLACE_URLS
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $text/
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/xml
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/json
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/javascript
FilterChain DN_REPLACE_URLS
Elsewhere, in a file that is local to each dev host, I keep the
actual mappings for that particular host:
Substitute "s|http://dealnews.com|http://somedevbox.dealnews.com|in"
Substitute "s|http://dealmac.com|http://somedevbox.dealmac.com|in"
# etc....
I am trying to think of other really cool uses for this. Any
ideas?
Fri, Mar 20, 2009 10:55 PM
I am working on Wordcraft, trying to get
the last annoying HTML validation errors worked out. Things
like ampersands in URLs. In doing so, I am asking myself
where the escaping should take place. In the case of Wordcraft,
there are several parts to it.
1. The code that pulls data from the database. Obviously
not the right place.
2. The code that formats data like dates and such. It
also organizes data from several data sources into one nice
tidy array. Hmm, maybe.
3. The parts of the code that set up the output data for the
templates.
4. The templates themselves.
Now, I am sure 1 is not the place. And I really would not
want 4 to be the place. That would make for some ugly
templating. Plus, the templates, IMO, should assume the data
is ready to be output. So, that leaves the code that does the
formatting and the code that does the data setup.
Of those two, I guess the place to do this job is in the data
setup. Wordcraft has a $WCDATA array that is available in the
scope of the templates. I suppose anything that goes into
that array should be escaped as appropriate.
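Something like the following is what I have in mind. It is only a sketch: escape_for_html is a made-up helper, not something that exists in Wordcraft, and the sample post data is invented.

```php
<?php
// Hypothetical helper: escape every string in a value (recursing into
// arrays) so templates can print it without further thought.
function escape_for_html($value)
{
    if (is_array($value)) {
        return array_map('escape_for_html', $value);
    }
    if (is_string($value)) {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
    return $value; // ints, bools and null need no escaping
}

$WCDATA = array();
$WCDATA['post'] = escape_for_html(array(
    'title' => 'Q&A',
    'url'   => 'http://example.com/?a=1&b=2',
    'id'    => 42,
));
print_r($WCDATA['post']);
```

One catch with a blanket approach like this: a post body that is already HTML must not go through it, so the data setup code still has to know which fields are plain text.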
I largely wrote this blog post as a
teddy bear exercise. But, I am curious. Where and
when do you escape your data for use in HTML documents?
Tue, Mar 3, 2009 09:51 PM
Have you ever noticed that PHP eats the newlines after a closing
PHP tag? Not sure what I mean? There is lots on
Google
about it. Here is an example.
Hello there!
<?php
// this is just a dumb PHP block
?>
How are you?
becomes:
Hello there!
How are you?
I was talking about this with a coworker tonight. He is
trying to generate some XML and, like me and Chris
Shiflett, is anal about his output. You see, what happens
in modern use of PHP as a template language is something like
this:
<?php
$subelement = range(1, 10);
?>
<somexml>
<element>
<?php
foreach($subelement as $e) { ?>
<subelement><?php echo $e; ?></subelement>
<?php } ?>
</element>
</somexml>
That code will output this mess:
<somexml>
<element>
<subelement>1</subelement>
<subelement>2</subelement>
<subelement>3</subelement>
<subelement>4</subelement>
<subelement>5</subelement>
<subelement>6</subelement>
<subelement>7</subelement>
<subelement>8</subelement>
<subelement>9</subelement>
<subelement>10</subelement>
</element>
</somexml>
So, why does PHP do this? Well, you have to go back 11
years. PHP 3 was emerging. I was just starting to
use it for Phorum at the
time. There were two reasons.
The first was that you would want the newline after the first
closing tag to be removed as it would remove the existence of the
PHP block completely. At the time, people were shunned for
writing PHP as a tag looking language. ColdFusion was new
then too and the PHP community liked to point and laugh at it.
The second case (and this is probably a more legitimate one) was
that many editors (some still do this for some insane reason) force
every friggin file to end in a newline. We did not have
output buffering in those days. It was the stone age
man. So, to get around the "Headers already sent" errors,
Zeev decided to make the PHP ending tag be "?> with an optional
newline". It was a heated debate on the PHP Internals (then
php-dev) list. So much that I remembered it and dug it up on
MARC.
Heck, now I want to add to it. I would like it please if PHP
could remove any leading, non-newline whitespace before an open
tag. That would solve this problem. Yeah, more
magic! Nothing like it.
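Until that magic exists, one way to get full control over the whitespace is to skip the template style for this kind of output and build the lines yourself. This is just one possible workaround, sketched with three sub-elements instead of ten; the indentation is whatever you put in the strings.

```php
<?php
$subelement = range(1, 3);

// Collect each output line with exactly the indentation we want.
$lines = array('<somexml>', '  <element>');
foreach ($subelement as $e) {
    $lines[] = '    <subelement>' . $e . '</subelement>';
}
$lines[] = '  </element>';
$lines[] = '</somexml>';

// Join with explicit newlines so nothing is left to the parser.
$xml = implode("\n", $lines) . "\n";
echo $xml;
```

It is uglier than a template, but no newline or indent appears in the output unless you typed it.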
To me, the worst alternative to all this is the lack of a closing
tag in a file. My OCD just
can't deal with that. Please, baby seals cry when you don't
use a closing tag.
Mon, Feb 23, 2009 08:00 AM
I am pleased to announce the release of Wordcraft
0.8. I have managed to release about once a month since
November. I also have actually gotten some feedback and
tickets posted. Thanks to those that have tried it out.
I have decided to go back to YUI's Editor. I tried TinyMCE in
the last release. But, using it full time I found it messed
with my HTML too much for my liking. When I would switch to
raw HTML mode and add something like a <code> tag, it would
be lost when saving the data back into the WYSIWYG editor.
I also converted the admin HTML to HTML 4.01 Transitional. I
never use XHTML anymore these days. So, I was writing invalid
XHTML inadvertently.
I worked on the session handling some more in this release.
Users should stay logged in to the admin better now.
I put comment blocks in all the files and documented every
function. This should help anyone wanting to dig in and help
out.
I fixed several bugs reported by users (or maybe just testers, not
sure). Thanks for that and keep the feedback
coming.