<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel>
        <title>Ramblings of a web guy (Tag: programming)</title>
        <description>Brian Moon, of dealnews.com, shares what he knows (and learns) about PHP, MySQL and other stuff</description>
        <link>http://brian.moonspot.net/feed.php?type=rss&amp;tag=programming</link>
        <lastBuildDate>Tue, 07 Sep 2010 21:06:33 -0500</lastBuildDate>
        <generator>Wordcraft 0.10</generator>
        <item>
            <guid>http://brian.moonspot.net/engineers-work-in-their-sleep</guid>
            <title>Engineers Work In Their Sleep</title>
            <link>http://brian.moonspot.net/engineers-work-in-their-sleep</link>
            <description><![CDATA[My grandmother worked at the <a href="http://www.nasa.gov/centers/marshall/home/index.html">Marshall Space Flight Center</a>. She worked with a lot of engineers. I have always remembered when she told my cousin and I that we should be engineers. Her reasoning was that if you saw a janitor sitting in a chair with his arms crossed and his eyes closed, you pretty much knew he was not working. But, if you see an engineer in the same position, you can not prove he is not working. Now, was she saying she wanted us to sleep on the job? No, of course not. The message was to have a job where you could use your mind. Although I think watching our grandfather work hard every day as a construction foreman may have shaped her opinion. You know what is funny about that? He was the healthiest man I have ever known. Maybe I don't need to be sitting behind a desk with my arms crossed and eyes closed?<br><br>For what it is worth, I don't believe that all jobs that use your mind are desk jobs. Some of the smartest people I know have jobs in fields that would be 
considered "manual" labor. I believe that smart people will always rise to the top no matter what your profession.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 07 Jul 2010 10:42:07 -0500</pubDate>
            <category>personal</category>
            <category>programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/10/15/php-appalachia-corrections/</guid>
            <title>PHP Appalachia Corrections</title>
            <link>http://brian.moonspot.net/2008/10/15/php-appalachia-corrections/</link>
            <description><![CDATA[Just got home finally from PHP Appalachia.&nbsp; I enjoyed meeting
all the great people.<br>
<br>
I presented about what I learned and how we deal with importing
large amounts of CSV data into MySQL.&nbsp; I threw my idea onto
the wiki at the last minute, made the slides while everyone ate
breakfast and I had planned on researching it all (been a few years
since I wrote it), but we had no reliable internet.&nbsp; Some
claims I made and their corrections.<br>

<ol>

    <li>I said our largest file is about 1.8 million lines.&nbsp;
    WRONG.&nbsp; Actually it is about 4.6 million.&nbsp; I was
    correct however that it does finish importing and indexing in
    about 5 minutes.
    </li>


    <li>I claimed I LOAD DATA INFILE to MyISAM first and then
    "insert into ... select from" into an InnoDB table for speed
    reasons.&nbsp; WRONG.&nbsp; In fact, I do that because I need
    to merge fields from the file sometimes into one field in the
    database.&nbsp; I could not find a way to do that with LOAD
    DATA INFILE.&nbsp; As to speed, I can't say either way as
    I have no solid data.&nbsp; Sounds like a good test.&nbsp;
    MyISAM probably still wins on a LOAD DATA INFILE into a blank,
    fresh table based on my experience.
    </li>


    <li>Total rows currently indexed is 7.2 million.&nbsp; I did
    not make a claim, but I thought I would just mention
    that.&nbsp; I wanted to include that, but did not have
    Internet.&nbsp; (Damn you Hughes)
    </li>

</ol>]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 14 Oct 2008 23:03:05 -0500</pubDate>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/10/03/deploying-scalable-websites-with-memcached/</guid>
            <title>Deploying Scalable Websites with Memcached </title>
            <link>http://brian.moonspot.net/2008/10/03/deploying-scalable-websites-with-memcached/</link>
            <description><![CDATA[I spoke at the MySQL Conference and Expo this year about the architecture we have here at <a href="http://dealnews.com/">dealnews.com</a>.  After my talk, Jimmy Guerrero of Sun/MySQL invited me to give a webinar on how dealnews uses memcached.  That is taking place next week, Thursday, October 09, 2008.  It is a free webinar.  We have used memcached in a variety of ways as we have grown. So, I will be talking about how dealnews used memcached in the past and present.<br />
<br />
For more information, visit the <a href="http://www.mysql.com/news-and-events/web-seminars/display-220.html">MySQL web site</a>.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Fri, 03 Oct 2008 09:55:45 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/09/20/strtotime-the-php-date-swiss-army-knife/</guid>
            <title>strtotime() - The PHP, date swiss army knife</title>
            <link>http://brian.moonspot.net/2008/09/20/strtotime-the-php-date-swiss-army-knife/</link>
            <description><![CDATA[Man, what did I do before <a href=
"http://php.net/strtotime">strtotime()</a>?&nbsp; Oh, I know, I had
a 482 line function to parse date formats and return
timestamps.&nbsp; And I still could not do really cool stuff.&nbsp;
Like tonight I needed to figure out when Thanksgiving was in the
US.&nbsp; I knew it was the 4th Thursday in November.&nbsp; So, I
started with some math stuff and checking what day of the week Nov.
1 would fall on.&nbsp; All that was making my head hurt.&nbsp; So,
I just tried this for fun.<br>

<pre>
strtotime("thursday, november ".date("Y")." + 3 weeks")
</pre><br>
That gives me Thanksgiving.&nbsp; Awesome.&nbsp; It is cool for
other stuff too.&nbsp; At its very basic, it can take a MySQL
datetime field and turn it into a timestamp.&nbsp; Very handy for
date calculations.&nbsp; It also understands <a class=
"link external" href="http://www.faqs.org/rfcs/rfc2822">RFC
2822</a> and ISO 8601 date formats.&nbsp; These are common in HTTP
headers and some XML documents like RSS and Atom feeds.&nbsp; Also,
PHP can output those two standard formats with the <a href=
"http://php.net/date">date()</a> function.&nbsp; So, this makes
them a good standards compliant way to pass full, timezone specific
dates around.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Fri, 19 Sep 2008 22:02:47 -0500</pubDate>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/07/03/caching-and-ttl-behavior/</guid>
            <title>Caching and TTL behavior</title>
            <link>http://brian.moonspot.net/2008/07/03/caching-and-ttl-behavior/</link>
            <description><![CDATA[So, I am working on <a href="http://code.google.com/p/memproxy/">MemProxy</a> some.  Mainly, I am trying to implement more of the Cache-Control header's many options.  The one that has me a bit perplexed is s-maxage, particularly when combined with max-age.<br />
<br />
s-maxage is the maximum time in seconds an item should remain in a shared cache.  So, if s-maxage is set by the application server, my proxy should keep it for that amount of time at the most.  Up until now, I have just been looking at max-age.  But, s-maxage is the proper one for a proxy to use if it is present.  I do not send the s-maxage through because this is a reverse proxy and, IMO, that is proper behavior for an application accelerating proxy.  However, I do send forward the max-age value that is set by the application servers.  If no max-age is set, I send a default as defined in the script.  Also, if no-cache or no-store is set, I send those and a max-age of 0.<br />
<br />
My problem arises when max-age is less than s-maxage.  Up until now, I have sent a max-age back to the client that represents the time left for the cached item in my proxy's cache.  So, if the app server sent back max-age=300 and a request comes in and the cache is found and the cache was created 100 seconds ago, I send max-age=200 back to the client.  But, I was only using max-age before.  Now, in cases where s-maxage is longer than max-age, I would come up with negative numbers.  That is not cool.  The easiest solution would be to always send the original max-age back to the client.  But, that seems kind of lame.<br />
<br />
So, my question is, if you are using an application (HTTP or otherwise) accelerator, what would you expect?  If your application set a max-age of 300, would you always expect the end client to receive a max-age of 300?  Or should it count down over time?  The only experience I have is a CDN.  If you watch CDN traffic, the max-age gets smaller and smaller over time until it hits 0.  I have not tried sending an s-maxage to my CDN.  I don't know what they would do with that.  Maybe that is a good test.<br />
<br />
UPDATE: Writing this gave me an idea.  If the item will be in the proxy cache longer than the max-age ttl, send the full max-age ttl.  Otherwise, send the time left in the proxy cache.  Thoughts on that?<br />
<br />
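The rule in the update can be sketched roughly like this.  It assumes the proxy keeps an item for s-maxage seconds and knows how long ago it cached the item.  The function and parameter names are made up for illustration and are not MemProxy's actual code.<br />
<br />

```php
<?php
/**
 * Sketch of the "full max-age or time left in the proxy" rule.
 * All names here are hypothetical, not taken from MemProxy.
 *
 * @param int $maxAge     max-age sent by the application server, in seconds
 * @param int $proxyTtl   how long the proxy keeps the item (s-maxage), in seconds
 * @param int $ageInCache seconds since the proxy cached the item
 * @return int max-age value to send to the client
 */
function client_max_age($maxAge, $proxyTtl, $ageInCache)
{
    // Time the item still has left in the proxy cache, never negative.
    $remaining = max(0, $proxyTtl - $ageInCache);

    // If the item outlives max-age in the proxy, send the full max-age.
    // Otherwise send only the time left in the proxy cache.
    return min($maxAge, $remaining);
}

echo client_max_age(300, 600, 100), "\n"; // 300
echo client_max_age(300, 600, 500), "\n"; // 100
```

With this rule the client never sees a negative max-age, and it only sees a countdown once the proxy's copy is close to expiring.<br />
<br />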
(thanks for being my <a href="http://compaspascal.blogspot.com/2007/12/teddy-bear-principle-in-programming.html">teddy bear</a> blogosphere)]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 02 Jul 2008 23:56:25 -0500</pubDate>
            <category>Caching</category>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/07/01/velocity-conference-roundup/</guid>
            <title>Velocity Conference Roundup</title>
            <link>http://brian.moonspot.net/2008/07/01/velocity-conference-roundup/</link>
            <description><![CDATA[As I said before, I was <a href="http://brian.moonspot.net/2008/06/18/did-you-know-i-am-going-to-be-at-velocity/">invited to be on a panel at Velocity Conference</a>.  I was delighted to go.  I had never been to San Francisco.  I have been to Portland and Santa Clara several times.  The panel was great.  It was the Brian and photo sharing sites show.  Seriously, it was me (dealnews.com), John Allspaw of <a href="http://www.flickr.com/">Flickr</a>, Don MacAskill of <a href="http://www.smugmug.com/">SmugMug</a> and Farhan Mashraqi of <a href="http://www.fotolog.com/">Fotolog</a>.  Oh, there was also Shayan Zadeh of <a href="http://www.zoosk.com/">Zoosk</a>, a social dating network and Michael Halligan, a consultant from <a href="http://www.bitpusher.com/">BitPusher</a>.  We all had similar ideas.  I told my <a href="http://brian.moonspot.net/2006/12/22/is-yahooed-a-word/">Yahoo story</a>.  I told everyone that they should denormalize (or optimize as Farhan preferred) their data to improve performance.  Others agreed.  I have written about my methods for denormalizing normalized data before.  (See <a href="http://brian.moonspot.net/2007/06/23/caching-and-patience/">pushed cache</a>)  Fun was had by all.<br />
<br />
I mentioned John Allspaw above.  He gave a talk on his own as well.  It was good.  The <a href="http://www.slideshare.net/jallspaw/velocity2008-capacity-management1-484676">slides are on SlideShare</a>.  He and I see eye to eye on a lot of things.  One thing he says in there that may shock a lot of people is to test using production.  I agree fully.  We could have never been sure our infrastructure was ready last year without testing the production servers.<br />
<br />
I also learned about <a href="http://varnish.projects.linpro.no/">Varnish</a> at the conference. It is a super fast reverse proxy.  It uses the virtual memory systems of recent kernels to store its cache.  The OS worries about moving things from memory to disk based on usage.  The claim is that the OSes are better at this than any programmer could do (without copying them of course).  It is fast.  The developers are proud.  And by proud I mean cocky.  I have been playing with it.  As you know, I have my own little <a href="http://code.google.com/p/memproxy/">caching proxy solution</a>.  Varnish is much faster, as I expected.  However, storing cache in memcached is very attractive to me.  Varnish can't do that.  It would likely slow it down a great deal.  MemProxy does do that.  Also, because MemProxy is written in PHP and my application layer is PHP, I can do things at the proxy layer to inspect the request and take action.  Works well for my use.  But, if you are using squid or mod_cache or something, you may want to give Varnish a look.<br />
<br />
There was a good bit of information about the client side of performance.  There were folks from Microsoft there talking about IE8.  It looks like IE8 will catch up with the other browsers in a lot of ways.  Yahoo talked about <a href="http://www.slideshare.net/stoyan/image-optimization-7-mistakes">image optimization</a>.  Good stuff in there.  I use Fireworks and it does a pretty good job of making small images.  I am looking more into combining images and making image maps that use CSS.  We use a CDN, but fewer connections is better for users.<br />
<br />
There was also a lot of great debate.  SANs rock!  SANs suck!  Rails Scales!  Rails Sucks!  The Cloud is awesome!  The Cloud is a lie!  (lots of cloud)<br />
<br />
I had dinner both nights with guys from Six Apart.  Good conversations were had.  I don't know if I am a big vegan fan though.  I mean, the food was good, but it all kinda tasted the same.  Perhaps I ordered poorly.  At dinner on Tuesday I met a guy going to work for Twitter soon.  He is an engineer that hopefully will be another step toward getting them back to 100% again.  Let's keep our fingers crossed.<br />
<br />
They did announce that the conference would be held again next year.  I am definitely going back.  Probably two of us from dealnews will go.  OSCON is fun.  MySQL conference is too.  But, more and more, capacity planning and scaling is what I do.  And this conference is all about those topics.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 01 Jul 2008 01:01:56 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
            <category>Scalability</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/18/did-you-know-i-am-going-to-be-at-velocity/</guid>
            <title>Did you know I am going to be at Velocity?</title>
            <link>http://brian.moonspot.net/2008/06/18/did-you-know-i-am-going-to-be-at-velocity/</link>
            <description><![CDATA[Well, neither did I until today. HA!<br />
<br />
<a href="http://en.oreilly.com/velocity2008/public/content/home">Velocity</a> is a new O'Reilly conference dedicated to "Optimizing Web Performance and Scalability".  It starts next Monday.  Yesterday I was contacted by <a href="http://en.oreilly.com/velocity2008/public/schedule/speaker/2314">Adam Jacobs</a> of HJK Solutions about taking part in a <a href="http://en.oreilly.com/velocity2008/public/schedule/detail/4762">panel discussion</a> about what happens when success comes suddenly to a web site.  I think he thought I was in the bay area.  Little did he know I am in Alabama.  But, amazingly, I was able to work it all out so I can be there.  I wish I had known about this conference ahead of time.  It sounds really awesome.  Performance has always been something I focus on.  I hope to share some and learn at the same time.<br />
<br />
So, if you are going to be there, come see our panel.<br />
<br />
P.S. Thanks to John Allspaw of Flickr for recommending me to Adam.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 17 Jun 2008 23:31:32 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/17/an-introduction-to-mysql-birmingham-al/</guid>
            <title>An Introduction to MySQL - Birmingham, AL</title>
            <link>http://brian.moonspot.net/2008/06/17/an-introduction-to-mysql-birmingham-al/</link>
            <description><![CDATA[I am giving a talk titled "An Introduction to MySQL" here in <a href="http://upcoming.yahoo.com/event/726700">Birmingham, AL on June 21, 2008 at 3PM</a>.<br />
<br />
I love living in Alabama.  I was born and raised in Huntsville.  However, Birmingham has always seemed a bit behind in technology compared to what I do for a living.  There is good reason.  The industry here is medical, banking, industrial and utilities.  I don't really want my doctors keeping my medical records in an alpha release of anything.  Same goes for my banking and utilities.  But, as <a href="http://www.indeed.com/jobs?q=mysql+php&amp;l=Birmingham%2C+AL">this page shows</a>, the companies here are catching up.  So, I am happy to present MySQL to as many people as I can in this town.  Hopefully I will help some folks that have not been exposed to MySQL or any open source for that matter.<br />
<br />
The event is part of our local Linux user group's (BALU) <a href="http://bham-lug.org/meetings.html">planned events</a>.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Mon, 16 Jun 2008 19:00:25 -0500</pubDate>
            <category>Linux</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/11/memproxy-01/</guid>
            <title>MemProxy 0.1</title>
            <link>http://brian.moonspot.net/2008/06/11/memproxy-01/</link>
            <description><![CDATA[<a href="http://memproxy.googlecode.com/files/memproxy-0.1.tar.gz">MemProxy 0.1 is out</a>!  It has taken me a while, but I have finally gotten around to releasing the code that I credited with saving us during a <a href="http://brian.moonspot.net/2006/12/22/is-yahooed-a-word/">Yahoo! mention</a>.  It is a caching proxy "server" that uses memcached for storing the cache.  I put server in quotes because it is really just a PHP script that handles the caching and talking to the application servers.  Apache and other HTTP servers already do a good job talking HTTP to a vast myriad of clients.  I did not see any reason to reinvent the wheel.  Here are some of the features that make it different from anything I could find:<br />
<ul><br />
	<li>Uses memcached for storage</li><br />
	<li>Serves cache headers to clients based on TTL of cached data</li><br />
	<li>Uses custom headers to assemble multiple pieces of cache into one object</li><br />
	<li>Minimal dependencies.  Only PHP and pecl/memcached needed.</li><br />
	<li>Small code base.  It is just two files, one when settings are cached.</li><br />
	<li>Application agnostic.  If the backend is hosted on an HTTP server this can cache it.</li><br />
</ul><br />
Some other things it does that you might expect:<br />
<ul><br />
	<li>Handles HTTP 1.1 requests to the backend</li><br />
	<li>Allows TTLs set by the standard Cache-Control header</li><br />
	<li>Appears transparent to the client.</li><br />
	<li>Sends proper HTTP error codes relating to proxies/gateways</li><br />
	<li>Allows pages to be refreshed or removed from cache</li><br />
	<li>Allows a page to be viewed from the application server without caching it</li><br />
	<li>more....</li><br />
</ul><br />
You can find the code on <a href="http://code.google.com/p/memproxy/">Google Code</a>.  The code (or something like it rather) has been in use at <a href="http://dealnews.com/">dealnews</a> for well over a year.  But, this is a new code base.  It had to be refactored for public consumption.  So, there may be bugs.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 10 Jun 2008 20:46:15 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/06/oop-does-not-equal-portable-or-shareable/</guid>
            <title>OOP does not equal portable or shareable</title>
            <link>http://brian.moonspot.net/2008/06/06/oop-does-not-equal-portable-or-shareable/</link>
            <description><![CDATA[So, just now, I was reading a good <a href="http://blog.stuartherbert.com/php/2008/06/06/on-management-false-sirens-and-the-threat-of-rails/">Rails post by Stuart Herbert</a> and nodding my head along.  I have not gotten into the Rails bashing fun on my blog, but I do poke fun at it around the office.  Then I got to this part:<br />
<blockquote><span style="color:#000000;">The OO in Rails continues to leave PHP for dead, and </span><span style="color:#000000;">OO brings many advantages to a thriving development community.  There are </span><span style="color:#000000;">real advantages to being able to share code between both the must-be-real-time web front-end and the non-real time backends, and to be able to easily reuse whatever external open-source libraries save you time and effort.</span></blockquote><br />
Now, I have no idea about the first part.  I am not an OOP guy.  But, what I take issue with is the idea that for code to be reusable, it has to be OOP.  So, if I am a college kid or young PHP developer, I would read this and think "Oh, so, to reuse code or share it, I have to be using OOP".  Man, this is just so dead wrong and irresponsible.  Can someone tell me why only OOP can be reused?  Why can't people write sane functions that can be reused?  I do it every day.  They do it in C all the time.  Our front end web servers run the same code base as the cron jobs that do a wide variety of things.  They use the same libraries.  They use the same objects (yeah, I use them when they are a good idea).<br />
<br />
Please, someone explain this to me.<br />
<br />
(I have a half written post about how you can write good, maintainable, reusable code without OOP.  I have not finished it yet, but I guess I need to.  It seems the world is going to OOP hell otherwise.)]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Fri, 06 Jun 2008 14:29:57 -0500</pubDate>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/05/in_array-is-quite-slow/</guid>
            <title>in_array is quite slow</title>
            <link>http://brian.moonspot.net/2008/06/05/in_array-is-quite-slow/</link>
            <description><![CDATA[So, we had a cron job hanging for hours.  No idea why.  So, I started debugging.  It all came down to a call to in_array().  See, this job is importing data from a huge XML file into MySQL.  After it is done, we want to compare the data we just added/updated to the data in the table so we can deactivate any data we did not update.  We were using a mod_time field in mysql in the past.  But, that proved to be an issue when we wanted to start skipping rows from the XML that were present but unchanged.  Doing that saved a lot of MySQL writes and sped up the process.<br />
<br />
So, anyhow, we have this huge array of ids accumulated during the import.  An IN clause with 2 million parts would suck.  So, we suck back all the ids in the database that exist and stick that into an array.  We then compared the two arrays by looping one array and using in_array() to check if the value was in the second array.  Here is a pseudo example that shows the idea:<br />
<br />
[sourcecode language='php']<br />
<br />
foreach($arr1 as $key=>$i){<br />
<br />
if(in_array($i, $arr2)){<br />
<br />
unset($arr1[$key]);<br />
<br />
}<br />
}<br />
<br />
[/sourcecode]<br />
<br />
So, that was running for hours with about 400k items.  Our data did not contain the value as the key, but it could as the value was unique.  So, I added it.  So, now, the code looks like:<br />
<br />
[sourcecode language='php']<br />
<br />
foreach($arr1 as $key=>$i){<br />
<br />
if(isset($arr2[$i])){<br />
<br />
unset($arr1[$key]);<br />
<br />
}<br />
}<br />
<br />
[/sourcecode]<br />
<br />
Yeah, that runs in .8 seconds.  Much better.<br />
<br />
So, why were we using in_array to start with if in_array is clearly not the right solution to this problem?  Well, it was basic code evolution.  Originally, these imports would be maybe 100 items.  But, things changed.<br />
<br />
FWIW,  I tried array_diff() as well.  It took 25 seconds.  Way better than looping and calling in_array, but still not as quick as a simple isset check.  There was refactoring needed to put the values into the keys of the array.<br />
<br />
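Here is a small, self-contained sketch of the values-as-keys idea.  It assumes array_flip() is used to build the lookup array, which is just one way to put the values into the keys and not necessarily what our import does.  The ids are made up.<br />
<br />

```php
<?php
// Hypothetical sample data standing in for the two id lists.
$importedIds = array(101, 102, 103, 104); // ids collected during the import
$existingIds = array(102, 104, 105);      // ids pulled back from the database

// Flip once so the ids become keys. Each isset() is then a hash lookup
// instead of a scan of the whole array, which is what in_array() does.
$existingLookup = array_flip($existingIds);

foreach ($importedIds as $key => $id) {
    if (isset($existingLookup[$id])) {
        unset($importedIds[$key]);
    }
}

// Only the ids that are not in $existingIds are left.
echo implode(",", $importedIds), "\n"; // 101,103
```

array_flip() only works here because the ids are unique integers.  Duplicate values would collapse into a single key.<br />
<br />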
<strong>UPDATE:</strong> I updated this post to properly reflect that there is nothing wrong with in_array, but simply that it was not the right solution to this problem.  I wrote this late and did not properly express this.  Thanks to all those people in the comments that helped explain this.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 05 Jun 2008 02:38:18 -0500</pubDate>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/06/03/stupid-php-tricks-normalizing-simplexml-data/</guid>
            <title>Stupid PHP Tricks: Normalizing SimpleXML Data</title>
            <link>http://brian.moonspot.net/2008/06/03/stupid-php-tricks-normalizing-simplexml-data/</link>
            <description><![CDATA[<a href="http://us3.php.net/manual/en/book.simplexml.php">SimpleXML</a> is neat.  Some people don't think it is so simple.  Boy, use the old stuff.  The <a href="http://us3.php.net/manual/en/book.domxml.php">DOM-XML</a> stuff.<br />
<br />
Anyhow, one annoying thing about SimpleXML has to do with caching.  When using web services, we often cache the contents we get back.  We were having a problem where we would get an error about a SimpleXML node not existing.  We were caching the data in memcached which serializes the variable.  So, when it unserialized the variable, there were references in there to some SimpleXML nodes that we did not take care of.  Basically, a tag like:<br />
<br />
<code>&lt;foo&gt;bar&lt;/foo&gt;</code><br />
<br />
is a string.  But a tag like:<br />
<br />
<code>&lt;foo&gt;&lt;/foo&gt;</code><br />
<br />
is an empty SimpleXML Object.  That is a little annoying, but I don't feel like digging into the C code and figuring out why.  So, we just work around it.  We made a recursive function to do the dirty work for us.<br />
<br />
<code>function makeArray($obj) {<br />
$arr = (array)$obj;<br />
if(empty($arr)){<br />
$arr = "";<br />
} else {<br />
foreach($arr as $key=&gt;$value){<br />
if(!is_scalar($value)){<br />
$arr[$key] = makeArray($value);<br />
}<br />
}<br />
}<br />
return $arr;<br />
}<br />
</code><br />
That will turn whatever you pass it into an array or empty string if it is empty.<br />
<br />
But, while I was hacking around tonight, I came up with another idea.  Check out this hackery:<br />
<br />
<code>$data = json_decode(json_encode($data));</code><br />
<br />
Yeah!  One liner.  That converts all the SimpleXML elements into stdClass objects.  All other vars are left intact.<br />
<br />
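To see what the one liner does, here is a minimal sketch.  The XML string is a made up stand-in for a web service response, and serialize()/unserialize() stand in for what the cache does to the variable.<br />
<br />

```php
<?php
// Hypothetical XML standing in for a web service response.
$xml = simplexml_load_string('<root><foo>bar</foo><empty></empty></root>');

// Round trip through JSON: SimpleXMLElement objects become stdClass objects.
$data = json_decode(json_encode($xml));

// Plain objects survive serialize()/unserialize(), which is what
// storing the variable in a cache relies on.
$restored = unserialize(serialize($data));

var_dump($restored->foo);                       // string(3) "bar"
var_dump($restored->empty instanceof stdClass); // bool(true)
```

Passing true as the second argument to json_decode() returns nested arrays instead of stdClass objects, if that is easier to work with.<br />
<br />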
Ok, so this is where someone in the comments can tell me about the magic SimpleXML method or magic OOP function I have missed to take care of all this.  Go ahead, please make my code faster.  I dare you.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Mon, 02 Jun 2008 21:59:04 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/28/short-array-syntax-for-php/</guid>
            <title>Short Array Syntax for PHP</title>
            <link>http://brian.moonspot.net/2008/05/28/short-array-syntax-for-php/</link>
            <description><![CDATA[So, I was asked in IRC today about the proposed short array syntax for PHP.  For those that don't know, I mean the same syntax that other languages (javascript, perl, python, ruby) all have.  Currently in PHP we have this:<br />
<br />
$var = array(1,2,3);<br />
<br />
The proposed additional syntax is:<br />
<br />
$var = [1,2,3];<br />
<br />
So, I voted +1 for this feature on the PHP Internals list.  A colleague asked me why I voted +1.  At first I had no good answer other than it was just a gut feeling.  It just feels like a good addition to the language.  It is common among web languages and therefore users coming into PHP from other languages may find it more comfortable.<br />
<br />
The best thing I could tell him was that it would make arrays fall in line with other data types in PHP.  For example, you never write:<br />
<br />
$var = int(1);<br />
<br />
$var = string(foo);<br />
<br />
So, why oh why do we have to have what looks like a function, but in reality is not, for creating an array?  It is a language construct and should look like a language construct.  I think the [ ] syntax makes more sense when you think about it in those terms.<br />
<br />
I say <a href="http://marc.info/?l=php-internals&amp;m=121151618528857&amp;w=2">commit it Andi</a>.  That seems to be what everyone else does. =)]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 28 May 2008 09:18:32 -0500</pubDate>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/14/php-session-cookie-refresh/</guid>
            <title>PHP session cookie refresh</title>
            <link>http://brian.moonspot.net/2008/05/14/php-session-cookie-refresh/</link>
            <description><![CDATA[I have always had an issue with PHP Sessions. Albeit, a lot of my issues are now invalid. When they were first implemented, they had lots of issues.  Then the $_SESSION variable came to exist and it was better. Then memcached came to exist and you could store sessions there. That was better. But, still, after all this time, there is one issue that still bugs me.<br>
<br>
When you start a session, if the user had no cookie, they get a new session id and they get a cookie. You can configure that cookie to last for n seconds via php.ini or session_set_cookie_params(). But, and this is a HUGE but for me, that cookie will expire in n seconds no matter what. Let me explain further. For my needs, the cookie should expire in n seconds <strong>from last activity. </strong>So, each page load where sessions are used should reset the cookie's expiration. This way, if a user leaves the site, they have n seconds to come back and still be logged in.<br>
<br>
Consider an application that sets the cookie expiration to 5 minutes. The person clicks around on the site, gets a phone call that lasts 8 minutes and then gets back to using the site. Their session has expired!!!! How annoying is that? The only sites I know that do that are banks. They have good reason. I understand that.<br>
<br>
My preference would be either an ini value that tells PHP to keep the session active as long as the user is using the site, or access to the internal function php_session_send_cookie(). That is the C function that sends the cookie to the user's browser. Hmm, perhaps a patch is in my future.<br>
<br>
In the short term, this is what I do:<br>
<code><br>
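// Re-send the session cookie so its expiration counts from this request.<br>
// Call this after session_start() and before any output is sent.<br>
// Assumes session.cookie_lifetime is greater than 0.<br>
setcookie(<br>
session_name(),<br>
session_id(),<br>
time() + (int) ini_get("session.cookie_lifetime"),<br>
ini_get("session.cookie_path"),<br>
ini_get("session.cookie_domain"),<br>
(bool) ini_get("session.cookie_secure"),<br>
(bool) ini_get("session.cookie_httponly")<br>
);<br>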
</code><br>
<br>
That will set the session cookie with a fresh ttl.<br>
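<br>
One catch with the snippet above: if session.cookie_lifetime is 0 (a cookie that lasts until the browser closes), adding it to time() would expire the cookie right away. Here is a rough sketch of how the refresh could be wrapped up to handle that case. The function names session_cookie_expiry() and refresh_session_cookie() are made up for this example, and it assumes PHP 7 or later for the type declarations.<br>

```php
<?php
// Hypothetical helper: compute the expiration timestamp for a refreshed
// session cookie. A lifetime of 0 (or less) means "until the browser closes",
// so 0 is returned instead of adding the lifetime to the current time.
function session_cookie_expiry(int $lifetime, int $now): int
{
    return $lifetime > 0 ? $now + $lifetime : 0;
}

// Hypothetical wrapper: re-send the session cookie with a fresh expiration.
// Call it after session_start() and before any output has been sent.
function refresh_session_cookie(): bool
{
    // Reads the lifetime, path, domain, secure and httponly settings
    // that the current session is using.
    $params = session_get_cookie_params();

    return setcookie(
        session_name(),
        session_id(),
        session_cookie_expiry((int) $params['lifetime'], time()),
        $params['path'],
        $params['domain'],
        (bool) $params['secure'],
        (bool) $params['httponly']
    );
}
```

Calling refresh_session_cookie() right after session_start() on each page load should give the "n seconds from last activity" behavior described above. Keeping the timestamp math in its own function makes it easy to check without a live session.<br>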
<br>
Ok, going to dig into some C code now and see if I can make a patch for this.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 13 May 2008 19:40:47 -0500</pubDate>
            <category>memcached</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/07/thoughts-on-the-2008-mysql-conference-and-expo/</guid>
            <title>Thoughts on the 2008 MySQL Conference and Expo</title>
            <link>http://brian.moonspot.net/2008/05/07/thoughts-on-the-2008-mysql-conference-and-expo/</link>
            <description><![CDATA[Well, it has been almost a month.  I know I am late to the blogosphere on my thoughts.  Just been busy.<br />
<br />
Again this year, the Phorum team was invited to be a part of the DotOrg Pavilion.  What is that?  Basically they just give expo floor space to open source projects.  It is cool.  We had a great location this year.  We were right next to the area where they served food and drinks during the breaks.  We had lots of traffic and met some of our power users.  <a href="http://www.imvu.com/">IMVU.com</a> is getting 1.5 million messages per month in their Phorum install.  They did have to customize it to fit into their sharding.  But, that is expected.  A guy (didn't catch his name) from Innobase came by and told us that they just launched <a href="http://forums.innodb.com/">InnoDB support forums</a> on their site using Phorum.  Cool.  So now MySQL and Innobase use Phorum.  I am humbled by the message that sends to me about Phorum.<br />
<br />
Speaking of our booth, we were right next to the <a href="http://www.phpmyadmin.net/">phpMyAdmin</a> guys.  Wow, that product has come a long way.  I was checking out the visual database designer they have now.  It was neat.  I also met the Gentoo MySQL package maintainer.  He was in the phpMyAdmin booth.<br />
<br />
I was interviewed by <a href="http://www.webdevradio.com/">WebDevRadio</a> as I <a href="http://doughboy.wordpress.com/2008/05/03/interview-with-webdevradio/">already posted</a>.  I was also asked to do a short Q&amp;A with the Sun Headlines video team.  They used one part of my clip.  I won't link to that.  No, if you find it, good for you.  I need to be interviewed some more or something.  I did not look comfortable at all.<br />
<br />
There were lots of companies with <em>open</em> in their name or slogan.  I guess this is expected pandering.<br />
<br />
I attended part of the InnoDB talk given by <a href="http://en.oreilly.com/mysql2008/public/schedule/speaker/88">Mark Callaghan</a> of Google.  It appears that Google is serious about improving InnoDB on large machines.  That is, IMO, good news for anyone that likes InnoDB.  If I counted right, they had more than 5 people for whom improving InnoDB is at least part of their job.<br />
<br />
I gave my two talks.  The first had low attendance, but the feedback was nice.  It was just after the snack break in the expo hall and I was in the farthest room from the expo hall.  That is what I keep telling myself. =)  The second was better attended and the feedback seemed good there.  I was told by Maurice (Phorum Developer) that I talked too fast and at times sounded like Mr. Mackey from South Park by repeating the word <em>bad </em>a lot.  I will have to work on that in the future.  I want to do more speaking.<br />
<br />
On the topic of my second talk, there seemed to be a lot of "This is how we scaled our site" talks.  I for one found them all interesting.  Everyone solves the problem differently.<br />
<br />
Next year I am thinking about getting more specific with my talk submissions.  Some ideas include: PHP, MySQL and Large Data Sets, When is it ok to denormalize your data?, Using memcached (not so much about how it works), Index Creation (tools, tips, etc.).<br />
<br />
In closing, I want to give a big thanks to Jay Pipes and Lenz Grimmer from MySQL.  Despite Jay's luggage being lost he was still a big help with some registration issues among other things.  Both of them helped out the Phorum team a great deal this year.  Thanks guys.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 07 May 2008 12:48:47 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/06/amazon-mp3-store-has-holes/</guid>
            <title>Amazon MP3 Store has holes</title>
            <link>http://brian.moonspot.net/2008/05/06/amazon-mp3-store-has-holes/</link>
            <description><![CDATA[A coworker found out <a href="http://somogyiperspective.blogspot.com/2008/05/amazon-does-not-want-my-money.html">how secure Amazon's MP3 store</a> is.  Even big guys like Amazon make errors in their web site security.<br />
<blockquote><em>So, I clicked purchase and the album immediately started downloading. It was at this point that I had the thought cross my mind: "Did I update my credit card info?"<br />
<br />
Well, no, I didn't. Before the album finished downloading, I was trying to change the method of payment. Turns out, for a digital purchase, you can't do such a thing. So, I waited and wondered was was going to come of this...</em></blockquote>]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 06 May 2008 11:03:04 -0500</pubDate>
            <category>Apple</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
            <category>Web Security</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/06/example-mycnf-files/</guid>
            <title>Example my.cnf files</title>
            <link>http://brian.moonspot.net/2008/05/06/example-mycnf-files/</link>
            <description><![CDATA[UPDATE: There are some examples being added at the <a href="http://forge.mysql.com/tools/search.php?k=mycnf&amp;t=tag">MySQL Forge</a> now.<br />
<br />
When I first started installing MySQL for myself, it was quite handy to have the example my.cnf files in the source package.  I was a noob to the MySQL configuration.  Even after I became more experienced, I would use them as a starting point.  However, I now find that they are so behind the times they are not as useful.  Here are some of the comments from the files.<br />
<br />
<strong>my-small.cnf</strong><br />
<br />
# This is for a system with little memory (&lt;= 64M) where MySQL is only used<br />
# from time to time and it's important that the mysqld daemon<br />
# doesn't use much resources.<br />
<br />
<strong>my-medium.cnf</strong><br />
<br />
# This is for a system with little memory (32M - 64M) where MySQL plays<br />
# an important part, or systems up to 128M where MySQL is used together with<br />
# other programs (such as a web server)<br />
<br />
<strong>my-large.cnf</strong><br />
<br />
# This is for a large system with memory = 512M where the system runs mainly<br />
# MySQL.<br />
<br />
<strong>my-huge.cnf</strong><br />
<br />
# This is for a large system with memory of 1G-2G where the system runs mainly<br />
# MySQL.<br />
<br />
I end up using the large or huge files as a starting point for every server I set up by hand.  The small and medium should be renamed underpowered and teeny-tiny.  Who has less than 64MB of RAM on a server now?  Can you even buy sticks of memory that small in any modern system?  Most come with 256MB sticks minimum.  And they never come with just one stick.<br />
<br />
I will use the large example as a starting point for a server that has 2GB of RAM and will be running an entire site on one server.  I use huge for any server that runs only MySQL.  And even then, most of them have 4GB of RAM or more.<br />
<br />
I don't know if anyone at MySQL has plans on tweaking these files or not.  Perhaps those good guys at the <a href="http://www.mysqlperformanceblog.com/">MySQL Performance Blog</a> or <a href="http://www.percona.com/">Percona</a> could create some example my.cnf files.  I could put some out there, but I fear their sole purpose would be for someone to point out what I am doing wrong. =P  Hey, they work for me.  Hmm, maybe this would make a good <a href="http://forge.mysql.com/">MySQL Forge</a> section.  A whole area of user contributed my.cnf files.  They could be architecture specific and everything.  What runs best on Solaris?  Linux?  BSD?  Windows?  32-bit?  64-bit?<br />
<br />
One thing I would for sure like to see is example files for InnoDB dominant servers.  Most of our servers run primarily InnoDB tables.  None of the examples above covers InnoDB.  They have comments, but no preconfigured values.  I have seen more than one server using InnoDB tables without any custom InnoDB configuration in its my.cnf.  In the end that is the fault of the server admin/owner, no doubt.<br />
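<br />
As a rough sketch of what an InnoDB oriented example file might contain, something like the following could be a starting point. It assumes a dedicated MySQL server with 4GB of RAM, and every value is an illustrative guess rather than a tested recommendation.<br />

```ini
# Hypothetical starting point for a dedicated MySQL server with 4GB of RAM
# that runs mostly InnoDB tables. All values are illustrative assumptions.
[mysqld]
# Cache for InnoDB data and indexes; usually the largest single allocation.
innodb_buffer_pool_size = 2G
# Size of each redo log file.
innodb_log_file_size = 256M
# Buffer for log writes that have not yet been flushed to disk.
innodb_log_buffer_size = 8M
# 1 = flush the log to disk at every commit (the safest setting).
innodb_flush_log_at_trx_commit = 1
# Store each table in its own .ibd file.
innodb_file_per_table
```

The point is less the exact numbers than having the InnoDB settings filled in at all, so that nobody ends up running InnoDB tables on the bare defaults.<br />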
<br />
What do you say?  Anyone up for a MySQL Forge section for my.cnf files?]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 06 May 2008 10:17:00 -0500</pubDate>
            <category>Linux</category>
            <category>MySQL</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/03/interview-with-webdevradio/</guid>
            <title>Interview with WebDevRadio</title>
            <link>http://brian.moonspot.net/2008/05/03/interview-with-webdevradio/</link>
            <description><![CDATA[While I was at the MySQL Conference, I sat down with Michael Kimsal of <a href="http://www.webdevradio.com/index.php">WebDevRadio</a> and <a href="http://www.webdevradio.com/index.php?id=74">recapped the two talks</a> that I gave at the conference.  I have uploaded the slides so you can follow along if you want.<br />
<br />
<a href="http://content.dealnews.com/files/one_to_cluster.pdf">One to a Cluster</a> - The evolution of the dealnews.com architecture.<br />
<br />
<a href="http://content.dealnews.com/files/phorum_mysql_tricks.pdf">MySQL Tips and Tricks</a> - Some simple tips and some of the more advanced SQL we use in Phorum.<br />
<br />
Thanks Michael.  Any time you need a guest, just let me know.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Sat, 03 May 2008 10:13:00 -0500</pubDate>
            <category>memcached</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/05/02/embracing-the-new-communication/</guid>
            <title>Embracing the new communication</title>
            <link>http://brian.moonspot.net/2008/05/02/embracing-the-new-communication/</link>
            <description><![CDATA[As I said a while back, I started using Twitter.  I get it.  Today I had a good idea and so I created a couple of new Twitter feeds.  If you are a big fan of my day job, you might want to look at <a href="http://tinyurl.com/6hznd3">http://tinyurl.com/6hznd3</a> and <a href="http://tinyurl.com/6f83rl">http://tinyurl.com/6f83rl</a>.  We will see where it goes from here.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Fri, 02 May 2008 18:03:12 -0500</pubDate>
            <category>Programming</category>
            <category>Web 2.0</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/04/22/playing-with-mysql-index-merge/</guid>
            <title>Playing with MySQL's index merge</title>
            <link>http://brian.moonspot.net/2008/04/22/playing-with-mysql-index-merge/</link>
            <description><![CDATA[So, I mentioned before that I found out about <a href="http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html">index_merge</a> at the MySQL Conference. I was wondering why I had not heard more about it since it came out in 5.0.3. When talking with some MySQL people about it, I received mixed results. So, I decided to kind of run my own tests on some data and see what I could figure out.<br>
<br>
I apologize for Wordpress' bad output. =(<br>
<br>
<strong>The Data</strong><br>
<br>
I created a table with 5 million rows. Early tests with MySQL's Harrison Fisk (HarrisonF) over my shoulder with small data sets showed MySQL would optimize out the indexes in favor of table scans. I wanted to avoid that. This is my table schema:<br>
<br>
<code><br>
CREATE TABLE `test2` (<br>
`id1` int(10) unsigned NOT NULL default '0',<br>
`id2` int(10) unsigned NOT NULL default '0',<br>
`id3` int(10) unsigned NOT NULL default '0',<br>
`dt` datetime NOT NULL default '0000-00-00 00:00:00',<br>
`somevar` varchar(255) NOT NULL default '',<br>
KEY `id1` (`id1`),<br>
KEY `id2` (`id2`)<br>
) ENGINE=MyISAM<br>
</code><br>
<br>
The field id1 was filled with random values between 1 and 5000. I filled id2 with random values between 1 and 100, except that about half the data has the value 999 in it. This was to emulate the issue we were seeing on the smaller table. We found that if a value was in more than n% of the rows, the optimizer would skip the index. I wanted to test that on larger data sets. id3 was filled with random values between 1 and 1000000. dt was a random date/time between 1999 and 2008, and somevar was a random string of characters.<br>
<br>
<strong>Intersect Merges</strong><br>
<br>
<code><br>
mysql&gt; explain select count(*) from test2 where id2=99 and id1=4795;<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+------+----------------------------------------------------+<br>
| id | select_type | table | type        | possible_keys | key     | key_len | ref  | rows | Extra                                              |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+------+----------------------------------------------------+<br>
|  1 | SIMPLE      | test2 | index_merge | id1,id2       | id1,id2 | 4,4     | NULL |    3 | Using intersect(id1,id2); Using where; Using index |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+------+----------------------------------------------------+<br>
</code><br>
<br>
This is the most basic example. MySQL uses the two indexes, finds where they intersect and merges the data together. This query is quite fast, although a key on the two together would be faster. If you have this showing up a lot, you probably need to combine the two keys into one. I should also note that in this example, only the keys are needed, no data from the table. This is important.<br>
<br>
<code><br>
mysql&gt; explain select sql_no_cache somevar from test2 where id2=99 and id1=4795;<br>
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+<br>
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |<br>
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+<br>
|  1 | SIMPLE      | test2 | ref  | id1,id2       | id1  | 4       | const |  930 | Using where |<br>
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+<br>
</code><br>
<br>
As you see, as soon as we ask for data that is not in the indexes, our intersect is dropped in favor of using the key that matches the fewest rows and simply scanning those rows to match the rest of the where clause. This was the case pretty much every time I tried it. I was never able to use an index_merge with intersect when requesting data not available in the key.<br>
<br>
<strong>Union Merges</strong><br>
<br>
<code><br>
mysql&gt; explain select sql_no_cache somevar from test2 where id2=99 or id1=4795;<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+-----------------------------------+<br>
| id | select_type | table | type        | possible_keys | key     | key_len | ref  | rows  | Extra                             |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+-----------------------------------+<br>
|  1 | SIMPLE      | test2 | index_merge | id1,id2       | id2,id1 | 4,4     | NULL | 27219 | Using union(id2,id1); Using where |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+-----------------------------------+</code><br>
<br>
mysql&gt; select sql_no_cache somevar from test2 where id2=99 or id1=4795;<br>
26237 rows in set (0.20 sec)<br>
<br>
This merge type takes two keys involved in an OR and then merges the data much like a UNION statement would. As you can see, in this case, it did use the indexes even though we requested `somevar`, which is not in either index.<br>
<br>
To show the alternative to this, I selected using id3 instead of id1. id3 has no index.<br>
<br>
<code><br>
mysql&gt; explain select sql_no_cache somevar from test2 where id2=99 or id3=266591;<br>
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+<br>
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |<br>
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+<br>
|  1 | SIMPLE      | test2 | ALL  | id2           | NULL | NULL    | NULL | 5000000 | Using where |<br>
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+</code><br>
<br>
mysql&gt; select sql_no_cache somevar from test2 where id2=99 or id3=266591;<br>
25252 rows in set (26.01 sec)<br>
<br>
As you can see, this does a table scan even though there is a key on id2. It does you no good.<br>
<br>
<strong>Sort Union Merge</strong><br>
<br>
<code><br>
mysql&gt; explain select sql_no_cache id1, id2 from test2 where id2=99 or id1 between 4999 and 5000;<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+----------------------------------------+<br>
| id | select_type | table | type        | possible_keys | key     | key_len | ref  | rows  | Extra                                  |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+----------------------------------------+<br>
|  1 | SIMPLE      | test2 | index_merge | id1,id2       | id2,id1 | 4,4     | NULL | 44571 | Using sort_union(id2,id1); Using where |<br>
+----+-------------+-------+-------------+---------------+---------+---------+------+-------+----------------------------------------+</code><br>
<br>
mysql&gt; select sql_no_cache somevar from test2 where id2=99 or id1 between 4999 and 5000;<br>
27295 rows in set (0.19 sec)<br>
<br>
This behaves much like the union merge. However, because one index is using a range, MySQL must first sort one index and then merge the two. Again, if I switch this to an AND instead of an OR, index_merge is dropped in favor of scanning the id2 indexed data for matches to the rest of the where clause.<br>
<br>
<strong>Conclusion</strong><br>
<br>
Hmm, after all this, I see why this was not a big announcement. It can only make bad SQL and tables better. Tables and queries that are already optimized using composite indexes will see no benefit from this. At best this will help me with some one off queries or reports that are only run monthly where I don't want to pollute the indexes with special cases just for those queries.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Tue, 22 Apr 2008 13:44:31 -0500</pubDate>
            <category>MySQL</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/04/17/2008-mysql-conference-part-1/</guid>
            <title>2008 MySQL Conference, part 1</title>
            <link>http://brian.moonspot.net/2008/04/17/2008-mysql-conference-part-1/</link>
            <description><![CDATA[It is always surprising what I learn when I go to a conference these days.  Years ago, I could go to any talk and just suck it all in.  Now, it is the little nuggets.  The topics as a whole do more to confirm what I have already developed while running the <a href="http://www.phorum.org/">Phorum</a> project and building the infrastructure for <a href="http://dealnews.com/">dealnews.com</a>.  That confirmation is still nice.  You know you are not the only one that thought a particular solution was a good idea.<br />
<br />
One of the confirmations I have had is that the big sites like Flickr, Wikipedia, Facebook and others don't use exotic setups when it comes to their hardware and OS.  During a keynote panel, they all commented that they did not do any virtualization on their servers.  Most did not use SANs.  Some ran older MySQL versions but some were running quite recent versions.  I have kept thinking that I did not have the desire to get too fancy with that stuff and clearly I am not the only one.<br />
<br />
One of the little nuggets that will likely change my world is <a href="http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html">index_merge in MySQL</a>.  I feel silly as this has been around since 5.0.3 but I was not aware of it.  Basically MySQL will now use more than one key to resolve a where clause and possibly an order by depending on the query.  This could lead to me removing several keys from tables in both Phorum and at dealnews.<br />
<br />
There were others, but I am tired and trying to get OpenID into the Phorum trunk right now so I will have to think of more later.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 17 Apr 2008 16:43:50 -0500</pubDate>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/03/06/local-best-practices-for-sql-backed-web-applications/</guid>
            <title>Local: Best practices for SQL backed web applications</title>
            <link>http://brian.moonspot.net/2008/03/06/local-best-practices-for-sql-backed-web-applications/</link>
            <description><![CDATA[<b>When</b><br />
Tuesday, March 11, 2008 at 12:00 PM<br />
<br />
<b>Where</b><br />
<a href="http://www.biztech.org/">BizTech</a><br />
515 Sparkman Drive<br />
Huntsville, AL 35816<br />
<br />
<b>Details</b><br />
Brian Moon of <a href="http://dealnews.com/">dealnews.com</a> will be discussing best practices for writing database backed web based applications. Many users teach themselves SQL and programming on the web. Other developers may have experience in enterprise desktop applications. No matter what your background, there are common mistakes made when deploying web based applications that use a database.<br />
<br />
Also, at this event, we will be giving away two copies of <a href="http://www.nusphere.com/products/phped.htm">NuSphere's PhpED</a>. Plus, everyone who attends can purchase any NuSphere product at 50% off.<br />
<br />
Lunch will be served at this event.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 06 Mar 2008 13:08:35 -0600</pubDate>
            <category>Linux</category>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/03/03/people-really-do-run-php-on-windows/</guid>
            <title>People really do run PHP on Windows</title>
            <link>http://brian.moonspot.net/2008/03/03/people-really-do-run-php-on-windows/</link>
            <description><![CDATA[One of my favorite restaurants these days is Buffalo Wild Wings.  They show the UFC fights.  It is cheaper to go there than to throw a party at the house.  I went to their web site tonight to get some nutritional information for last night's snacks and I got this:<br />
<br />
<code>Fatal error: Maximum execution time of 30 seconds exceeded in C:\Inetpub\wwwroot\index.php on line 3</code><br />
<br />
Dang.  I hate that for them.  I am sure they just pay someone to host their site.  Maybe it will clear up soon.  Someone should clue them in on how to turn display_errors off.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Sun, 02 Mar 2008 22:21:45 -0600</pubDate>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/21/forums-are-the-red-headed-step-child-of-a-web-site/</guid>
            <title>Forums are the red headed step child of a web site</title>
            <link>http://brian.moonspot.net/2008/02/21/forums-are-the-red-headed-step-child-of-a-web-site/</link>
            <description><![CDATA[I have seen it time and time again.  And yet, every time, it irritates me to no end.  You are on a professional web site.  You are navigating around and at some point you hit the link for their forums.  And just like that you feel transported to another place.  The whole site design just changes.  Colors, layout, navigation... everything.  Here are some examples, including the new C7Y site from php|Architect which inspired this post. (I really do love you guys on the podcast I promise =)<br />
<ul><br />
	<li>php|architect's C7Y - <a href="http://c7y.phparch.com/">main site</a> - <a href="http://c7y-bb.phparchitect.com/">forums</a></li><br />
	<li>Zend's Developer Zone - <a href="http://devzone.zend.com/public/view">main site</a> - <a href="http://www.zend.com/forums/">forums</a><br />
Zend's forums do at least use the Zend.com header, but you can't get to the forums from the main Zend.com site.  You have to go to the Developer Zone.</li><br />
	<li>TextPad (great windows editor) - <a href="http://www.textpad.com/">main site</a> - <a href="http://forums.textpad.com/index.php">forums</a><br />
The header is kind of the same.  Fonts and link colors change slightly though which is worse in some ways than a wholesale change.  It looks like they just wedged in their HTML into the phpBB template.</li><br />
</ul><br />
I could continue to list some here, but you get the idea.   So, what is the problem?  Does most message board software make it too hard to edit their templates?  Are forums an afterthought, with some underling given the task of making them work but not allowed access to the main site's templates?<br />
<br />
Some people do better at it.  <a href="http://forums.mysql.com/">MySQL</a> for example.  Theirs is still not perfect.  An ad awkwardly appears in the forums in a way that makes it look like an error.  However, thanks to <a href="http://www.phorum.org/">Phorum</a> (cha-ching), MySQL was able to make their own log in system work with their forums.  Heck, even at <a href="http://forums.dealnews.com/">dealnews</a> I have not done that.  Mostly because our forum logins predate our site accounts for email alerts and newsletters.  I am not asking for perfection though.  I would just like to feel like the company/entity gave some love to making their forums part of their site and not an afterthought.<br />
<br />
So, I call for all web sites to start treating their forums like real pages.  Give them the same love and attention you give that front page or any other page.  And, if your message board software makes that hard, give Phorum a try.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 20 Feb 2008 18:07:44 -0600</pubDate>
            <category>Design</category>
            <category>HTML</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/20/speaking-at-mysql-conference-2008/</guid>
            <title>Speaking at MySQL Conference 2008</title>
            <link>http://brian.moonspot.net/2008/02/20/speaking-at-mysql-conference-2008/</link>
            <description><![CDATA[I had mentioned a while back that I <a href="http://doughboy.wordpress.com/2007/10/31/mysql-conference-submissions/">submitted three proposals</a> for the <a href="http://en.oreilly.com/mysql2008/public/content/home">2008 MySQL Conference</a>.  Well, two were accepted.<br />
<br />
<b>From one server to a cluster</b><br />
<br />
In the last 10 years, dealnews.com has grown from a single shared hosting account to an entire rack of equipment. Luckily, we started using PHP and MySQL very early in the company's history.<br />
<br />
From the early days of growing a forum to surviving Slashdotting, Digging and even a Yahoo! front page mention, we have had to adapt both our hardware and software many times to keep up with the growth.<br />
<br />
I will discuss the traps, bottlenecks, and even some big wins we have encountered along the way using PHP and MySQL. From the small scale to using replication and even some MySQL Cluster.  We have done many interesting things to give our readers (and our content team) a good experience when using our web site.<br />
<br />
<b>MySQL hacks and tricks to make Phorum fast</b><br />
<br />
Phorum is the message board software used by MySQL. One reason they chose Phorum was because of its speed. We have to use some tricks and fancy SQL to make this happen. Things we will talk about in this session include:<br />
<ul><br />
	<li>Using temporary tables for good uses.</li><br />
	<li>Why PHP and MySQL can be a bad mix with large data sets.</li><br />
	<li>What mysqlnd will bring to the table with the future of PHP and MySQL.</li><br />
	<li>How Phorum uses full text indexing and some fancy SQL to make our search engine fast.</li><br />
	<li>Forcing MySQL to use indexes to ensure proper query performance.</li><br />
</ul><br />
You can find <a href="http://en.oreilly.com/mysql2008/public/schedule/speaker/66">my conference page</a> here.  (as <a href="http://terrychay.com/blog/">Terry</a> would say, me, me, me!)]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 20 Feb 2008 15:12:01 -0600</pubDate>
            <category>Linux</category>
            <category>memcached</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/13/apache-worker-and-php/</guid>
            <title>Apache Worker and PHP</title>
            <link>http://brian.moonspot.net/2008/02/13/apache-worker-and-php/</link>
            <description><![CDATA[The PHP manual basically <a href="http://www.php.net/manual/en/faq.installation.php#faq.installation.apache2">tells you not to use Apache 2 with a threaded MPM</a> and PHP as an Apache module.  In general, it may be good advice.  But, at <a href="http://dealnews.com/">dealnews.com</a>, we have found it very valuable.<br />
<br />
<b>Apache threaded MPMs</b><br />
<br />
Well, first, what is an MPM?  It stands for Multi-Processing Module.  It is the process model that Apache uses for its child processes.  Each request that comes in is handed to a child.  Apache 1 used only one model for this, the prefork model.  That uses one process per Apache child.  The most commonly used threaded MPM is the Worker MPM.  In this MPM, you have several processes, each running multiple threads.  This is the one I will be talking about.  You can read more on Apache MPMs at the <a href="http://httpd.apache.org/docs/2.0/mpm.html">Apache web site</a>.<br />
<br />
<b>Huge memory savings</b><br />
<br />
With Apache prefork or even FastCGI, each apache/php process allocates its own memory.  Most healthy sites I have worked on use about 15MB of memory per apache process.  Code that has problems will use even more than this.  I have seen some use as much as 50MB of RAM.  But, let's stick with healthy.  So, a server with 1GB of RAM will only realistically be able to run 50 Apache processes or 50 PHP children for FastCGI if each uses 15MB of RAM.  That is 750MB total.  That leaves only about 274MB (1024MB minus 750MB) for the OS and other applications.  Now, if you are <a href="http://marc.info/?l=php-internals&amp;m=113891145109208&amp;w=2">Yahoo!</a> or someone else with lots of money and lots of equipment, you can just keep adding hardware.  But, most of us can't do that.<br />
<br />
As I wrote above, Apache's worker MPM uses children (processes) and threads.  If you configure it to use 10 child processes, each with 10 threads, you would have 100 total threads or clients to answer requests.  The good news is, because 10 threads are in one process, they can reuse memory that is allocated by other threads in the same process.  At dealnews, our application servers use 25 threads per child.  In our experience, each child process uses about 35MB of RAM.  So, that works out to about 1.4MB per thread.  That is roughly 10% of the per-client memory usage of a prefork server.<br />
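<br />
As a rough sketch of that arithmetic, using only the figures quoted in this post.  The function names are made up for illustration, and real servers will not share memory between threads this neatly, so treat the output as a ballpark rather than a capacity plan.<br />
<br />
```php
// Rough per-client memory comparison. The figures are the ones quoted in
// this post; the function names are hypothetical, for illustration only.

// Prefork: every client is its own process, so the cost is per process.
function prefork_mb_per_client($mb_per_process) {
    return $mb_per_process;
}

// Worker: threads in one child share that child's memory.
function worker_mb_per_client($mb_per_child, $threads_per_child) {
    return $mb_per_child / $threads_per_child;
}

// How many clients fit in a RAM budget at a given per-client cost.
function clients_that_fit($ram_budget_mb, $mb_per_client) {
    return (int) floor($ram_budget_mb / $mb_per_client);
}

$prefork = prefork_mb_per_client(15);      // 15MB per prefork process
$worker  = worker_mb_per_client(35, 25);   // 35MB child, 25 threads

printf("prefork: %.1fMB per client\n", $prefork);
printf("worker:  %.1fMB per client\n", $worker);
printf("clients in 750MB: prefork %d, worker %d\n",
    clients_that_fit(750, $prefork),
    clients_that_fit(750, $worker));
```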
<br />
Some say that you will run out of CPU way before RAM.  That was not what we experienced before switching to worker.  Machines with 2GB of RAM were running out of memory before we hit CPU as a bottleneck due to having just 100 Apache clients running.  Now, with worker, I am happy to say that we don't have that problem.<br />
<br />
<b>Building PHP for best success with Worker</b><br />
<br />
This is an important part.  You can't use radical extensions in PHP when you are using worker.  I don't have a list of extensions that will and won't work.  We stick with the ones we need to do our core job.  Mainly, most pages use the mysql and memcached extensions.  I would not do any fancy stuff in a worker based server.  Keep a prefork server around for that.  Or better yet, do funky memory sucking stuff in a cron job and push that data somewhere your web servers can get to it.<br />
<br />
<b>Other benefits like static content</b><br />
<br />
Another big issue you hear about with Apache and PHP is running some other server for serving static content to save resources.  Worker allows you to do this without running two servers.  Having a prefork Apache/PHP process that has 15MB of RAM allocated serve a 10k jpeg image or some CSS file is a waste of resources.  With worker, like I wrote above, the memory savings negate this issue.  And, from my benchmarks (someone prove me wrong) Apache 2 can keep up with the lighttpds and litespeeds of the world in terms of requests per second for this type of content.  This was actually the first place we used the Worker MPM.  It may still be a good idea to have dedicated Apache daemons running just for that content if you have lots of requests for it.  That will keep your static content requests from overrunning your dynamic content requests.<br />
<br />
<b>Some issues we have seen</b><br />
<br />
Ok, it is not without problems (but, neither was prefork).  There are some unknown (meaning undiagnosed by us) things that will occasionally cause CPU spikes on the servers running worker.  For example, we took two memcached nodes offline and the servers that were connected to them spiked their CPU.  We restarted Apache and all was fine.  It was odd.  We had another issue where a bug in my PHP code, which was calling fsockopen() with an invalid host name and a long timeout, would cause a CPU spike that would not seem to let go.  So, it does seem that a worker server is more sensitive to bad PHP code.  So, your mileage may vary.<br />
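<br />
I can't show the original buggy code here, but a minimal defensive sketch for that kind of fsockopen() call might look like the following.  The helper name and the 2 second default are assumptions made for illustration, not what we actually run.<br />
<br />
```php
// Hypothetical helper: refuse obviously bad host names and keep the
// connect timeout short so a bad call fails fast instead of hanging.
function open_socket_guarded($host, $port, $timeout_seconds = 2.0) {
    $host = trim((string) $host);
    if ($host === '') {
        return false; // nothing to connect to, so never call fsockopen()
    }
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning; the caller checks the return value instead.
    $fp = @fsockopen($host, (int) $port, $errno, $errstr, $timeout_seconds);
    if ($fp === false) {
        error_log("connect to $host:$port failed: $errstr ($errno)");
        return false;
    }
    // The connect timeout does not cover reads, so cap those too.
    stream_set_timeout($fp, (int) ceil($timeout_seconds));
    return $fp;
}

// An empty host name is rejected before any network call is made.
var_dump(open_socket_guarded('', 80));
```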
<br />
As with any new technology, you need to test a lot before you jump in with both feet.  Anyone else have experience with worker and want to share?<br />
<br />
<b>One last tip</b><br />
<br />
We have adopted a technique that <a href="http://lerdorf.com/bio.php">Rasmus Lerdorf</a> had mentioned.  We decide how many MaxClients a server can run and we configure that number to always run.  We set the min and max settings of the Apache configuration the same.  Of course, we are running service specific servers.  If you only have one or two servers and they run Apache and MySQL and mail and dns and... etc. you probably don't want to do that.  But, then again, you need to make sure MaxClients will not kill your RAM/CPU as well.  I see lots of servers that would be using 20GB of RAM if MaxClients were actually reached.  And, these servers only have 2GB of RAM.  So, check those settings.  If you can, configure it to start up more (all if you can) Apache processes rather than a few and make sure you won't blow out your RAM.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Wed, 13 Feb 2008 01:09:32 -0600</pubDate>
            <category>Linux</category>
            <category>memcached</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/08/trying-out-twitter/</guid>
            <title>Trying out Twitter</title>
            <link>http://brian.moonspot.net/2008/02/08/trying-out-twitter/</link>
            <description><![CDATA[I have decided to try out Twitter.  Mostly it is a curiosity about what it does and how it works.  The recent outages baffle me.  It seems like a very simple application.  I am hoping to find the thing that makes it a tough thing to scale out and/or up.  (Other than it being built on Rails that is.)<br />
<br />
<a href="http://twitter.com/brianlmoon">http://twitter.com/brianlmoon</a>]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 07 Feb 2008 19:47:59 -0600</pubDate>
            <category>Personal</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/07/managing-two-data-centers/</guid>
            <title>Managing two data centers</title>
            <link>http://brian.moonspot.net/2008/02/07/managing-two-data-centers/</link>
            <description><![CDATA[Call it paranoia.  Call it being prepared.  Whatever your stance, we are considering using more than one data center for <a href="http://dealnews.com/">dealnews.com</a>.  It is not a capacity issue.  We can keep growing our current data center without a problem.  But, stories of power outages and power outages we have experienced have us wanting to explore the idea.<br />
<br />
Here is the problem.  No one in our company has experience with this.  And, there do not seem to be any resources on the internet talking about this.  Our problems are not so much with managing the data between the two.  The problem is failover and how to deal with one data center being out.  Here are some of the ideas that have been thrown on to the wall.<br />
<br />
<b>Round Robin DNS</b><br />
<br />
This was the first idea.  It seems simple enough.  We have two data centers.  We publish different DNS for each data center and traffic goes to each one.  The problem here is that it is, well, random.<br />
<br />
<b>Global Traffic Management</b><br />
<br />
There are devices that "balance" traffic  across multiple different locations.  But, I am unsure how those deal with outages at one of the locations.  It seems like there is still one point of failure.<br />
<br />
<b>BGP Routing</b><br />
<br />
This is the biggest mystery to me.  I know what it is.  I know what it means.  I have no idea how to deploy this type of solution.  I understand that you can "move" your IP addresses with routing changes.  But, that means running routers.  Where are these routers?  Does this happen at some provider?  Is there a provider that handles this?  Does that mean that all of our data centers are with one provider?  I think one more peace of mind feature of this is that we would not be tied to just one vendor.  So, if one vendor had major issues or there were some legal troubles (we lived through the dot com boom and bust) we would have security in knowing we had other equipment that was not affected.<br />
<br />
Is there something else?  Are we being way paranoid?  Maybe it is not cost effective in the end.  I/we have no idea really.  Anyone out there that has knowledge on this subject?]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 07 Feb 2008 13:53:38 -0600</pubDate>
            <category>Linux</category>
            <category>MySQL</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/02/04/how-not-to-get-support-and-how-to-turn-the-other-cheek/</guid>
            <title>How NOT to get support and how to turn the other cheek.</title>
            <link>http://brian.moonspot.net/2008/02/04/how-not-to-get-support-and-how-to-turn-the-other-cheek/</link>
            <description><![CDATA[So, I checked my email this morning and found this jewel:<br />
<blockquote>I might use Phorum if you brain deads knew how to upload or download your files via FTP. Your documentation has no order to it, its all a mess. I even dropped a release level to see if it was just that release. Ill give you a clue, DONT TRANSFER YOUR FILES VIA AUTO, EXPECIALY YOUR TXT FILES. TRANSFER THEM IN ASCII MODE ONLY, THIS INCLUDES YOUR PHP FILES. Then you just f---ing* MIGHT get readable files. Now you might say hey wait a min, we have full documentation on our web site, but you forget, someone has to open the sample.config.php file and read the crap that resides there.</blockquote><br />
<blockquote><i>* edited for content</i></blockquote><br />
Should I respond?  If so, how?  I decided to respond in as nice a way as I could.<br />
<blockquote> I normally don't answer direct support emails.  Neither do I normally  answer very angry emails.  However, I view this as an educational  experience.<br />
<br />
Judging by your email, I would say you are using Notepad on Windows to  edit and read files.  That is mistake number one.  Notepad only reads  one file format: Windows text files.  Windows natively uses a CRLF for  it's line endings.  It is the only operating system that does so.  Notepad is the only application on the Windows platform that only reads  that format.  If you would use Wordpad instead, this would not have been  a problem for you.  For some reading on the subject, you may want to read:<br />
<br />
<a href="http://en.wikipedia.org/wiki/Newline" class="moz-txt-link-freetext">http://en.wikipedia.org/wiki/Newline</a><br />
<a href="http://www.cs.toronto.edu/%7Ekrueger/csc209h/tut/line-endings.html" class="moz-txt-link-freetext">http://www.cs.toronto.edu/~krueger/csc209h/tut/line-endings.html</a><br />
<br />
Because PHP scripts are most commonly deployed on a Linux platform, the  Unix line feed (LF or \n) is best for PHP applications.  Here are some  suggestions for some great text editors for Windows.<br />
<br />
TextPad - <a href="http://www.textpad.com/" class="moz-txt-link-freetext">http://www.textpad.com/</a><br />
Metapad - <a href="http://www.liquidninja.com/metapad/" class="moz-txt-link-freetext">http://www.liquidninja.com/metapad/</a><br />
PSPad   - <a href="http://www.pspad.com/en/" class="moz-txt-link-freetext">http://www.pspad.com/en/</a><br />
<br />
I hope this has helped educate you on the world of new lines and how  real programming works.  In the future, a kind word in the forums would  be much more appreciated than an email like this.  Not all people would  be as kind as I am being and want to help you grow.</blockquote><br />
What do you think?  Should I have just let this guy go?  Should I have been as ugly to him as he was to me?]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Mon, 04 Feb 2008 10:23:30 -0600</pubDate>
            <category>Linux</category>
            <category>MySQL</category>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
        <item>
            <guid>http://brian.moonspot.net/2008/01/17/responsible-use-of-the-_request-variable/</guid>
            <title>Responsible use of the $_REQUEST variable.</title>
            <link>http://brian.moonspot.net/2008/01/17/responsible-use-of-the-_request-variable/</link>
            <description><![CDATA[A recent <a href="http://marc.info/?l=php-internals&amp;m=119956617516891&amp;w=2">thread split</a> on the PHP Internals list has been about the use of the $_REQUEST variable.  I have seen more than one person make the following logic mistake:<br />
<ol><br />
	<li>I may get data via GET</li><br />
	<li>I may get data via POST</li><br />
	<li>Ah, I should use $_REQUEST as it will catch both.</li><br />
</ol><br />
There is a problem with that logic.  Cookies!  Cookies are also put into $_REQUEST.  In fact, with the default variables_order setting, they are put into $_REQUEST last.  So, any data that was sent via GET or POST is overwritten by cookies of the same name.<br />
<br />
When does this cause a problem?  Well, let's say you have a script that has a form that asks for a user name.  You call the field username.  So, you are looking for that data in $_REQUEST.  Unknown to you, another member of your team makes a cookie named username on a totally unrelated application.  His cookie needs to be accessible from several parts of the site, so he assigned the cookie to the path /.  So, now, when a user submits your form, the data comes in looking like this:<br />
<code><br />
$_GET["username"] = "user input";<br />
$_COOKIE["username"] = "Tom";<br />
$_REQUEST["username"] = "Tom";<br />
</code><br />
<br />
So, now you have bad data for the username you wanted.  This becomes even more menacing when you start thinking about security issues like XSS or CSRF.  As Stefan Esser, a strong PHP Security advocate, wrote <a href="http://marc.info/?l=php-internals&amp;m=120056333422186&amp;w=2">in another reply to the thread</a>:<br />
<blockquote><font color="#000080"> Just imagine my example...<br />
<code><br />
switch ($_REQUEST['action'])<br />
{<br />
case 'logout':<br />
logout();<br />
break;<br />
...<br />
}<br />
</code><br />
When someone injects you a cookie like   +++action=logout   through an<br />
XSS or through a feature like  foobar.co.kr can set cookies for *.co.kr<br />
(in FF atleast).<br />
Then you CANNOT use the application anymore. This is a DOS. You cannot<br />
defeat this problem except detecting and telling the user to delete his<br />
cookies manually...</font></blockquote><br />
Yikes!  So, now you have all kinds of problems with using $_REQUEST.<br />
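<br />
To see the overwrite for yourself without setting real cookies, here is a small simulation of how the request array gets assembled.  build_request() is a hypothetical helper written for this post, not a PHP API; it only mimics the G, P, C ordering described above, using the same sample values as the earlier example.<br />
<br />
```php
// Simulates how a request array is assembled when sources are applied in
// G, P, C order. build_request() is a hypothetical helper, not a PHP API.
function build_request(array $get, array $post, array $cookie, $order = 'GPC') {
    $sources = array('G' => $get, 'P' => $post, 'C' => $cookie);
    $request = array();
    foreach (str_split($order) as $letter) {
        if (isset($sources[$letter])) {
            // Later sources overwrite earlier ones on the same key.
            $request = array_replace($request, $sources[$letter]);
        }
    }
    return $request;
}

$get    = array('username' => 'user input');
$post   = array();
$cookie = array('username' => 'Tom');

$with_cookies    = build_request($get, $post, $cookie, 'GPC');
$without_cookies = build_request($get, $post, $cookie, 'GP');

echo $with_cookies['username'], "\n";    // Tom
echo $without_cookies['username'], "\n"; // user input
```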
<br />
So, what is the best way to handle both GET and POST data?  Well, here are a couple options.<br />
<br />
<b>Merge GET and POST data</b><br />
<br />
You could use array_merge() to merge the $_GET and $_POST variables into one.  I would use a new variable for this data.  You can overwrite super globals.  Some think it is a bad idea.  I can't argue that it could cause confusion if you did this in an environment where several parts of the application are going to be using user input.  If you do want to do this you could do the following.<br />
<code><br />
$user_input = array_merge($_GET, $_POST);<br />
// or overwrite $_REQUEST - not recommended<br />
$_REQUEST = array_merge($_GET, $_POST);<br />
</code><br />
<br />
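One detail worth knowing before you rely on this: when both arrays have the same string key, array_merge() keeps the value from the later argument, so POST wins over GET in the call above.  A quick sketch with made up sample arrays standing in for $_GET and $_POST:<br />
<br />
```php
// Sample arrays stand in for $_GET and $_POST; the values are made up.
$get  = array('page' => '2', 'username' => 'from-get');
$post = array('username' => 'from-post');

// For string keys, later arguments win, so POST overrides GET here.
$user_input = array_merge($get, $post);

echo $user_input['username'], "\n"; // from-post
echo $user_input['page'], "\n";     // 2

// Caveat: array_merge() renumbers integer keys instead of overwriting.
$merged = array_merge(array(0 => 'a'), array(0 => 'b'));
echo count($merged), "\n";          // 2
```
<br />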
<b>Use GET OR POST, not both</b><br />
<br />
I personally like to only use <b>either</b> $_GET or $_POST.  I have very rarely seen a case where using both made sense.  I normally favor $_POST if it is set.<br />
<br />
<code>// Prefer POST data when any was submitted; otherwise fall back to GET.<br />
if(!empty($_POST)){<br />
$user_input = $_POST;<br />
} else {<br />
$user_input = $_GET;<br />
}<br />
</code><br />
<br />
Now we have a safe array that can be used and we know that the data only came from one place.]]></description>
            <dc:creator>brianlmoon</dc:creator>
            <pubDate>Thu, 17 Jan 2008 10:52:31 -0600</pubDate>
            <category>Phorum</category>
            <category>PHP</category>
            <category>Programming</category>
        </item>
    </channel>
</rss>
