DirecTV not ready for college football season

I thought my PHP/MySQL friends would get a kick out of another story of a company not being ready for technical demand. I live in Alabama (southeastern USA). For those who do not know, we like our college football a lot (American football, that is). University of Alabama football has a particularly large buzz this year due to the hiring of our new coach, Nick Saban. The first game of the year was this past Saturday. It was not on network or cable television since it was against a no-name team. It was, however, available on Pay-Per-View for $29 or something (I don't even know, I just ordered it blind). Since getting tickets for me and my family was not really an option, I decided to just order the game and watch from the comfort of my home.

DirecTV tells you (a lot) that the easiest way to order sports events is their web site. So, I tried logging in to my account at their web site. I tried for an hour. The authentication servers were down or something (it was like playing WoW). So, I ended up calling. Their automated system tells me that the best way to order an event is online. Uh, tried that. So, I go through their automated system. After many odd beeps, it says to hold for a real live person! After holding for 20 minutes (I knew better than to hang up) I was greeted very nicely by a woman. She asked what she could do for me and I told her I wanted to order the Alabama game. Her reply? "I should have known!" Apparently, they had a whole bank of 100 people answering calls from our region of the country, and their queue of 50 lines had been full since 6 AM EDT. My call was at 6:30 PM EDT.

Not only was directv.com not working, but their automated phone system could not deal with the volume. Hello?!? It's BAMA and it's SABAN. You should have been ready. The lady on the phone was very nice and took care of me quickly. I feel for those folks in that call center. Their tech people, or more likely the people who should have informed the tech people and given them a budget, should have known this was coming for months.

ROLL TIDE!

Out with cluster, hello replication

Well, I have written a good bit about MySQL Cluster this year. We had been using it as a sort of pre-generated cache for our front-facing web servers. However, we have decided to go a different route.

Why the change

With normal MySQL, configuration can make big performance differences. With cluster, it can make the cluster shut down. We woke up one morning to errors about REDO logs being overloaded. They had been overloaded for about 8 hours. We had made some changes the day before, but they all worked fine on our test cluster. So, we shut down the processes that were new and even shut off all other processes that were loading data into the servers. Four hours later, the simplest insert would still complain about the REDO logs. The only thing that cleared it up was a full rolling restart of the storage nodes. That took 5 hours. Luckily, we were still operating with a single backup server that had the same data stored in InnoDB tables. We switched to it before we started the rolling restart.

So, after all that was done, we really became worried about the ups and downs of cluster. We started thinking that maybe cluster was not ready for us and the use we had in mind.

Enter Replication

We had tried replication years ago without a lot of success. We found we had to babysit it weekly as queries would fail on the slave that did not fail on the master, or indexes would get corrupted on the slave. Just annoying things that would take up time. But, that was MySQL 4.0 and we were trying to replicate our main database. Faced with massive downtime with cluster, an annoyance every few weeks seems like an acceptable risk.

The new setup

So, with the new structure, we have two masters and two slaves. We had already written code that would synchronously write data to two (or more) MySQL servers, using transactions to ensure the data was written to all the hosts. It's what was used to write the data to the cluster and InnoDB servers already. So, now, it just writes data to the two masters using InnoDB and then the two slaves read from them. So far, there have been no outages whatsoever.
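
To give a rough idea of what those synchronous writes boil down to, here is a minimal sketch of the SQL side. The table and column names are made up, and the coordination described in the comments actually lives in our PHP code, not in SQL.

-- executed on master1 and master2 over separate connections;
-- the application opens a transaction on each host first
START TRANSACTION;

INSERT INTO articles (article_id, headline, body)
VALUES (12345, 'Example headline', 'Example body');

-- if any host reports an error, the application issues ROLLBACK on every
-- connection; only if all of the inserts succeed does it COMMIT everywhere
COMMIT;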

Scaling: Cluster vs. Replication

One of the big things we liked about MySQL Cluster was that it scaled very well. As connections increased, query speed and response time did not degrade at the same rate as the connections. InnoDB, on the other hand, degrades linearly. Basically, an InnoDB server with 128 concurrent connections returns data about twice as slowly as it does with 64 connections. Cluster would only see about a 10% reduction in performance. But, with InnoDB and replication, we can more quickly add new nodes and scale out in an emergency. Once the hardware is in place, we could have new nodes online in about 30 minutes. If we hit a wall with cluster, it would take us a day or two to bring a new server online after the hardware was in place. Adding storage nodes would mean a full dump and reload of the database. So, the solution we have now will not scale as well as cluster did on the same equipment, but it will be easier to scale out once we need to.

O'Reilly Open Source Conference Day Two

So, day two was the cool keynote day. Day one keynotes were from Tim O'Reilly (not that he is not cool) and the vendor sponsors. The Intel building blocks stuff was neat, but most of it was vendor stuff IMO.

Today we had the "cool thing to hear and see, but I probably won't use it" keynote. It was the Processing Development Environment. It was really cool. You can read more about it at processing.org.

The next keynote was hard for me to follow. There were no slides, and he stood behind the podium the whole time. Gnat seemed to love it, as he told us all in IRC. You can read the guy's blog at overcomingbias.com. It was basically about overcoming the biases you have... I think.

Interestingly (speaking of bias), the next keynote was from Microsoft. Coincidence? According to the speaker, MS (or at least this guy) is really trying to make some Open Source stuff. Time will tell. Also, they are "working" with the OSI to get their licensing approved as Open Source licenses. As someone in IRC said, it's a win/win for them. If they don't get approved, they can just blame the OSI for being inflexible. Nate kind of put him on the spot about patents after his talk. He handled it well and kind of rode the fence.

The last keynote was, for me, the payoff keynote. It's the one I will remember most from this year. It was about branding. The poor guy did not have his slides due to technical issues and still did a great job. You can read Steve's blog at steve-yegge.blogspot.com. Maybe he will post the slides.

I attended a couple of good sessions today.  One was about caching, mostly with APC.  But, if you stripped down the APC stuff and just took some of his concepts, you could apply some of it to lots of caching methods.  The talk was given by Gopal Vijayaraghavan of Yahoo! I don't have a URL for the site where his slides may be.  If I find it, I will post it.

Another one was about legacy PHP code. I didn't agree with 100% of what he was saying, but if you are in the boat he described, anything is better than where you are. The guy's site is clintonrnixon.net. Hopefully he will put up the slides and maybe a blog post about it.

The last talk that I want to tell you about was from Amy Hoy. She gave her "When Interface Design Attacks!" talk again this year. Just like last year, it was brilliant. There were new topics like Web 2.0. I was happy to see that the Phorum 5.2 template I have been working on (emerald) already included many of her recommendations. I guess she rubbed off on me last year. Amy has started her own consulting company. If we need usability and/or interface design help again (bleh, the last one was less than exciting), I will push for using her for sure. Check out her site (linked above) for more stuff from her.

The day (and conference, really) ended with parties. We went to the SourceForge Open Source Awards party. phpBB won best tool for communication. Gag me with a chicken bone. I guess it has a large install base. But, MySpace has lots of users too. That does not mean it's not a black eye on the internet. Ok, MySpace is worse than phpBB for sure. But, c'mon, I write Phorum. I am biased (see above keynote =). It was a popularity contest, and I guess there are more kiddies to vote for them than for, say, Pidgin, which is what I voted for. With all the trouble they have had with their name, I wonder if "Gaim" would have gotten more votes (see the other keynote on branding =). The phpBB team may need to see the branding keynote from this morning. It talked about how it takes a generation to change perception of a brand. Most people I talked to here have a negative reaction to the phpBB brand.

The rest of the night we just hung out at the party hosted by Jive Software. We use Openfire from those guys. I am not a big Java user on the server. It's just one more different thing to admin in a company that is 99% GNU C apps on the servers. But, Openfire does a damn good job with XMPP.

In closing, the O'Reilly Open Source Convention was great. I got some great ideas about stuff we should be doing. I got confirmation of things that we are already doing. And most important, IMO, we got to share with others how we solve problems. As Gopal said in his caching talk, sometimes it's better to stop doing stuff and tell others what you are doing (paraphrasing).

O'Reilly Open Source Conference Day One

Day one is complete. Portland is great as always. It's really day 1 1/2 since we got in at 1 PM yesterday. That allowed us to go to the MySQL/Zend party last night. Great party by those guys. Touched base with old friends and made some new ones.

I kind of session-hopped today. Of note, I attended Andi Gutmans' PHP security talk, which really had little to do with PHP. Like Larry Wall's onion metaphor, Andi presented an onion metaphor for security. I stopped in for a while on the SOLR talk. It looks neat. I like that it is a REST interface to Lucene. If we were not using Sphinx already, I might take a longer look. But, we like Sphinx, and SOLR and Lucene are Java. Not that there is anything wrong with that, we just don't use Java a lot, so it's just one more thing that would be out of the norm. I admit I spent a good bit of time in what is being called the "hallway track" working on some code. Work does not stop just because you are at a conference.

I got to hang out with Jay Pipes of the MySQL Community team a good bit. We talked about the MySQL forums (which of course run on Phorum) and how they want to improve them. They would like to see tagging, user and post rating, and some other things. Some good things will come out of that. Hopefully they have some of the tagging stuff done already at MySQL Forge and can contribute that code to Phorum, saving us time.

I hosted the Caching for fun and profit BoF. It was not packed, but it was a good time. The MySQL BoF was at the same time, so we lost some folks to that, I am sure. They had beer and pizza. Brad Fitzpatrick did come by and contribute. Thanks, Brad. It was mostly the same stuff you get on the memcached mailing list. "How do we expire lots of cache at once?" Questions about different clients. Stuff like that. It kind of turned into a memcached BoF, but I tried to share the dealnews experience with the attendees, including our push-style caching backed by MySQL Cluster.

I have met many readers of both dealnews and this blog (hi to you) while here.  Glad to know that both my professional work and my personal work are of use to folks.  The demographic at this conference is dead on for dealnews.  Maybe I can get them to sponsor it next year.  That would be cool.

I say every year that I want to present "next year". Something always keeps me from doing it. Usually it's just not having time to prep for it. By the time I think about it, the call for papers has passed. I really want to get it done this time. We shall see, I suppose.

We went to the Sun party tonight.  It was a good time.  There was beer that was free as in beer.  More hanging with friends and talking about all kinds of stuff.  Now, all you Slashdotters sit down.  I saw people from the PostgreSQL and MySQL teams drinking beer and having fun together.  OMGWTFBBQ!!!1!!  See, the people that really matter in those projects don't bicker and fight about which is better.  They just drink beer and have a good time together.

Anyhow, I will blog more after day 2. There won't be a day 3 for me as I have to catch an 11:30 flight back home. That is usually how it goes. Not sure why they book anything on Friday, really. Even O'Reilly has its "after party" on Thursday night. It's late, and I need sleep.

MySQL cluster and all dump 1000

So, I had written a while back: "We currently have a DataMemory of 4GB and IndexMemory of 2GB. Based on the crude methods we have to monitor it, I think we are at about 40% capacity." Boy, I was wrong.

After that post, I started looking at this more in detail because we were considering buying more RAM "just in case". I figured out how to use the super-secret command "all dump 1000". The command is not documented in the MySQL documentation that I could find. I did find it in the NDB API documentation before writing this post, however. Not sure why I could not find it before.

For those that still don't know how to use it, simply type "all dump 1000" from your management console. Then check your cluster log files on the management server. You will see something like this:

2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 2: Data usage is 46%(90672 32K pages of total 196608)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 2: Index usage is 1%(2006 8K pages of total 131104)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 3: Data usage is 46%(90724 32K pages of total 196608)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 3: Index usage is 1%(2039 8K pages of total 131104)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 4: Data usage is 43%(86153 32K pages of total 196608)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 4: Index usage is 1%(2016 8K pages of total 131104)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 5: Data usage is 46%(90672 32K pages of total 196608)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 5: Index usage is 1%(2007 8K pages of total 131104)

Anyhow, I ran that and, lo and behold, I saw that we were at about 93% capacity. As you can see above, we have made some changes since. This got me to really digging into what the difference could have been. As far as I can tell, our use of TEXT fields in our data was causing the issue. We have several fields in our data structure that hold data larger than 256 bytes. However, they are hardly ever more than 600 bytes. Based on what I read, it seems that the data after the first 256 bytes is stored in 2KB chunks. So, our little ~300-byte extras were each being stuck into a 2KB chunk, wasting roughly 1.7KB per row for each of those columns. That added up to a huge amount of wasted space.

So, the first thing I realized was that we could use 6GB of DataMemory instead of 4GB. The machines have 8GB of RAM in them. We were using only about 200MB of IndexMemory. That leaves well over 1GB for the system.

The second thing I did was carefully analyze our data. I went back to school and started counting bytes. It's easy to get lazy as a developer these days. We just use a data type we know will work and it really is no big deal. But, not in this case. I changed int to mediumint, and some even to smallint and tinyint. I realized that there were two TEXT fields I could eliminate altogether. I looked at the varchar fields and made them only as long as they needed to be. In the end, we ended up at the numbers you see above, about 46% capacity or 2.76GB. We were at 3.72GB before. The removal of the two TEXT fields may have had more to do with the improvement than anything.
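
As a rough illustration of the kind of changes I mean (the table and column names here are made up, not our real schema), it was a pile of statements along these lines:

ALTER TABLE articles
    -- shrink integer types that never need the full int range
    MODIFY store_id SMALLINT UNSIGNED NOT NULL,
    MODIFY priority TINYINT UNSIGNED NOT NULL DEFAULT 0,
    MODIFY view_count MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
    -- trim varchars down to what the data actually needs
    MODIFY headline VARCHAR(100) NOT NULL,
    -- and drop the TEXT columns we could live without
    DROP COLUMN internal_notes,
    DROP COLUMN raw_source;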

This was all with MySQL 5.0. 5.1 will bring better storage of large data. For example, all of the over-256-byte data for a row will be stored together in 2KB chunks rather than one 2KB chunk per column. That will likely save us 1GB of DataMemory. The other feature of 5.1 is Disk Data Tables. We are currently testing those and I will blog about them when I have more information. Early numbers look good though. We have just set our largest table (the one with the TEXT fields) as a disk data table and our DataMemory usage is down to just 280MB. Yeah, that is an MB. Our COO is a bit concerned about the performance hit we may see with data on disk.
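
For anyone curious, a minimal 5.1 disk data setup looks roughly like this (the names and sizes are illustrative, not our actual configuration):

CREATE LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_1.log'
    INITIAL_SIZE 128M
    ENGINE NDBCLUSTER;

CREATE TABLESPACE ts_1
    ADD DATAFILE 'data_1.dat'
    USE LOGFILE GROUP lg_1
    INITIAL_SIZE 512M
    ENGINE NDBCLUSTER;

-- non-indexed columns of this table move to disk;
-- indexed columns still live in DataMemory
ALTER TABLE articles TABLESPACE ts_1 STORAGE DISK;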

Oh, one more pointer about all dump 1000. For a quick way to grab my current usage, I use this command from my prompt (on a Mac):
ssh user@mgmserver "ndb_mgm -e 'all dump 1000' && tail -n 1000 /var/lib/mysql-cluster/ndb_1_cluster.log | fgrep usage | tail -n 8"
Use at your own risk. It may make your server catch fire. =)

Caching for fun and profit, a BoF at OS CON

Last year at O'Reilly's Open Source Conference I hosted a BoF (Birds of a Feather) about memcached. It was a popular event. So, this year, I decided to broaden the scope to caching in general. It's titled Caching for fun and profit. It will be Wednesday, July 25, from 8:30-9:30pm in Room E141.

Anything goes. We can talk about memcached, Tugela, basic file caching... whatever.

More and more web sites are finding that they need to use caching to increase their performance. There are those of us who have solved some problems. Others who are new to these techniques have a lot of questions. This BoF is an opportunity for web developers to share their ideas on caching.

Specifically, the dealnews.com dev team can talk about the 3 main types of caching we use and where each is applicable.

HTML Purifier and Phorum

There have been several posts about HTML Purifier 2.0 lately. I did not look too closely at it until I saw this post on our Phorum support forum. It seems the creator of HTML Purifier has chosen Phorum for his site. I hope that means it met his standards for HTML and security. He has posted some questions about the Phorum core. We always welcome a fresh mind.

He is writing a module for Phorum to allow straight HTML in Phorum posts. We have an HTML module already, but it's quite basic compared to what you can do with his library. Several people have wanted to use the WYSIWYG text editors that are out there. This should/could open that up to people. I don't see the Phorum core ever having one, but that is what modules are for.

MySQL Cluster SQL Tips

So, I mentioned in my MySQL Cluster post that I found out that cluster and joins don't get along too well. There are a couple of tips I have for using joins, or replacing them with other techniques, that I thought I would share.

Here are some things I keep in mind when using the NDB engine. These may apply to the other engines as well.

  1. Test using the NDB engine. Testing against another engine is not adequate.

  2. Avoid JOIN if you can; test often if you can't.

  3. Avoid subqueries always.

  4. Select using primary keys whenever possible. Always select using a key.

  5. Be sure to include both your WHERE and ORDER BY columns in your keys.
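
As a quick illustration of that last point, a query that filters on one column and sorts on another would want a key covering both. The key name here is made up; the table and columns are the same ones used in the examples below.

ALTER TABLE pub_articles
    ADD KEY pub_publish_time (publication_id, first_publish_time);

select article_id
from pub_articles
where publication_id=2
order by first_publish_time desc;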


Try your join from several directions.

Sometimes, the way you write the join can affect how the engine executes it. This is more apparent with NDB than with MyISAM or InnoDB. Have a look at these two queries:

select pub_articles.article_id
from pub_articles
inner join article_edition_lookup on
pub_articles.article_id=article_edition_lookup.article_id and
pub_articles.publication_id=article_edition_lookup.publication_id and
publication_date='2007-06-07'
where pub_articles.publication_id=2;


select article_id
from article_edition_lookup
inner join pub_articles using (article_id, publication_id)
where publication_id=2 and publication_date='2007-06-07';


They look similar. Same tables. Same basic criteria. In fact, the EXPLAIN output looks nearly identical. But, the first takes 20 seconds on cluster and the second takes .02 seconds. Now, some MySQL internals person may be able to look at this and know why, but me, I just have to test them to see. (If anyone cares, the primary key on pub_articles is (publication_id, article_id) and the primary key on article_edition_lookup is (publication_id, publication_date, article_id).)

Try using temporary tables.

If you can't get the join to be fast no matter what, try temporary tables. I have had some success using the HEAP engine to store data and then joining the two tables together. I don't have an example of that from any code I have currently written. I have mostly just used it at the mysql prompt to get data. Here is an example that uses the same tables from above and retrieves the same data using a temporary table.

create temporary table if not exists foo engine=heap
select article_id from article_edition_lookup where publication_id=2 and publication_date='2007-06-08';


select article_id
from pub_articles use key (primary)
inner join foo using (article_id)
where publication_id=2 order by first_publish_time desc;


Together that all happens in about .06 seconds. You need to realize that this does create a table, in memory, on the API node. So, this data will only be good for this one connection. If your temporary table has lots of rows, you may need to add a key to it when you create it. I am not sure I would use this method for a query that is going to be run a lot. Chances are there is a better solution if you have to resort to this technique.
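
If you do need a key on the temporary table, you can declare it right in the CREATE ... SELECT. A minimal sketch using the same tables as above:

create temporary table if not exists foo (key (article_id)) engine=heap
select article_id from article_edition_lookup
where publication_id=2 and publication_date='2007-06-08';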

Subqueries

I have not found a way to make subqueries fast. Even simple queries like the following are slow.

select article_id from pub_articles where publication_id=2 and article_id in (select article_id from article_edition_lookup where publication_id=2 and publication_date='2007-06-07' );

That is a primary key lookup on both of those tables. Yet the query takes 30 seconds to run. Just stay away from them altogether. For what it's worth, the same query takes 2 seconds on the InnoDB engine. I don't think subqueries are optimized in 5.0. This may be a known issue. I got used to not having them for so long that I never think to use them now.
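
For completeness, that IN subquery can be rewritten as the straight join from the earlier example, which comes back in a fraction of a second:

select article_id
from article_edition_lookup
inner join pub_articles using (article_id, publication_id)
where publication_id=2 and publication_date='2007-06-07';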

Unions

I have not used UNION much with cluster. When I have, it has worked well. I would use caution when using it. Be sure to test the queries before putting them into a production environment.

RAID is dying?

There are a bunch of posts on Planet MySQL this week about RAID.  This comment from Kevin Burton really kind of made me go "huh?":
You’re thinking too low level. Who cares if the disk fails. The entire shard is setup for high availability. Each server is redundant with 1-2 other boxes (depends on the number of replicas). If you have automated master promotion you’ll never notice any downtime. All the disks can fail in the server and a slave will be promoted to a new master.

Monitoring then catches that you have a failed server and you have operations repair it and put it back into production as a new slave.

Someone has to think low level.  The key phrase in there is "you have operations repair it and put it back into production as a new slave."  This tells me all I need to know.  Kevin later states that his company does not, in fact, operate their own equipment, but uses a provider for all their hosting.

I think this is a philosophy argument and not a real-world application argument at this point.  Sure, I guess if I am Google or Yahoo I can do this.  But, for the vast majority of web sites running out there, having 4 data centers and "operations" at your beck and call is not a reality.  For real people, having a server go down is a pain in the ass. Why should I want to spend a full day of labor rebuilding a server because a $200 part broke or just got corrupted?  It takes 10 minutes to start a rebuild and maybe another 10 minutes to install a new drive if the rebuild fails.

His other argument is about performance.  Sure, it's debatable whether RAID is faster or slower.  It probably depends on the application.  If your RAID is a bottleneck for your application, then you need to address that. For us, it's far from the bottleneck, so why bother with the downtime of having one of our 30 (not 1,000) servers down?

BTW, would you rather admin 30 servers or 1000?  I think 30.

I should add that we only use RAID on servers that are used for data storage.  Losing data sucks.  For web servers we don't use RAID.  They do fit the model that Kevin describes.  We have a lot of them.  If one goes down, it's ok.  Maybe Kevin's application can fit all its data on one web node.  I don't know.  I just know it's right for us and I don't see a future where I won't want it on our servers.   We are even using RAID in our MySQL Cluster servers. Why?  Because I don't want to have to wait a day to get a storage node back up and running for a $200 part.

Five months with MySQL Cluster

So, the whole world changed at dealnews when Yahoo! linked us. We realized that our current infrastructure was not scaling very well. We had to make a change.

The Problem

Even though we were using all sorts of cool techniques, the server architecture was really still just a bunch of web servers all serving the same content. In addition, our existing systems at the time used a pull method. When a request came in, memcached was checked; if the data was not there, it was fetched from our main MySQL server. So, when there was no data in the cache or when it expired, things got very bad. Like when Yahoo! hit us. Some cache item would expire, 60,000 users would hit a page, and each request would try to create the cache item.

The Solution

I was tasked with two things: find a way to handle something like the Yahoo! burst, and find a way to store the data we need to generate our web pages that was highly available and would scale. For bursting, I wrote a proxy using Apache, mod_rewrite, PHP and memcached. I have reasons I did it this way that are not relevant to this post. Maybe more on that later.

For the data solution, I considered several things: MySQL replication, writing my own replicating memcached client, and other exotic ideas. One of the semi-exotic ideas for us was MySQL Cluster. We had not used it at all. Some things about it made us gun-shy. But, we tested it and were very happy with the results.

Initial Test

With the help of Gentoo, getting a cluster up and running was really, really easy. In fact, it seemed too easy. We ran a cluster on some dev boxes at first. We did some generic testing using the PHPTestSuite from the guys at MySQL Performance Blog. What we found was that while the cluster appeared slower at low concurrent connections, it scaled much better than InnoDB (our preferred storage engine) as the number of concurrent connections grew.

Application Testing

So, we moved to the next step, testing our application. We discovered early on with cluster that we would have to redesign our application. Our DB was highly relational. Almost no data could be put on the site without data from other tables. We used a lot of joins. We learned (later) that joins in the cluster are not a good idea. Neither are sub-selects. So, we wrote some proof of concept scripts for our application. We were very happy. Very few issues were found. Nothing anywhere near show stopping.

Installation

We ordered our servers: six new Dell dual-processor, dual-core Opterons with a lot of memory. Two would become SQL nodes and the other four would be storage nodes. Our data set is not that large compared to a lot of companies'. So, we configured the cluster with 4 replicas. Our main goal is high availability and scalability. I could find nothing in my tests or in the manual that indicated this would be bad for scalability, and it should be great for HA.

We rewrote our application (basically, our public web site) to use the new cluster and its new table design. We hit our first snag when we tried to seed the data in the cluster. We got errors from the cluster about its transaction logs not being big enough to handle the inserts. Through the manual, forum posts, the mailing list archives and some blogs, I was able to find the correct settings for our needs. I remembered thinking, back when I first installed the cluster, that it was too easy. I now realize that getting a cluster running is easy. Making it run well is a whole other story.

The second snag was with joins. Our test bed for the cluster was not a cluster. We used a group of servers running InnoDB to test against. That was a mistake. Joins did not perform well at all with the cluster. We had to back up, rewrite some code and redo some tables. In the end, the new design is probably faster on either InnoDB or cluster.

Everyday Use

We started using the cluster for everyday use about a month ago. I guess 5 months is not bad for going from nothing to live in production. We have been slowly moving applications to it. We take care each time to monitor the cluster and see that it's not throwing new errors. So far, so good. We have about 80% of our page views (40% of our page views are our front page) and about 50% of our end-user applications using the cluster now. We are doing caching at the proxy level for a lot of this. But, when tested, the new architecture is much more reliable even without the caching proxies. Some things, like our forums, will never translate to the cluster. But, they have their own dedicated systems already and are non-critical for our business. They could be shut down if there was a problem with them.

Administration

MySQL Cluster is a whole new animal. It's not like monitoring mysqld, apache or other stuff we already use. It took me a while to get the hang of rolling restarts, bringing nodes up and down after crashes, etc. We have had just one crashed node since we switched over to production use. The cluster stayed up and kept serving content. We have written a Nagios monitor to keep track of the nodes' status. It uses ndb_mgm and reports any problems to us.

Feedback

Now, as the title says, I have only been using MySQL Cluster for 5 months. If you are reading this and have more experience and are thinking "What a moron!", please tell me. We are still learning.

Update:

Ronald Bradford had some questions on his blog for me.  I figured I would just answer them here.

You didn’t mention any specific sizes for data, I’d be interested to know, particularly growth and how you will manage that?

We currently have a DataMemory of 4GB and IndexMemory of 2GB.  Based on the crude methods we have to monitor it, I think we are at about 40% capacity.  We are using MySQL Cluster purely as a data store for content on our web site.  So, we can trim the data store down significantly.  If it does not appear on the site, it's not in the cluster.

You also didn’t mention anything about Disk? MySQL Cluster may be an in-memory database but it does a lot of disk work, and having appropriate disk is important. People overlook that.

Yes, we have U320 15K SCSI drives.  We do use RAID 1 on our servers, contrary to some opinions.  We see a lot of drive failures, about one every 4 months.  It sucks to lose a whole machine just because a $200 drive failed.

You didn’t mention anything about timings? Like how does backups for example compare now to previously.

Well, we don't currently back up the cluster data as it is being copied from our main database already.  Maybe that is a mistake, I don't know.  But, I can't come up with a reason to back up data that is just a copy of another database server.  Also, I have written a PHP class that does parallel writing to multiple servers using transactions.  Everything we write to the cluster also gets written to an "oh shit" MySQL server that uses InnoDB.  So, in the event we have a total cluster failure, our F5 BIG-IP load balancers will send MySQL traffic to the InnoDB server.

You didn’t mention version? 5.1 (not GA) is significant improvement in memory utilization due to true varchar support, saving a lot of memory, but as I said not yet production software.

Yeah, I am drooling over 5.1.  But, we are using the current Gentoo stable, 5.0.38 I believe.  5.1 looks superior in many, many ways.  I can't wait to upgrade.