How NOT to get support and how to turn the other cheek.

So, I checked my email this morning and found this jewel:
I might use Phorum if you brain deads knew how to upload or download your files via FTP. Your documentation has no order to it, its all a mess. I even dropped a release level to see if it was just that release. Ill give you a clue, DONT TRANSFER YOUR FILES VIA AUTO, EXPECIALY YOUR TXT FILES. TRANSFER THEM IN ASCII MODE ONLY, THIS INCLUDES YOUR PHP FILES. Then you just f---ing* MIGHT get readable files. Now you might say hey wait a min, we have full documentation on our web site, but you forget, someone has to open the sample.config.php file and read the crap that resides there.

* edited for content

Should I respond?  If so, how?  I decided to respond in as nice a way as I could.  I normally don't answer direct support emails, and I don't normally answer very angry emails either.  However, I viewed this one as an educational opportunity.

Judging by your email, I would say you are using Notepad on Windows to edit and read files.  That is mistake number one.  Notepad only reads one file format: Windows text files.  Windows natively uses CRLF for its line endings; it is the only major operating system that does so.  Notepad is about the only application on the Windows platform that cannot cope with anything else.  If you used WordPad instead, this would not have been a problem for you.  For some background on the subject, you may want to read:

http://en.wikipedia.org/wiki/Newline
http://www.cs.toronto.edu/~krueger/csc209h/tut/line-endings.html

Because PHP scripts are most commonly deployed on Linux, the Unix line feed (LF or \n) is best for PHP applications.  Here are a few suggestions for good text editors on Windows.

TextPad - http://www.textpad.com/
Metapad - http://www.liquidninja.com/metapad/
PSPad   - http://www.pspad.com/en/

I hope this has helped educate you on the world of new lines and how real programming works.  In the future, a kind word in the forums would be much more appreciated than an email like this.  Not all people would be as kind as I am being and want to help you grow.
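
As an aside, for anyone who would rather convert the files than switch editors, the conversion is trivial.  A hedged little PHP sketch (going LF to CRLF so Notepad is happy; swap the arguments to go the other way; the file name is just an example):

<?php
// Convert Unix (LF) line endings to Windows (CRLF) so Notepad is happy.
// Assumes the file currently uses plain LF endings.
$file = 'sample.config.php';
$text = file_get_contents($file);
file_put_contents($file, str_replace("\n", "\r\n", $text));
?>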

What do you think?  Should I have just let this guy go?  Should I have been as ugly to him as he was to me?

MacBreak missing a demographic

I listen to the MacBreak Weekly podcast every week.  I have liked Leo Laporte ever since The Screen Savers days.  He has several good regulars on the show and mixes in topical guests as well.  However, I think there is a demographic of Mac user that the show is missing.

There is a growing audience of new Mac users in the tech sector.  Just go to the O'Reilly Open Source Conference and take a count.  Mac OS X and the switch to the Intel platform have brought about the most stable, easy-to-use *nix-based desktops and laptops the world has ever seen.  I was a long-time Windows user.  I made fun of Mac users.  I even ran Linux on a Dell laptop for a while.  Boy, that was fun.  Nothing like waking up and having to edit X configurations before you can work.  Apple just got it right.  I can run my AMP stack on my MacBook Pro with no problems.  And the Mac UI is wonderful.  I am becoming a fanboy.

So, on this week's MBW, Leo and the panel were talking about Leopard.  The subject came up of the best new feature for home users, power users, and Mac software developers.  No one on the show fit into my demographic of Mac user, and no one mentioned us.  No mention of Apache 2.2 or PHP 5.2.  No mention of a much improved Terminal.app.  No mention of a built-in SSH agent that works with your keychain.  If you work with Linux/BSD servers, you use Terminal almost as much as any other application.

So, Leo, please include this growing Mac demographic in your discussions.  There has to be someone out there in our space who is as knowledgeable as Andy Ihnatko and Scott Bourne are about their topics.  Merlin comes close when he is there, but I think he is still an old-school Mac user who happens to have gotten into the geekier parts of Mac OS X.

Still, love the show.  Keep up the good work.

O'Reilly Open Source Conference Day Two

So, day two was the cool keynote day.  Day one keynotes were from Tim O'Reilly (not that he is not cool) and the vendor sponsors.  The Intel building blocks stuff was neat, but most of it was vendor stuff IMO.

Today we had the "cool thing to hear and see, but I probably won't use it" keynote.  It was the Processing Development Environment.  It was really cool.  You can read more about it at processing.org.

The next keynote was hard for me to follow.  There were no slides; he stood behind the podium the whole time.  Gnat seemed to love it, as he told us all in IRC.  You can read the guy's blog at overcomingbias.com.  It was basically about overcoming the biases you have.... I think.

Interestingly (speaking of bias), the next keynote was from Microsoft.  Coincidence?  According to the speaker, MS (or at least this guy) is really trying to make some Open Source stuff.  Time will tell.  Also, they are "working" with the OSI to get their licensing approved as Open Source licenses.  As someone in IRC said, it's a win/win for them.  If they don't get approved, they can just blame the OSI for being inflexible.  Nate kind of put him on the spot about patents after his talk.  He handled it well and kind of rode the fence.

The last keynote was, for me, the payoff keynote.  It's the one I will remember the most from this year.  It was about branding.  The poor guy did not have his slides due to technical issues and still did a great job.  You can read Steve's blog at steve-yegge.blogspot.com.  Maybe he will post the slides.

I attended a couple of good sessions today.  One was about caching, mostly with APC.  But if you stripped out the APC specifics and just took his concepts, you could apply them to lots of caching methods.  The talk was given by Gopal Vijayaraghavan of Yahoo!.  I don't have a URL for the site where his slides may end up.  If I find it, I will post it.
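
I won't try to reproduce his slides, but the basic read-through idea with APC looks something like this (my own hedged sketch; get_frontpage_deals() is a made-up stand-in for the expensive work):

<?php
// Cache an expensive lookup in APC for a short TTL.
function cached_frontpage_deals() {
    $key  = 'frontpage_deals_v1';
    $data = apc_fetch($key);
    if ($data === false) {
        $data = get_frontpage_deals();  // hypothetical expensive DB work
        apc_store($key, $data, 300);    // cache for 5 minutes
    }
    return $data;
}
?>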

Another one was about legacy PHP code.  I didn't agree with 100% of what he was saying, but if you are in the boat he described, anything is better than where you are.  The guy's site is clintonrnixon.net.  Hopefully he will put up the slides and maybe write a blog post about it.

The last talk I want to tell you about was from Amy Hoy.  She gave her "When Interface Design Attacks!" talk again this year.  Just like last year, it was brilliant.  There were new topics like Web 2.0.  I was happy to see that the Phorum 5.2 template I have been working on (emerald) already includes many of her recommendations.  I guess she rubbed off on me last year.  Amy has started her own consulting company.  If we need usability and/or interface design help again (bleh, the last one was less than exciting), I will push for using her for sure.  Check out her site (linked above) for more stuff from her.

The day (and really the conference) ended with parties.  We went to the SourceForge Open Source Awards party.  phpBB won best tool for communication.  Gag me with a chicken bone.  I guess it has a large install base.  But MySpace has lots of users too.  That does not mean it's not a black eye on the internet.  OK, MySpace is worse than phpBB for sure.  But c'mon, I write Phorum.  I am biased (see above keynote =).  It was a popularity contest, and I guess there are more kiddies to vote for them than for, say, Pidgin, which is what I voted for.  With all the trouble they have had with their name, I wonder if "Gaim" would have gotten more votes (see the other keynote on branding =).  The phpBB team may need to see the branding keynote from this morning.  It talked about how it takes a generation to change perception of a brand.  Most people I talked to here have a negative reaction to the phpBB brand.

The rest of the night we just hung out at the party hosted by Jive Software.  We use Openfire from those guys.  I am not a big Java user on the server.  It's just one more different thing to admin in a company that is 99% GNU C apps on the servers.  But Openfire does a damn good job with XMPP.

In closing, the O'Reilly Open Source Conference was great.  I got some great ideas about stuff we should be doing.  I got confirmation of things we are already doing.  And most importantly, IMO, we got to share with others how we solve problems.  As Gopal said in his caching talk, sometimes it's better to stop doing stuff and tell others what you are doing (paraphrasing).

O'Reilly Open Source Conference Day One

Day one is complete.  Portland is great as always.  It's really day 1 1/2 since we got in at 1PM yesterday.  That allowed us to go to the MySQL/Zend party last night.  Great party by those guys.  Touched base with old friends and made some new ones.

I kind of session-hopped today.  Of note, I attended Andi Gutmans' PHP security talk, which really had little to do with PHP.  Like Larry Wall's onion metaphor, Andi presented an onion metaphor for security.  I stopped in for a while on the Solr talk.  It looks neat.  I like that it is a REST interface to Lucene.  If we were not using Sphinx already I might take a longer look.  But we like Sphinx, and Solr and Lucene are Java.  Not that there is anything wrong with that; we just don't use Java a lot, so it's just one more thing that would be out of the norm.  I admit I spent a good bit of time in what is being called the "hallway track" working on some code.  Work does not stop just because you are at a conference.

I got to hang out with Jay Pipes of the MySQL Community team a good bit.  We talked about the MySQL forums (which of course run Phorum) and how they want to improve them.  They would like to see tagging, user and post rating, and some other things.  Some good things will come out of that.  Hopefully they have some of the tagging stuff done already at MySQL Forge and can contribute that code to Phorum, saving us time.

I hosted the "Caching for fun and profit" BoF.  It was not packed, but it was a good time.  The MySQL BoF was at the same time, so we lost some folks to that, I am sure.  They had beer and pizza.  Brad Fitzpatrick did come by and contribute.  Thanks, Brad.  It was mostly the same stuff you get on the memcached mailing list.  "How do we expire lots of cache at once?"  Questions about different clients.  Stuff like that.  It kind of turned into a memcached BoF, but I tried to share the dealnews experience with the attendees, including our push-based caching backed by MySQL Cluster.
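
For the "expire lots of cache at once" question, the usual answer is the namespace/version-key trick.  Here is a rough sketch of the idea using the stock PECL Memcache class (my own hedged example, not anyone's production code):

<?php
// Expire a whole group of keys at once by bumping a version number
// that is baked into every key in the group.
$mc = new Memcache();
$mc->connect('127.0.0.1', 11211);

function group_key($mc, $group, $key) {
    $ver = $mc->get("ver:$group");
    if ($ver === false) {
        $ver = 1;
        $mc->set("ver:$group", $ver);
    }
    return "$group:$ver:$key";
}

// Write and read as usual, always building keys through group_key()...
$deals = array('deal1', 'deal2');
$mc->set(group_key($mc, 'frontpage', 'deals'), $deals, 0, 300);

// ...then "expire" every frontpage key in one shot.  The old keys simply
// become unreachable and fall out of memcached's LRU on their own.
$mc->increment('ver:frontpage');
?>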

I have met many readers of both dealnews and this blog (hi to you) while here.  Glad to know that both my professional work and my personal work are of use to folks.  The demographic at this conference is dead on for dealnews.  Maybe I can get them to sponsor it next year.  That would be cool.

I say every year that I want to present "next year".  Something always keeps me from doing it.  Usually it's just not having time to prep for it.  By the time I think about it, the call for papers has passed.  I really want to get it done this time.  We shall see, I suppose.

We went to the Sun party tonight.  It was a good time.  There was beer that was free as in beer.  More hanging with friends and talking about all kinds of stuff.  Now, all you Slashdotters sit down.  I saw people from the PostgreSQL and MySQL teams drinking beer and having fun together.  OMGWTFBBQ!!!1!!  See, the people that really matter in those projects don't bicker and fight about which is better.  They just drink beer and have a good time together.

Anyhow, I will blog more after day 2.  There won't be a day 3 for me, as I have to catch an 11:30 flight back home.  That is usually how it goes.  Not sure why they book anything on Friday, really.  Even O'Reilly has its "after party" on Thursday night.  It's late, and I need sleep.

MySQL cluster and all dump 1000

So, I had written a while back: "We currently have a DataMemory of 4GB and IndexMemory of 2GB. Based on the crude methods we have to monitor it, I think we are at about 40% capacity." Boy, I was wrong.

After that post, I started looking at this more in detail because we were considering buying more RAM "just in case". I figured out how to use the super secret command "all dump 1000". The command is not documented in the MySQL documentation that I could find. I did find it in the NDB API documentation before writing this post however. Not sure why I could not find it before.

For those that still don't know how to use it, simply type "all dump 1000" from your management console. Then check your cluster log files on the management server. You will see something like this:

2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 2: Data usage is 46%(90672 32K pages of total 196608)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 2: Index usage is 1%(2006 8K pages of total 131104)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 3: Data usage is 46%(90724 32K pages of total 196608)
2007-07-20 17:56:47 [MgmSrvr] INFO -- Node 3: Index usage is 1%(2039 8K pages of total 131104)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 4: Data usage is 43%(86153 32K pages of total 196608)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 4: Index usage is 1%(2016 8K pages of total 131104)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 5: Data usage is 46%(90672 32K pages of total 196608)
2007-07-20 17:56:48 [MgmSrvr] INFO -- Node 5: Index usage is 1%(2007 8K pages of total 131104)

Anyhow, I ran that and lo and behold, I saw that we were at about 93% capacity.  As you can see above, we have made some changes since.  This got me really digging into what the difference could have been.  As far as I can tell, our use of TEXT fields in our data was causing the issue.  We have several fields in our data structure that hold data larger than 256 bytes.  However, they are hardly ever more than 600 bytes.  Based on what I read, it seems that the data after the first 256 bytes is stored in 2k chunks.  So, our little ~300 byte extras were being stuck into 2k chunks.  That caused a huge amount of wasted space.
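
To put rough numbers on it (my own illustration, assuming one 2k overflow chunk per TEXT column per row):

~600 byte TEXT value = 256 bytes in-row + ~344 bytes spilled into a 2048 byte chunk
wasted per row, per TEXT column = 2048 - 344, roughly 1.7KB
1,000,000 such rows = roughly 1.7GB of DataMemory lost to padding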

So, the first thing I realized was that we could use 6GB of DataMemory instead of 4GB.  The machines have 8GB of RAM in them, and we were using only about 200MB of IndexMemory.  That still leaves well over 1GB for the operating system.

The second thing I did was carefully analyze our data.  I went back to school and started counting bytes.  It's easy to get lazy as a developer these days.  We just use a data type we know will work and it really is no big deal.  But not in this case.  I changed int columns to mediumint, and some even to smallint and tinyint.  I realized that there were two TEXT fields I could eliminate altogether.  I looked at the varchar fields and made them only as long as they needed to be.  In the end, we ended up at the numbers you see above, about 46% capacity or 2.76GB.  We were at 3.72GB before.  The removal of the two TEXT fields may have had more to do with the improvement than anything.

This was all with MySQL 5.0. 5.1 will bring better storage of the large data. For example, all data over 256 bytes, for a row, will be stored together in 2k chunks rather than one 2k chunk per column. That will likely save us 1GB of DataMemory. The other feature of 5.1 is Disk Data Tables. We are currently testing those and I will blog about them when I have more information. Early numbers look good though. We have just set our largest table (the one with the TEXT fields) as a disk data table and our DataMemory is down to just 280MB. Yeah, that is an MB. Our COO is a bit concerned with the performance hit we may see with data on disk.

Oh, one more pointer about all dump 1000. For a quick way to grab my current usage, I use this command from my prompt (on a Mac):
ssh user@mgmserver "ndb_mgm -e 'all dump 1000' && tail -n 1000 /var/lib/mysql-cluster/ndb_1_cluster.log | fgrep usage | tail -n 8"
Use at your own risk. It may make your server catch fire. =)

RAID is dying?

There are a bunch of posts on Planet MySQL this week about RAID.  This comment from Kevin Burton really kind of made me go "huh?":
You’re thinking too low level. Who cares if the disk fails. The entire shard is setup for high availability. Each server is redundant with 1-2 other boxes (depends on the number of replicas). If you have automated master promotion you’ll never notice any downtime. All the disks can fail in the server and a slave will be promoted to a new master.

Monitoring then catches that you have a failed server and you have operations repair it and put it back into production as a new slave.

Someone has to think low level.  The key phrase in there is "you have operations repair it and put it back into production as a new slave."  This tells me all I need to know.  Kevin later states that his company does not, in fact, operate its own equipment, but uses a provider for all of its hosting.

At this point, I think this is a philosophy argument and not a real-world application argument.  Sure, I guess if I am Google or Yahoo! I can do this.  But for the vast majority of web sites running out there, having 4 data centers and "operations" at your beck and call is not a reality.  For real people, having a server go down is a pain in the ass.  Why would I want to spend a full day of labor rebuilding a server because a $200 part broke or just got corrupted?  It takes 10 minutes to start a rebuild and maybe another 10 minutes to install a new drive if the rebuild fails.

His other argument is about performance.  Sure, it's debatable whether RAID is faster or slower.  It probably depends on the application.  If your RAID is a bottleneck for your application, then you need to address that.  For us, it's far from the bottleneck, so why take the downtime of having one of our 30 (not 1,000) servers down?

BTW, would you rather admin 30 servers or 1000?  I think 30.

I should add that we only use RAID on servers that are used for data storage.  Losing data sucks.  For web servers we don't use RAID.  They do fit the model that Kevin describes.  We have a lot of them.  If one goes down, it's OK.  Maybe Kevin's application can fit all its data on one web node.  I don't know.  I just know RAID is right for us, and I don't see a future where I won't want it on our servers.  We are even using RAID in our MySQL Cluster servers.  Why?  Because I don't want to have to wait a day to get a storage node back up and running over a $200 part.

Five months with MySQL Cluster

So, the whole world changed at dealnews when Yahoo! linked us. We realized that our current infrastructure was not scaling very well. We had to make a change.

The Problem

Even though we were using all sorts of cool techniques, the server architecture was really still just a bunch of web servers all serving the same content. In addition, our existing systems at the time used a pull method. When a request came in, memcached was checked; if the data was not there, it was fetched from our main MySQL server. So, when there was no data in the cache, or when it expired, things got very bad. Like when Yahoo! hit us: some cache item would expire, 60,000 users would hit a page, and every one of those requests would try to create the cache item.
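
The pull pattern looked roughly like this (a hedged reconstruction, not our actual code; get_deal_from_db() is a made-up stand-in for the expensive query):

<?php
// Classic read-through ("pull") caching with the PECL Memcache client.
// The problem: when a hot key expires under heavy traffic, every
// concurrent request misses at once and all of them hammer the
// database to rebuild the same item.
function get_deal($memcache, $db, $deal_id) {
    $key  = "deal:$deal_id";
    $data = $memcache->get($key);
    if ($data === false) {
        $data = get_deal_from_db($db, $deal_id);  // expensive query
        $memcache->set($key, $data, 0, 600);      // 10 minute TTL
    }
    return $data;
}
?>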

The Solution

I was tasked with two things: find a way to handle something like the Yahoo! burst, and find a way to store the data we need to generate our web pages that was highly available and would scale. For bursting, I wrote a proxy using Apache, mod_rewrite, PHP, and memcached. I have reasons I did it this way that are not relevant to this post. Maybe more on that later.

For the data solution, I considered several things: MySQL replication, writing my own replicating memcached client, and other exotic ideas. One of the semi-exotic ideas for us was MySQL Cluster. We had not used it at all. Some things about it made us gun shy. But, we tested it and were very happy with the results.

Initial Test

With the help of Gentoo, getting a cluster up and running was really, really easy. In fact, it seemed too easy. We ran a cluster on some dev boxes at first. We did some generic testing using the PHPTestSuite from the guys at MySQL Performance Blog. What we found was that while the cluster appeared slower at low concurrent connections, it scaled much better than InnoDB (our preferred storage engine) as the number of concurrent connections grew.

Application Testing

So, we moved to the next step: testing our application. We discovered early on with the cluster that we would have to redesign our application. Our DB was highly relational. Almost no data could be put on the site without data from other tables. We used a lot of joins. We learned (later) that joins in the cluster are not a good idea. Neither are sub-selects. So, we wrote some proof-of-concept scripts for our application. We were very happy. Very few issues were found. Nothing anywhere near show-stopping.
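
Most of the rework amounted to replacing joins with keyed lookups and stitching the rows together in PHP. A hedged illustration with made-up table and column names (not our actual schema):

<?php
// Instead of one JOIN across tables, do two primary-key lookups.
// NDB is very good at single-table keyed reads; it is the cross-node
// joins and sub-selects that hurt.
$link = mysqli_connect('sqlnode1', 'user', 'pass', 'site');

$deal = mysqli_fetch_assoc(
    mysqli_query($link, "SELECT * FROM deals WHERE deal_id = 42"));
$cat  = mysqli_fetch_assoc(
    mysqli_query($link,
        "SELECT name FROM categories WHERE cat_id = " . (int)$deal['cat_id']));

$deal['category_name'] = $cat['name'];
?>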

Installation

We ordered our servers: six new Dell dual-core, dual-processor Opterons with a lot of memory. Two would become SQL nodes and the other four would be storage nodes. Our data set is not that large compared to a lot of companies', so we configured the cluster with 4 replicas. Our main goal is high availability and scalability. I could find nothing in my tests or in the manual that indicated this would be bad for scalability, and it should be great for HA.

We rewrote our application (basically, our public web site) to use the new cluster and its new table design. We hit our first snag when we tried to seed the data in the cluster. We got errors from the cluster about its transaction logs not being big enough to handle the inserts. Through the manual, forum posts, the mailing list archives and some blogs I was able to find the correct settings for our needs. I remembered back when I first installed the cluster thinking it was too easy. I now realize that getting a cluster running is easy. Making it run well, is a whole other story.

The second snag was with joins. Our test bed for the cluster was not a cluster; we used a group of servers running InnoDB to test against. That was a mistake. Joins did not work at all with the cluster. We had to back up, rewrite some code, and redo some tables. In the end, the new design is probably faster on either InnoDB or the cluster.

Everyday Use

We started using the cluster for everyday use about a month ago. I guess 5 months is not bad for going from nothing to live in production. We have been slowly moving applications to it. We take care each time to monitor the cluster and see that it's not throwing new errors. So far, so good. We have about 80% of our page views (40% of our page views are our front page) and about 50% of our end-user applications using the cluster now. We are doing caching at the proxy level for a lot of this. But, when tested, the new architecture is much more reliable even without the caching proxies. Some things, like our forums, will never translate to the cluster. But they have their own dedicated systems already and are non-critical for our business. They could be shut down if there was a problem with them.

Administration

MySQL Cluster is a whole new animal. It's not like monitoring mysqld, Apache, or other stuff we already use. It took me a while to get the hang of rolling restarts, bringing nodes up and down after crashes, etc. We have had just one crashed node since we switched over to production use. The cluster stayed up and kept serving content. We have written a Nagios monitor to keep track of the nodes' status. It uses ndb_mgm and reports any problems to us.
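
The check itself does not have to be fancy. Something along these lines (a hedged sketch in the spirit of ours, not the exact plugin; the management host name is made up):

#!/usr/bin/php
<?php
// Nagios-style check: ask the management node for cluster status and
// go critical if any node reports as not connected.
$out = shell_exec("ndb_mgm -c mgmserver:1186 -e show 2>&1");
if ($out === null || stripos($out, 'not connected') !== false) {
    echo "CRITICAL - NDB node not connected\n";
    exit(2);   // Nagios CRITICAL
}
echo "OK - all NDB nodes connected\n";
exit(0);   // Nagios OK
?>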

Feedback

Now, as the title says, I have only been using MySQL Cluster for 5 months. If you are reading this and have more experience and are thinking "What a moron!", please tell me. We are still learning.

Update:

Ronald Bradford had some questions on his blog for me.  I figured I would just answer them here.

You didn’t mention any specific sizes for data, I’d be interested to know, particularly growth and how you will manage that?

We currently have a DataMemory of 4GB and IndexMemory of 2GB.  Based on the crude methods we have to monitor it, I think we are at about 40% capacity.  We are using MySQL Cluster purely as a data store for content on our web site, so we can trim the data store down significantly.  If it does not appear on the site, it's not in the cluster.

You also didn’t mention anything about Disk? MySQL Cluster may be an in-memory database but it does a lot of disk work, and having appropriate disk is important. People overlook that.

Yes, we have U320 15k SCSI drives.  We do use RAID 1 on our servers contrary to some opinions.  We see a lot of drive failures.  About one every 4 months.  Sucks to lose a whole machine just because a $200 drive failed.

You didn’t mention anything about timings? Like how does backups for example compare now to previously.

Well, we don't currently back up the cluster data, as it is being copied from our main database already.  Maybe that is a mistake, I don't know.  But I can't come up with a reason to back up data that is just a copy of another database server.  Also, I have written a PHP class that does parallel writing to multiple servers using transactions.  Everything we write to the cluster also gets written to an "oh shit" MySQL server that uses InnoDB.  So, in the event we have a total cluster failure, F5 BIG-IP load balancers will send MySQL traffic to the InnoDB server.
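
The class is more involved than this, but the core of the idea is small (a hedged sketch, not the actual class; host names are made up, and this is best-effort rather than a true two-phase commit):

<?php
// Run the same write against the cluster and the InnoDB fallback
// inside transactions; only commit if both succeed.
function dual_write($sql) {
    $cluster  = new mysqli('sqlnode1', 'user', 'pass', 'site');
    $fallback = new mysqli('ohshit-db', 'user', 'pass', 'site');

    $cluster->autocommit(false);
    $fallback->autocommit(false);

    $ok = $cluster->query($sql) && $fallback->query($sql);

    if ($ok) {
        $cluster->commit();
        $fallback->commit();
    } else {
        $cluster->rollback();
        $fallback->rollback();
    }
    return $ok;
}
?>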

You didn’t mention version? 5.1 (not GA) is significant improvement in memory utilization due to true varchar support, saving a lot of memory, but as I said not yet production software.

Yeah, I am drooling over 5.1.  But, we are using current Gentoo stable, 5.0.38 I believe.  5.1 looks superior in many many ways.  I can't wait to upgrade.

Quick script to check user bandwidth usage

A buddy needed a quick report to see if one of his users was slamming his site. I got a little carried away and wrote a PHP script (plus some awk and grep) to make a little report for him. I am sure it is full of bugs and will bring your server crashing down. So, use at your own risk.

$ ./bwreport.php -h
Usage: bwreport.php [-d YYYYMMDD] [-u URI] [-i HOST/IP] [-r REGEXP] [-v]
-d YYYYMMDD Date of the logs to parse. If no date provided, yesterday assumed.
-i IP/HOST Only report log lines with IP/HOST for host part of log line
-r REGEXP Only report log lines that match REGEXP. Should be a valid grep regexp
-u URI Only report log lines with URI match to URI
-v Verbose mode

http://www.phorum.org/downloads/bwreport.php.gz

Vixie Cron and the new US DST

So, the new DST changes in the US caused a small stir among system administrators recently. We got all of our servers updated and verified they were working before the event. Or so we thought.

I noticed today that our 3PM Eastern newsletters arrived in my inbox at 3PM, all right. However, I am in Central time. My immediate assumption was that we missed the server that sends that email out. Logging in, I found the time correct on the server. It had received the appropriate updates thanks to Portage. So, what happened? I looked at /etc/crontab and all was fine. I then looked at the system log where cron jobs are logged. Oddly, that log line said the job started at 15:00. I knew that was not correct. I started looking around at other cron jobs on other servers, especially ones that write files to disk. Sure enough, every server I checked was doing things an hour behind except one. It just so happens we had restarted cron on that one server last week. We had to shut it down to keep it from causing errors while we updated the server.

So, long story short, we restarted cron on all the servers. That seems to be the only thing needed. These servers (and crond on them) had been running since long before the DST change was even announced. I guess Vixie cron can't handle time zone rules changing after it starts. For the record, we are using the latest stable version in Gentoo's Portage.