Altering several things in a MySQL table at one time

Sun, Feb 22, 2009 06:00 PM
To some kick ass DBA or someone that works on MySQL internals, this is probably a no brainer.  But, for just joe schmoe power user types, we wonder about these things.  We could not find the answer on the internet, so I decided to blog about it.  We expected the result we found, but I think it's good information.

Have you ever had to make changes to several indexes or columns in a table?  With Phorum, we keep a series of scripts that run queries to upgrade the database as we make changes.  Recently I had to delete about 12 keys and add about 7 back in their place.  I initially thought to make two sql statements.  One to delete indexes and one to create new ones.  But, Thomas, one of the other developers, wondered if that was really any better.  So, I decided to run some tests.

First I altered a table with 70k rows, dropping one key.  That took 16 seconds.  I then added a key to that same table.  Again, about 16 seconds.  So, I then dropped a key and added a key in one query.  Again 16 seconds.  So, I decided to go for the ultimate challenge.  I dropped 12 keys and added back 12 keys all in one query.  TADA!  16 or so seconds.  The table was an InnoDB table, so I repeated after converting the table to MyISAM.  Again, all the alters took about the same time, regardless of the number of changes.

Update


I was told in the comments that the data was not big enough for a real test.  I hold to the fact that because, as Eric Bergen pointed out, the most time is taken in copying the data, the data size is not that important here.  However, I tried it out on a table with 750k rows just to see.  Dropping 4 keys on a table with 12 keys took 1min 46sec.  Adding 4 keys to the same table, (now with 8 keys) took 1min 52sec.  Adding 4 keys and dropping 4 keys in one alter table took 1min 49sec.  So, looks like I was right.  For fun, I decided to drop all the keys except the primary.  That only took 28sec.  That is what I guessed.  The slow part is copying the table to a new table.  That includes filling in indexes.  It is much faster to write a table with indexes than it is a table with 12 indexes.  However, that was not my use case and not the reason for my testing.  But, it is good information to attach to this blog post.
7 comments
Gravatar for Eric Bergen

Eric Bergen Says:

This is because alter table copies the entire table to a temporary table and makes changes on the fly. The bulk of the time spent in alter table is copying all the data. Here is a blog post I wrote about how alter table works under the hood as well as some of the basic problems:

http://ebergen.net/wordpress/2007/05/07/how-alter-table-locks-tables-and-handles-transactions/

Gravatar for Brian Moon

Brian Moon Says:

Yeah, I understood the how, but not the how long. I was not sure how doing multiple operations would affect the speed of that table rebuild. I am drooling over the (I know limited) online alter table possibilities in MySQL 5.1. Adding a value to an ENUM without a full table rebuild.... Brilliant!!

Gravatar for Mark R

Mark R Says:

70k rows? Your tables are rediculously small. How can it take 16 seconds to do anything with such a tiny table?

Try doing ALTER TABLE on significant tables, that takes time.

With such a micro-tiny database indexing is hardly worthwhile so you may as well just drop the indexes.

Gravatar for Brian Moon

Brian Moon Says:

It's a table I had handy and it's the box we have for our humble open source project. Email me and I will give you my mailing address. You can send over a new database server that meets your lofty specifications. We have installed user bases with tens of millions of rows. So, the speed of doing more than one operation in an alter vs. doing one operation per alter is interesting regardless of the data size, IMO.

Gravatar for Mark R

Mark R Says:

I understand that in this example you're trying it with small data, but I usually find that problems only show up when you use production-size data (or bigger if possible, to cater for some growth).

Creation of production-size data is sometimes nontrivial, especially when you want to ensure that the row size, index spread etc is similar to production and have consistent data so your application can use it. I've spent a lot of time writing test-data creation tools for this.

happy ALTERing

Mark

Gravatar for Justin Swanhart

Justin Swanhart Says:

What storage engine are you using? If you are using MyISAM, consider setting the value of myisam_repair_threads >1. This will cause MyISAM to build the new indexes in parallel after copying the data.

You seem to be adding lots of indexes. Are you using composite indexes as covering indexes to reduce i/o by storing frequently scanned columns into the key buffer?

Gravatar for Brian Moon

Brian Moon Says:

I am using InnoDB on all of my production databases. No, we are not using covering indexes. We just have lots of ways that the data is accessed. We have 12 keys not including the primary key. This is actually down in this version from about 16 keys. I went through a query optimization and reorganization a couple of months ago. So, there were lots of key changes. We only do stuff like this in BC breaking releases (for us 5.3, 5.2 is stable).

Trust me, I have been at MySQL Conference, sitting at tables with multiple MySQL Support staff looking at different parts of the schema for Phorum. MySQL runs Phorum and have been told any query and/or schema changes that they found useful. If I could limit the keys more I would. These are the keys needed to make every aspect of the application fast.

Comments are disabled for this post.