Backing up MySQL Databases

Posted by Sheldon Finlay on April 30, 2009

Most people backup their web site files. But with so many sites being database-driven it is equally important to make regular MySQL database backups. If you are running a blog, a forum or any database-driven web site and your database gets nuked, well there goes all your content. How long will it take to restore it and get your site back online? If you have a backup, the data loss should be minimal and restoration of your backup should take no time at all.  If you don’t have a backup you are probably screwed and all your work has disappeared into the ether.

What options do you have for creating a MySQL database backup? There are plenty and many of them can be automated using shell scripting or cronjobs so you can set them and forget them (until you need them!). Let’s look at a few manual backup methods. In a later post I will talk a little bit about automating backups.

PhpMyAdmin

Most of you are probably familiar with PhpMyAdmin, the PHP web-based administration backend for MySQL. PhpMyAdmin has a tab called export where by you can backup your entire database or just selected tables. The default export settings are usually perfect, although you can tweak them according to your needs, although you should always export as SQL, as that is the format your database will expect if you ever need to restore it. Lastly, you want to save the file to your local computer and to save space and transfer time, compress it in the format of your choice.

mysqldump

mysqldump is a command line utility which pretty much does the same thing as PhpMyAdmin except it’s shell based and not web based. So you will need a shell account on the server where your database resides. The basic syntax for using mysqldump is:

mysqldump -u [user name] -p[password] [database name] > [dump file]

So you will need to provide your database username, database password, database name and finally a filename which will be the schema file containing the entire database structure and data.

If your database is large you might want to compress it to save space and transfer time. You can backup the database and compress the database in one swoop like so:

mysqldump –-user [user name] –-password=[password] [database name] > [dump file] | gzip > [dump file]

Now, to restore a database with mysqldump you would issue do something like this:

mysql -u [username ] -p[password] [database name] < [dump_file]

This same command could also be used for a schema file which you created with PhpMyAdmin. You would of course need to upload the schema file to your account. I find working with schema files through the shell a lot faster than working with PhpMyAdmin. PhpMyAdmin can also choke on your schema, particularly if you are saving binary info to your database.

mysqlhotcopy

mysqlhotcopy is another command line utility for creating MySQL database backups. It has many advantages over mysqldump:

  • It’s very fast. So it’s suitable for large databases which would be both time consuming to backup as well as restore.
  • It makes exact copies of the actual files which make up the database tables.
  • It ensure data integrity by locking and flushing the tables while they are being copied. Other database backup methods can export a backup while a record is being written which can lead to a corruption in the backup.

Now, despite it’s overwhelming positives, mysqlhotcopy comes with one large downfall which makes it not practical for everyone. To restore the files, you need to have access to the actual files which make up your database. If you are on a shared web host or don’t have root access on your database server you won’t be able to restore the files yourself. You might be able to ask the web host to copy the files and flush the database for you. But you should check before you use this method, lest you get stuck with a backup which you can’t restore.

So how do you use mysqlhotcopy? It’s a shell command similar to mysqldump:

mysqlhotcopy -u [username] -p[password] db_name [/path/to/new_directory]

What this will do is lock and flush the tables in your database and make an exact copy of those files to the directory your specified. To restore your database, you simply copy these files into your database directory and flush or restart MySQL. The location on my server would be /var/lib/mysql/table_name/, but yours might be different.

So this covers some manual backup methods. Manual backups are better than nothing, but they still leave something to be desired. It can be time consuming to run backups manually and would be better to have them run automatically at regular intervals. It would also be ideal to have some sort of backup rotation where you could choose between restoring yesterday’s backup,  last week’s backup or anything in between. This is a topic I will try to cover in a future post. But for now, just start backing up your databases.

Adding Indexes to Your Database to Improve Performance

Posted by Sheldon Finlay on April 24, 2009

Working with a relational database is easy, just define your tables and fields and you’re good right? Well, not quite. It’s important to visual how your tables are going to relate and how your queries are going to interact with your database structure. This is where having the proper indexes on your table will speed up your queries dramatically and make your application perform like a champ (at least on the database side).

Indexes? In My Database?

Let’s jump right into it and look at how MySQL accesses records in a single database table. Say you have a simple users table which is structured like this:

CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`email` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
)

Now, let’s say we have a few thousand users in this table and we do a simple query like this:

SELECT * FROM users WHERE name = "John Smith"

MySQL has absolutely no idea where this record is found in the table. It doesn’t know if there is one record for John Smith or a dozen. So it has to do a full table scan reading though every single row in the database looking for a match in the name column. Now this isn’t too bad if your table has only a small number of records. But once your table rows grows to a certain point, say a couple thousand records, this non-indexed table can become quite slow. So what can we do?

An Index in Time Saves Nine

The first general guideline of indexing is to add an index to any field you are querying on. In this case we are querying on the name field. So if we add an index to name field we’ll see a huge performance gain and our table can scale larger without increasing the query time, thus making our application a lot more responsive.

Let’s see this in action: I have created a database table using the structure above and populated it with about 7 million random records. (yes, this is a bit data set, but not uncommon). If I run the query above without an index on the name field it looks like this:

2298 rows in set (3.59 sec)

Ugh! It took 3.59 seconds to run that query and return the results. Now you might not think that 3+ seconds is a lot, but trust me, it’s horrible. When you’re dealing with a production application which is serving hundreds of queries to the database a second this lag is going to compound even greater and your application is going to grind to a halt. This is so full of fail it ain’t funny.

Now, lets run the same query on the same 7 million records but with an index added to the name field:

2298 rows in set (0.02 sec)

Now that’s what I am talking about. That’s a huge performance increase: from nearly 4 seconds down to a couple tenths of a second. This, my friend, is a good thing. You want your application to be lean and it’s queries to the database to be lightning fast. Indexes will help your achieve the latter.

So what happened here? How did MySQL’s performance improve just by adding an index to a queried field? Well, it helps to understand a little bit about what an index is. Think of your database as a large book. An index is sort of like the index of a book. Without an index, if you wanted to find out on what pages a particular topic appeared you would have to leaf through every single page scanning for that information. Fortunately, a book index has a listing of where those topic appear so you can save yourself a lot of work by just referring to the index. For a MySQL database, when you add in index, the database goes through and builds a listing of the records for that field for lightning fast access so it doesn’t have to scan the whole database. When records are added or removed from the database the index is also updated.

Now an index does add some bloat to a table. For the table I’ve been using above the non-indexed version weighed in at 350MB and the indexed version was 25MB larger at 375MB. Not a huge increase, but you definitely don’t to go around adding indexes to all your fields. Only add indexes to fields that will benefit from them. Here are some general guidelines:

  • Add an index to any field you are querying regularly (as illustrated above).
  • Add an index to any field you are joining on. The performance benefit here can be enormous as complex joins across multiple tables can be very slow.
  • Add an index to any field you are going to be adding  a GROUP BY or ORDER BY clause on.

Now, keep in mind if your table has a primary key, then you get an index for free on that primary key. No need to add an index to a primary key. However, if you are doing a join and have two tables, for example and addresses with a foreign key of user_id like so:

CREATE TABLE `addresses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id' int(11) NOT NULL,
`name` varchar(255) DEFAULT NULL,
`email` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
)

You will want to add an index to user_id in order to speed up a simple join query like the one below:

SELECT * FROM users, addresses WHERE users.id = addresses.user_id AND users.id=842414;

So that’s the basics of database indexes for MySQL. This is a big topic and I have only touched upon the basics. Keep in mind that every situation is different and there are no hard and fast rules to adding indexes apart from not adding them with reckless abandon. Observe how your application is using the database and what queries are being generated and try to determine which fields are being queried on or joined on and add indexes to those and see if our application’s database performance improves.

Errors While Installing MySQL Gem 1

Posted by Sheldon Finlay on February 04, 2009

I was installing the MySQL ruby gem this evening and ran into some errors. After some hunting around I found the solution:

gem install mysql -- --with-mysql-config=/usr/lib/mysql/mysql_config

Here’s the errors I was getting hit with. Hope this helps others who might encounter a similar problem.

Building native extensions.  This could take a while...
ERROR:  Error installing mysql:
        ERROR: Failed to build gem native extension.

/opt/ruby-enterprise-1.8.6-20090201/bin/ruby extconf.rb install mysql
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lm... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lz... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lsocket... no
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lnsl... yes
checking for mysql_query() in -lmysqlclient... no
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of
necessary libraries and/or headers.  Check the mkmf.log file for more
details.  You may need configuration options.

Provided configuration options:
        --with-opt-dir
        --without-opt-dir
        --with-opt-include
        --without-opt-include=${opt-dir}/include
        --with-opt-lib
        --without-opt-lib=${opt-dir}/lib
        --with-make-prog
        --without-make-prog
        --srcdir=.
        --curdir
        --ruby=/opt/ruby-enterprise-1.8.6-20090201/bin/ruby
        --with-mysql-config
        --without-mysql-config
        --with-mysql-dir
        --without-mysql-dir
        --with-mysql-include
        --without-mysql-include=${mysql-dir}/include
        --with-mysql-lib
        --without-mysql-lib=${mysql-dir}/lib
        --with-mysqlclientlib
        --without-mysqlclientlib
        --with-mlib
        --without-mlib
        --with-mysqlclientlib
        --without-mysqlclientlib
        --with-zlib
        --without-zlib
        --with-mysqlclientlib
        --without-mysqlclientlib
        --with-socketlib
        --without-socketlib
        --with-mysqlclientlib
        --without-mysqlclientlib
        --with-nsllib
        --without-nsllib
        --with-mysqlclientlib
        --without-mysqlclientlib

Gem files will remain installed in /opt/ruby-enterprise-1.8.6-20090201/lib/ruby/gems/1.8/gems/mysql-2.7 for inspection.
Results logged to /opt/ruby-enterprise-1.8.6-20090201/lib/ruby/gems/1.8/gems/mysql-2.7/gem_make.out