The article discusses how to setup MySQL Replication between two Amazon EC2 instances.
It walks you though setting up replication for an empty database server. Adding a slave to a server already full of data is a different article.
It is assumed that you already know the basics of starting EC2 instances, connecting to them via SSH and editing files in Linux using vi/vim etc.
For this tutorial, I am using the Amazon built machine image ami-2b5fba42 which is Fedora 8 base image.
Overview
In this tutorial, we will:
- Launch two EC2 instances, a Master and a Slave
- Install MySQL Server and tools onto each machine (must be the same version of MySQL on both)
- Configure MySQL on each so that each has a unique server ID and keeps its data in an EC2 friendly place
- Create a user on the master for replication and configure the slave to use it.
- Synchronize the master/slave replication logs
- Test it all works
Configuring the MySQL Master (shown in blue)
The steps for configuring the master are as follows:
- Launch an EC2 instance using your favourite method (I like the ElasticFox Firefox extension)
- Install MySQL Server and Client Tools (via Yum)
- Edit /etc/my.cnf to alter the data folder and edit replication settings
- Configure MySQL to run at boot (in case we restart the instance)
- Start MySQL and configure to run at boot
- Create a MySQL user for replication
First, launch an instance of the Fedora machine, and connect using SSH.
This gives us a base Fedora instance.
Next, install MySQL and configure it to start when the machine boots (in case we decide to restart it)
Install MySQL Server and tools:
yum install -y mysql mysql-server
Rename the old config file (if you're interested in keeping it) and edit the /etc/my.cnf:
mv /etc/my.cnf /etc/my.cnf.old
vi /etc/my.cnf
To look something like this:
[mysqld]
# replication settings for MASTER
server-id = 1
# data folder
datadir = /mnt/mysql
# switch on binary logging (required for replication)
log-bin = mysql-bin
# system stuff
user = mysql
socket = /var/lib/mysql/mysql.sock
[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /var/run/mysqld/mysqld.pid
Now we can configure MySQL to start at boot, start it now and connect.
chkconfig --level 2345 mysqld on
service mysqld start
mysql
Now we can create a replication user account:
GRANT REPLICATION SLAVE ON *.* TO 'ReplicationUser' IDENTIFIED BY 'ReplicationPassword';
FLUSH PRIVILEGES;
We're done. Next, the slave...
Configuring the MySQL Slave (shown in green)
This is an almost identical sequence of steps, with a couple of minor differences.
- Start another Amazon instance in the same availability zone (if you want speed and no bandwidth cost) or in a separate zone if you are more concerned about availability.
- Install MySQL Server and Client Tools (via Yum)
- Edit /etc/my.cnf to alter the data folder and edit replication settings
- Configure MySQL to run at boot (in case we restart the instance)
- Start MySQL and configure to run at boot
- Tell the MySQL Slave who/where the Master is (using internal DNS name)
- Start replication on the slave
Once the slave instance is running, connect using SSH and install MySQL and tools (just like the master):
yum install -y mysql mysql-server
Rename the old config file (if you're so inclined) and edit the /etc/my.cnf:
mv /etc/my.cnf /etc/my.cnf.old
vi /etc/my.cnf
To look like this:
[mysqld]
# replication settings for SLAVE
server-id = 2
# data folder
datadir = /mnt/mysql
# switch on binary logging (not essential but handy if you want to replicate from this slave in the future)
log-bin = mysql-bin
# system stuff
user = mysql
socket = /var/lib/mysql/mysql.sock
[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /var/run/mysqld/mysqld.pid
Now we can configure MySQL to start at boot, start the service going and connect:
chkconfig --level 2345 mysqld on
service mysqld start
mysql
On the Master, we need to determine the binary log's starting position. This is the byte offset that our Slave should start reading from. On the master, type the following:
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 | 319 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
Next, we want to connect the SLAVE to the MASTER. For this, we need the replication user details we created earlier and the internal (or private) DNS name for the MASTER, this should be available from ec2-dim or the ElasticFox instances list. We also need the file and position we get from the master in the step above. This is done with the MySQL CHANGE MASTER command:
CHANGE MASTER TO
MASTER_HOST='domU-11-22-33-44-55-66.compute-1.internal',
MASTER_USER='ReplicationUser',
MASTER_PASSWORD='ReplicationPassword',
MASTER_LOG_FILE='mysql-bin.000003',
MASTER_LOG_POS=319;
We can now start the slave's replication process:
START SLAVE;
Testing To See If Replication Is Working
Now replication is working, we can issue commands against the Master, and check to see if the Slave replicates the result.
To do this, use two terminals side by side, one connected to the Master and one to the Slave.
mysql> CREATE DATABASE HelloWorld;
Query OK, 1 row affected (0.00 sec)
mysql> use HelloWorld
Database changed
mysql> CREATE TABLE Message (Content VARCHAR(100) NOT NULL PRIMARY KEY);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO Message (Content) VALUES ('Howdy slave');
Query OK, 1 row affected (0.00 sec)
On the Slave, we can now verify if the changes made to Master have been replicated:
mysql> SELECT * FROM HelloWorld.Message;
+-------------+
| Content |
+-------------+
| Howdy slave |
+-------------+
1 row in set (0.00 sec)
Congratulations! You have a working replication.
Summary
There are a number of things you can use your slave for:
- Taking backups without locking the Master: If you cannot afford the downtime on your Master to take a backup (backups must lock tables/databases for consistency) you can use STOP SLAVE to pause replication while you take a backup from the Slave. Resuming replication will pickup where it left off. This prevents the master from slowing down or having to lock tables while a consistent backup is taken.
- As part of a scale out strategy: You can modify your application to read data from one of a number of available slaves. This is suitable for read intensive applications with slow-changing data.
- As a replication master: You can daisy-chain slaves off other slaves in either a long bus configuration or in a tree style hierarchy. Just because you can seems to be a good enough reason to me.
- Run tests against like live data: If you break the replication link, you have a complete copy of your live databases to test your latest version against.
- As a stunt-double Should disaster strike, and something bad happen to your master, you can change you application to use the Slave instead. If you enabled binary logging on the slave, then you are already in a position to attach new slaves to that and promote it to the new master. How easily your application can switch over to the new master is a design issue you must consider when encountering problems connecting to the master.
Handy Links
bookmark this on del.icio.us
© 2008 Copyright Tim Hastings