Oct 30, 2011

Adding One More MySQL Slave to Cloud Based System (Amazon AWS)

MySQL Replication tends to be complex.

Creating a new slave to support your system expansion is no easier: copying large data files over the LAN, recovering replication, etc.

Fortunately, virtualization and SAN storage make it easier and faster.

Luckily, cloud computing makes it even easier.

Adding a New Slave is Easier than Ever
  1. Select an existing slave that can be rebooted without affecting the service.
    1. Take the slave out of the production environment.
    2. Stop replication and acquire a read lock: STOP SLAVE; FLUSH TABLES WITH READ LOCK;
    3. Create an AMI based on this instance using AWS console or API.
    4. Once the AMI creation completes, release the lock and get the slave back into production: UNLOCK TABLES; START SLAVE; SHOW SLAVE STATUS\G
  2. Create a new slave based on the AMI
    1. Launch a new slave using the AWS console or API.
    2. Change the server's /etc/my.cnf file to fit a new slave instance:
      1. Change the server-id=N value to a new numeric value.
      2. Set relay-log to the original slave server's relay log name, e.g. relay-log=server-relay-bin. (To make it even easier, configure relay-log=mysqld-relay-bin in the first slave's my.cnf, and you will be able to skip this step.)
    3. Restart the MySQL instance: sudo /etc/init.d/mysql restart
    4. Resume replication: UNLOCK TABLES; START SLAVE; SHOW SLAVE STATUS\G
    5. Add this new slave to the production environment.
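Step 2.2 above can be scripted. The following is a minimal sketch that rewrites the server-id and relay-log directives in a my.cnf-style text; the function name and the sample config are illustrative, and real my.cnf files may need a proper parser:

```python
import re

def adjust_slave_config(config_text, new_server_id, relay_log="mysqld-relay-bin"):
    """Rewrite server-id and relay-log in a my.cnf-style config string.
    A sketch: assumes both directives already appear in the file."""
    config_text = re.sub(r"(?m)^server-id\s*=.*$",
                         "server-id=%d" % new_server_id, config_text)
    config_text = re.sub(r"(?m)^relay-log\s*=.*$",
                         "relay-log=%s" % relay_log, config_text)
    return config_text

# Hypothetical config copied from the original slave's AMI:
original = """[mysqld]
server-id=1
relay-log=server-relay-bin
"""
print(adjust_slave_config(original, 2))
```

After rewriting the file, restart MySQL as described in step 2.3 so the new values take effect.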
Bottom Line
Expanding your system today is easier, with no need for file copying or any other tedious action.

Keep Performing,

Oct 3, 2011

MySQL CentOS and Amazon Web Services (AWS)

Many Linux experts prefer CentOS. The reason is clear: Red Hat's RHEL is probably the best Linux server distribution, and CentOS is very much alike (except for the yearly fees).
This distribution (and Fedora as well) is very common for MySQL installations, since there are many predefined packages for it, unlike other distributions such as Ubuntu.

Therefore, when you migrate your database to the cloud, I recommend that you first verify your needs. The following items will help you implement your systems right:

  1. Select the Linux flavor
    1. CentOS: Detailed options are available below. Don't forget to verify the recent CentOS versions: usually there are two (a young major version such as 6.0 and a mature minor version such as 5.7), and you should select the right one for you.
    2. Fedora: Some free and good AMIs are available.
    3. RHEL: A premium AMI is available for it.
    4. Ubuntu: No official MySQL distribution exists for this flavor.
    5. Debian.
  2. Select the Storage Type. Amazon AMIs can store their information on one of two storage types:
    1. Instance storage: server-attached disks that are great for raw performance (no network traffic or latency for storage). Yet, it lacks the option to extend the disk seamlessly to a larger size, and backup must be done over the network instead of via instant disk duplication (which would mean shorter downtime).
    2. EBS: Storage Area Network based disks that can easily be extended and backed up with minimal server downtime. Yet, this storage costs both money (~$0.15/GB per month) and network latency.
  3. Select the right image for you:
    1. RightScale image: last updated Jan 2009. If you don't use RightScale, don't forget to remove their management tools, and don't forget to update your packages to a recent version. RightScale offers both storage types.
    2. Amazon Linux AMI: The AWS image is based on the CentOS image, yet it includes some tweaks. The plus is having an out-of-the-box solution with Amazon support. The minus is that it's no longer 100% RHEL compatible.
  4. Install your database (MySQL probably) and web server (Apache probably) on the machine.
    1. The recommended MySQL version is MySQL 5.5, which includes major performance enhancements over MySQL 5.1.
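The EBS price noted in the storage section above (~$0.15/GB per month) makes capacity budgeting simple arithmetic. A tiny sketch (the helper name and the 200 GB figure are illustrative; check current AWS pricing before relying on the rate):

```python
def monthly_ebs_cost(size_gb, price_per_gb=0.15):
    """Estimate monthly EBS storage cost in USD at the ~$0.15/GB-month
    rate mentioned above (the rate is an assumption; verify with AWS)."""
    return size_gb * price_per_gb

# A hypothetical 200 GB database volume:
print(monthly_ebs_cost(200))  # 30.0
```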
Bottom Line
Now, you just need to choose the configuration that best fits your needs.

Keep Performing,
Moshe Kaplan

Moshe Kaplan is an entrepreneur and a cloud systems consultant

Sep 29, 2011

MySQL Unit Testing and Re-factoring Verification

If your software is highly dependent on a database, verifying the re-factoring of data intensive processes can be a pain in the a$$. Yet, you probably would like to avoid a production change that results in a total mess in your data.

Verification and Automation is Easier w/MySQL
MySQL helps us verify the change using MD5 hashing. Please notice that the following works only on the Linux platform.

MD5 Hash Based Verification Step By Step

  1. Backup the baseline table before running any process: CREATE TABLE class_baseline SELECT * FROM class;
  2. Run the original process.
  3. Copy the results of the old process to a new table: CREATE TABLE class_original SELECT * FROM class;
  4. If needed, TRUNCATE the table or restore the baseline table results: TRUNCATE class; INSERT INTO class SELECT * FROM class_baseline;
  5. Run the modified process.
  6. Change the pager to md5sum. The result is a single short MD5 hash instead of the regular screen output: pager md5sum -
  7. Perform a select statement on the original table. Don't forget to drop from the SELECT statement any auto enumerator or time stamp fields that are being changed in any table modification/insert. If your process re-factoring changes the insert order, don't forget to perform an ORDER BY to avoid the case of same content/different order: SELECT class_name, class_location from class_original ORDER BY class_name, class_location;
  8. Perform the same on the modified process result: SELECT class_name, class_location FROM class ORDER BY class_name, class_location;
  9. Compare the two results. If your re-factoring did not change the data itself, you should receive identical hash results.
  10. Recover the regular screen output mode using the nopager command.
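The same hash-comparison idea can be illustrated locally. The sketch below mimics what "pager md5sum -" does to a result set: it hashes the sorted, joined rows so two runs can be compared by a single digest. The table rows here are hypothetical:

```python
import hashlib

def hash_rows(rows):
    """Hash sorted, tab-joined rows into one MD5 digest, so two result
    sets compare equal regardless of row order (a local sketch of the
    'pager md5sum -' + ORDER BY technique described above)."""
    lines = sorted("\t".join(str(col) for col in row) for row in rows)
    return hashlib.md5("\n".join(lines).encode()).hexdigest()

# Hypothetical result sets: same content, different row order.
original_run = [("Math", "Room 1"), ("Physics", "Room 2")]
modified_run = [("Physics", "Room 2"), ("Math", "Room 1")]
print(hash_rows(original_run) == hash_rows(modified_run))  # True
```

This is why the ORDER BY in steps 7-8 matters: without a deterministic order, identical content could still produce different hashes.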
Bottom Line
Re-factoring verification is much simpler with MD5 hashing.

Keep Performing,

Aug 31, 2011

Sharding is Now COTS

I wrote here a lot in the past regarding SQL Server and MySQL sharding. I wanted to update you that ScaleBase, which is led by industry veterans Liran Zelkha and Doron Levari, announced that ScaleBase 1.0 is now available.
ScaleBase provides a fine solution for MySQL sharding by implementing an SQL load balancer in front of your servers.
This product is available as a service and as a downloadable product. ScaleBase 1.0 supports:
  1. Read/Write splitting
  2. Transparent Sharding
  3. High Availability
If you need a proper solution for a large MySQL database, yet you don't have the time or inclination to implement sharding by yourselves, you should take a look at their product.


Keep Performing,
Moshe Kaplan

Jul 10, 2011

MySQL Statement Profiling Is Easier Now!

If you have always wanted to know why your SQL statement runs so slowly and you are a MySQL DBA, you probably used the EXPLAIN command to understand the estimated execution plan.


Yet, how can you know what part of the execution will take the most time when the SQL statement is processed? Will it be the I/O, the sorting, the data sending or the freeing of items?


One Step Ahead
The answer is simpler than you think: the MySQL Profiler, a command line facility built into MySQL since version 5.0, lets us know exactly why our statement takes so long.


How to Do That?
  1. Decide how many queries you want to profile: the default number is 15 and can be changed to anywhere from 1 to 100 using the profiling_history_size session variable.
  2. Enable the MySQL Profiler.
  3. Perform the query.
  4. See the exact running time.
  5. Analyze the results.
  6. Disable the MySQL Profiler.
Give Us Some Code!
mysql> SET PROFILING=1;
mysql> SELECT * FROM buffer_log WHERE buffer_num=1 ORDER BY end_date DESC limit 100;
mysql> select sum(duration) from information_schema.profiling where query_id=1;
mysql> show profile for QUERY 1;

+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000022 |
| checking query cache for query | 0.000064 |
| Opening tables                 | 0.000011 |
| System lock                    | 0.000003 |
| Table lock                     | 0.000029 |
| init                           | 0.000027 |
| optimizing                     | 0.000008 |
| statistics                     | 0.020821 |
| preparing                      | 0.000023 |
| executing                      | 0.000004 |
| Sorting result                 | 0.291548 |
| Sending data                   | 0.000317 |
| end                            | 0.000004 |
| end                            | 0.000003 |
| query end                      | 0.000003 |
| storing result in query cache  | 0.000007 |
| freeing items                  | 0.000011 |
| closing tables                 | 0.000003 |
| logging slow query             | 0.000001 |
| cleaning up                    | 0.000004 |
+--------------------------------+----------+
mysql> SET PROFILING=0;

In this case you can easily see that sorting is our major issue. If you run an EXPLAIN command you can easily see that there is no index on the end_date field.
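The "find the dominant stage" analysis can itself be automated. A minimal sketch, fed with a few of the durations from the profile output above (only the notable rows are included here):

```python
# A few (stage, seconds) rows taken from the SHOW PROFILE output above.
profile = [
    ("starting", 0.000022),
    ("statistics", 0.020821),
    ("Sorting result", 0.291548),
    ("Sending data", 0.000317),
]

# Total time across these stages, and the single worst offender.
total = sum(duration for _, duration in profile)
worst_stage, worst_time = max(profile, key=lambda row: row[1])
print(worst_stage, round(worst_time / total * 100, 1))  # Sorting result 93.2
```

With the full row set the percentage shifts slightly, but the conclusion is the same: sorting dominates the query time.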


More Tips by Ben and JRRL3
  1. It does not require admin permissions.
  2. It should be run from command line (and not from external tools like phpMyAdmin).
Bottom Line
Root cause analysis is much simpler now.


Keep Performing,
Moshe Kaplan

Jun 16, 2011

What should I do when MySQL storage files are getting too large?

As MySQL is the de-facto database of the internet and the cloud, we dedicate another post to what you should do before getting to the money time.

This post covers some methods, including stored data compression and splitting the InnoDB storage file.

Compressing the File
InnoDB data compression is possible in a way that is very similar to disk compression, with its pros and cons. This method is defined in the table declaration statement and explained in the MySQL documentation. Will it improve MySQL performance? Fortunately, the answer is yes! A recent post by the MySQL@Facebook team showed a 10% boost in response time and an improvement in the cache hit ratio.

Splitting the Storage File
The default InnoDB configuration uses a single file named ibdata1. This file stores both data and configuration. The bad news is that this file always expands and cannot be shrunk.
Usually people notice this fact when the database is getting too large and they have storage or backup issues. Usually in these cases, the system is already in production and cannot suffer downtime. Therefore, you should take care of splitting this file in the early days of the system rather than when you are in the money time.

What to do?
  1. Configure MySQL to save the data in separate files, one per table. Please notice that this will work only for tables created from now on. Perform that by adding the following lines to the my.cnf file:
    1. [mysqld]
    2. innodb_file_per_table
  2. Stop the server for maintenance and perform one of the following three methods:
    1. Convert all InnoDB tables to MyISAM and back
    2. Export only InnoDB tables, drop them, delete ibdata1 and import InnoDB tables.
    3. Export all databases, delete ibdata1 and import everything back.
  3. I recommend you choose option 2 and perform it according to the following export-delete-import procedure:
    1. mysqldump to the whole database.
    2. Stop the MySQL daemon.
    3. Delete the ibdata and ib_logfile files.
    4. Delete the database folders (all except the mysql folder).
    5. Change my.cnf file if needed.
    6. Restart the MySQL daemon.
    7. Import the dumped file back to the system.
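Before restarting in step 6, it is worth verifying that the my.cnf edit from step 1 actually landed in the [mysqld] section. A simple sketch of such a sanity check (not a full my.cnf parser; the helper name is illustrative):

```python
def has_file_per_table(my_cnf_text):
    """Return True if innodb_file_per_table appears in the [mysqld]
    section of a my.cnf-style config (a simple check, not a full parser)."""
    in_mysqld = False
    for line in my_cnf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_mysqld = (stripped == "[mysqld]")
        elif in_mysqld and stripped.split("=")[0].strip() == "innodb_file_per_table":
            return True
    return False

sample = """[mysqld]
innodb_file_per_table
"""
print(has_file_per_table(sample))  # True
```

A directive placed under the wrong section (e.g. [client]) is silently ignored by mysqld, which is exactly the mistake this check catches.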
Bottom Line
Sometimes it is wise to perform tasks earlier in order to avoid complex issues in the future.

Keep Performing,

May 9, 2011

Monitoring and Taking Care of MySQL Replication

If you are getting your slave into production soon, you should make sure that you know how to take care of it when something goes wrong.

Whether the slave is stuck can be monitored using open source tools like Nagios and its MySQL plugins. In general, it is wise to have such a tool monitor all your system and business aspects, including the web server and login rates.

How do I know that my slave is stuck or not?
  1. Connect to MySQL on the slave using a permitted user.
  2. Run the show slave status\G command.
  3. The result should include the following lines with a Yes value. If any of them is No, there is a problem:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
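The check above can be sketched as a small script, which is essentially what a Nagios-style plugin does. The parsing below is a simplified sketch assuming the row-per-line \G output format:

```python
def slave_is_healthy(status_output):
    """Parse SHOW SLAVE STATUS\\G output and return True only if both
    replication threads report Yes (a monitoring-check sketch)."""
    values = {}
    for line in status_output.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            values[key.strip()] = value.strip()
    return (values.get("Slave_IO_Running") == "Yes"
            and values.get("Slave_SQL_Running") == "Yes")

# Hypothetical output of a stuck slave: the SQL thread has stopped.
sample = """Slave_IO_Running: Yes
Slave_SQL_Running: No"""
print(slave_is_healthy(sample))  # False
```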

It is stuck, What should I do?
  1. Connect to the file system with a permitted user.
  2. Check the replication error log and take care of the problem according to it (tail -n 50 /var/lib/mysql/responder3.spd.co.il.err).
Bottom Line
Get into production!


Keep Performing,
