Jan 13, 2014

Create Your Postgres Cluster in 6 Steps

Setting up a Postgres cluster (or even a master-slave configuration) was not an easy task before Postgres 9.0.

In the following lines you will find a short guideline that will help you create your Postgres HA solution:

1. Install postgres on both machines
sudo yum -y install postgresql-server
sudo service postgresql initdb
sudo service postgresql start
sudo -u postgres psql

2. Stop the Postgres instance on both machines, delete the data folder on the slave, and recover it by copying the files from the master.
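One way to perform this copy is rsync over SSH; a sketch, run on the slave, assuming the data directory path and master address used in the configuration steps below (adjust both to your setup):

```shell
# On the slave, with Postgres stopped on both machines (assumed paths and IP)
sudo service postgresql stop
sudo mv /var/lib/pgsql9/data /var/lib/pgsql9/data-old
sudo rsync -a postgres@192.168.0.1:/var/lib/pgsql9/data /var/lib/pgsql9/
sudo chown -R postgres:postgres /var/lib/pgsql9/data
```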

3. Do initial configuration on both machines to enable logging:
sudo mkdir /var/log/postgresql/
sudo chown -R postgres:postgres /var/log/postgresql/
sudo vi /var/lib/pgsql9/data/postgresql.conf
> log_directory = '/var/log/postgresql'

4. Configure the Master and Slave to enable replication
sudo vi /var/lib/pgsql9/data/postgresql.conf @ Master (enable WAL shipping and listening)
> listen_addresses = '*'
> wal_level = hot_standby

> max_wal_senders = 3

sudo vi /var/lib/pgsql9/data/pg_hba.conf @ Master (enable access to the server; tighten these rules if your cluster is not in a secure environment):
> host  replication   all   192.168.0.2/32      trust
> host    all             all             0.0.0.0/0               trust

sudo vi /var/lib/pgsql9/data/postgresql.conf @ Slave (turn on standby)
> hot_standby = on

sudo vi /var/lib/pgsql9/data/recovery.conf @ Slave (turn on standby; this is a new file)
> standby_mode = 'on'

> primary_conninfo = 'host=192.168.0.1'

5. Finally, start Postgres on both machines, master first and then slave:
sudo service postgresql start

6. Verify replication by creating a database on the master, and checking that it appears on the slave:
master> sudo -u postgres psql
CREATE DATABASE a;
\l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 a         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

slave> sudo -u postgres psql
\l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 a         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
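Beyond listing databases, the replication state can be checked directly; a sketch, assuming Postgres 9.1+ for the pg_stat_replication view:

```shell
# On the master: one row per connected standby (pg_stat_replication, 9.1+)
sudo -u postgres psql -c "SELECT client_addr, state FROM pg_stat_replication;"
# On the slave: returns 't' while the server is in standby mode
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
```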

P.S. If you are using CentOS and yum (and not AWS), the following link will be useful for you.


Recovering from a Cluster Crash (thanks to Yariv Rachmani for the contribution)
If you find the following errors in your Postgres log:
> 2015-01-01 19:41:14.419 IST >LOG: started streaming WAL from primary at 0/86000000 on timeline 1
> 2015-01-01 19:41:14.420 IST >FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000086 has already been removed

In this case, the quickest procedure to recover the slave is:

  1. Stop the service: > service postgresql-9.3 stop
  2. Back up the data folder: > cp -prf /var/lib/pgsql/9.3/data /var/lib/pgsql/9.3/data-org
  3. Remove the data folder: > rm -rf /var/lib/pgsql/9.3/data
  4. Copy the data folder from the master: > pg_basebackup -h <master-host> -D /var/lib/pgsql/9.3/data -U postgres -P -v -x
  5. Change permissions: > chown -Rf postgres:postgres /var/lib/pgsql/9.3/data
  6. Copy the configuration files from the backup: > cp -p /var/lib/pgsql/9.3/data-org/*.conf /var/lib/pgsql/9.3/data/
  7. Start the slave again: > service postgresql-9.3 start
  8. Verify the slave log file
Bottom Line
Postgres replication is not so complicated starting with Postgres 9.0. Now it's your turn to keep your system up!

Keep Performing,
Moshe Kaplan

Jan 7, 2014

Monitoring Your SQL Server w/ Monitis

One of the nicest monitoring products around is Monitis.com
If you are using MySQL, you will probably find that some of the thresholds there should be modified or tuned (or simply removed, since they are not critical to your day-to-day operations):

Key Cache Writes Rate
Actual value: 0
Proposed range: 75-100
What should be done: If you are not using MyISAM, ignore this one. If you do use it, increase the key_buffer_size variable to accommodate the index in memory.

Indexes Usage
Actual value: 5
Proposed range: 80-100
What should be done: Indexes usage is a key indicator of whether you write your queries in a way that utilizes your indexes, and whether you defined the correct indexes to support your queries.
Enable your slow query log and examine your queries.
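Enabling the slow query log can be done at runtime; a sketch for MySQL 5.1+, with an assumed 1-second threshold and log path (tune both to your SLA):

```sql
-- Assumed threshold and path; tune long_query_time to your SLA
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
```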

Key Cache Hit Rate
Actual value: N/A
Proposed range: 95-100
What should be done: This does not seem to be a critical variable; first check whether it is relevant in your case. If your system's business case should serve most data out of cache and the value is low, increase key_buffer_size (MyISAM) or innodb_buffer_pool_size (InnoDB).
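The key cache hit rate itself can be derived from the server's status counters; a sketch:

```sql
-- MyISAM key cache counters
SHOW GLOBAL STATUS LIKE 'Key_read%';
-- hit rate (%) = 100 * (1 - Key_reads / Key_read_requests)
```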

Table Cache Hit Rate
Actual value: N/A
Proposed range: 85-100
What should be done: Same approach as the item above; for the table cache, the relevant variable is table_open_cache.

Bottom Line
A monitoring system is a critical component of any critical system. However, in most cases you should verify that the default configuration matches your business case (and once it does, take the alerts seriously!).

Keep Performing,
Moshe Kaplan

MySQL and Application Servers Connection Pool

As we discussed in the past, MySQL is very lightweight in terms of connection creation and closure overhead, and therefore classic LAMP installations rarely use connection pooling.
However, some application frameworks such as Java and Ruby tend to use connection pools, and in these cases it is recommended to adjust some configuration in order to avoid unexpected connection cuts.

The Pattern
Since MySQL's default timeout is 8 hours, idle connections are usually cut when connection pools are configured with a high number of connections (and low utilization), or when usage is limited to defined hours (daytime, for example).

How to Detect this Pattern?
By using MySQLTuner, you will find that the aborted connections ratio is too high (I've seen cases as high as 80%).
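The counters behind that ratio can also be inspected directly in the mysql client; a sketch:

```sql
-- Connections that died after being established (e.g., cut by timeouts)
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
-- Failed connection attempts
SHOW GLOBAL STATUS LIKE 'Aborted_connects';
-- The timeout that cuts idle pool connections (default: 28800 = 8 hours)
SHOW GLOBAL VARIABLES LIKE 'wait_timeout';
```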

What Should We Do?
Adjust the connection timeouts in order to avoid cutting connection-pool connections:
#1 day timeout
SET GLOBAL wait_timeout = 86400;
SET GLOBAL interactive_timeout = 86400;

or in the my.cnf
wait_timeout = 86400
interactive_timeout = 86400

Bottom Line
Connection pools are not a native use case for MySQL, but that does not mean you cannot support them wisely.

Keep Performing,
Moshe Kaplan

Jan 1, 2014

Some More MySQL Tuning

Connection Pooling
MySQL is designed for lightweight connection creation, so you may not need connection pooling at all. However, if you are a connection pooling fan (Java and Ruby devs, please raise your hands), don't forget to configure MySQL for it:
thread_handling = one-thread-per-connection

In this case you should also skip the thread_cache_size recommendation and use thread_cache_size = 0 (and not 8, for example).

Double Flushing and SSD
Is your data really being written to disk, or is it stuck in the OS cache?
innodb_flush_method = fdatasync: the default; flushes through the OS cache
innodb_flush_method = O_DIRECT: bypasses the OS cache for data files
innodb_flush_method = O_DSYNC: opens the log files with O_SYNC, making sure the disk gets the call

Large RAM configurations
If you have a lot of memory and many connections (and threads), you will probably find that your threads are waiting on the buffer pool. To avoid this, you can split the InnoDB buffer pool into several smaller instances. Pros: each instance is managed separately, so data can be served from multiple memory sections concurrently (up to N times faster). Cons: memory may be used less efficiently, since each instance manages its own structures.
In any case, make sure innodb_buffer_pool_size / innodb_buffer_pool_instances > 1GB.
Please notice that the number of instances should be selected based on the actual system bottleneck.
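The sizing rule above can be sketched as a quick check; the pool size and instance count here are assumed example values, not a recommendation:

```shell
# A sketch with assumed example values -- substitute your own configuration.
# Rule of thumb from above: each buffer-pool instance should get > 1GB.
POOL_GB=16       # innodb_buffer_pool_size in GB (assumed)
INSTANCES=8      # innodb_buffer_pool_instances (assumed)
PER_INSTANCE=$((POOL_GB / INSTANCES))
echo "Each instance gets ${PER_INSTANCE}GB"
```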

Matching IOPS
SATA and SAS disks used to provide about 100 IOPS each (and multiples of that when using RAID). Therefore, the innodb_io_capacity default was 200.
If you are considering a migration to SSD-based machines (which can provide up to 150K IOPS), you may find it useful to change it to the actual IOPS capacity of the system.
Please notice that if you use innodb_buffer_pool_instances, you should divide this number by the number of instances.

IO Threads
Consider adjusting the number of threads reading from and writing to disk to match the number of CPU cores in your system:
innodb_read_io_threads = N
innodb_write_io_threads = N
http://dba.stackexchange.com/questions/33125/mysql-5-5-determining-correct-writeread-io-threads-on-high-end-system
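Deriving N from the machine itself can be sketched as follows (a sketch; nproc reports the available CPU cores on Linux):

```shell
# A sketch: match the IO thread counts to the number of CPU cores
N=$(nproc)
echo "innodb_read_io_threads = $N"
echo "innodb_write_io_threads = $N"
```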

Bottom Line
With MySQL's endless configuration options, you can always find a great option to better tune your system.

Keep Performing,
