MySQL Mate & Maté: code

The Issue

I haven't blogged in a while and I have a long TODO list of things to publish: the repository for the SNMP Agent, the video and slides of my OSCON talk, and a quick overview of MHA master-master support. In the meantime, here's a little fact from the MySQL CREATE VIEW documentation that I didn't know:

Although it is possible to create a view with a nonexistent DEFINER account, an error occurs when the view is referenced if the SQL SECURITY value is DEFINER but the definer account does not exist.

How can this be possible?

The Problem

For a number of reasons we don't have the same user accounts on the master as we have on the slaves (e.g., developers shouldn't be querying the master). Our configuration files include the following line:

replicate-ignore-table=mysql.user

So if we create a user on the master, the user definition doesn't go through the replication chain.

As a result, a VIEW can be created on the master, but unless we also run all the proper GRANT statements on the slaves, the VIEW won't be usable there. Example from our slave (output formatted for clarity):

show create view view3\G
*************************** 1. row ***************************
                View: view3
         Create View: CREATE ALGORITHM=UNDEFINED 
               DEFINER=`app`@`192.168.0.1` 
               SQL SECURITY DEFINER VIEW `view3` AS select 
[...]

show grants for `app`@`192.168.0.1`;
ERROR 1141 (42000): There is no such grant defined 
for user 'app' on host '192.168.0.1'
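
To find every affected view at once instead of checking them one by one, a query along these lines could help. It is only a sketch: the function name orphaned_definer_sql is made up for illustration, and it assumes the account running it can read mysql.user and that the server exposes the DEFINER and SECURITY_TYPE columns in information_schema.VIEWS.

```shell
#!/bin/bash
# Print a query that lists SQL SECURITY DEFINER views whose definer
# account has no matching row in mysql.user on the server it runs on.
# (orphaned_definer_sql is a hypothetical helper name.)
orphaned_definer_sql() {
  cat <<'SQL'
SELECT v.TABLE_SCHEMA, v.TABLE_NAME, v.DEFINER
  FROM information_schema.VIEWS AS v
  LEFT JOIN mysql.user AS u
    ON v.DEFINER = CONCAT(u.User, '@', u.Host)
 WHERE v.SECURITY_TYPE = 'DEFINER'
   AND u.User IS NULL;
SQL
}

# Usage on a slave:
#   orphaned_definer_sql | mysql --table
```

An empty result would mean every DEFINER view on that server points at an existing account.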

The Solution

Once again, Maatkit comes to the rescue, with mk-show-grants run on the master:

mk-show-grants | grep 192.168.0.1
-- Grants for 'app'@'192.168.0.1'
GRANT USAGE ON *.* TO 'app'@'192.168.0.1' 
IDENTIFIED BY PASSWORD '*password_hash';
GRANT DELETE, EXECUTE, INDEX, INSERT, SELECT, 
SHOW VIEW, UPDATE ON `pay`.* TO 'app'@'192.168.0.1';

A simple copy from the master and paste onto the slave fixed it.
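
The copy and paste could also be scripted. The sketch below assumes mk-show-grants prints one statement per line (the output above was wrapped for display); grants_for and the host names in the usage comment are hypothetical. Unlike a plain grep on the IP, the fixed-string match includes the closing quote, so 192.168.0.1 does not also match 192.168.0.10.

```shell
#!/bin/bash
# Keep only the GRANT statements that target one exact account.
# Reads mk-show-grants output on stdin, one statement per line.
# (grants_for is a hypothetical helper name.)
grants_for() {
  local user="$1" host="$2"
  grep '^GRANT ' | grep -F -- "TO '${user}'@'${host}'"
}

# Usage (host names are placeholders):
#   mk-show-grants -h master.example | grants_for app 192.168.0.1 \
#     | mysql -h slave.example
```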

Conclusion

Every now and then developers come to me with unusual questions. In this case it was: "How come I can access only 2 out of 3 views?" In cases like these, it usually pays off not to overthink the issue and to look into the details instead. A SHOW CREATE VIEW on the 3 views quickly showed that one had a different host for the DEFINER. A quick read through the documentation and an easy test confirmed the mistake. That's why I have 3 mantras that I keep repeating to whoever wants to listen:

Keep it simple
Pay attention to details
RTFM (F is for fine)

They constantly keep me from grabbing some shears and going into yak shaving mode.

Why This Post

While testing Yoshinori Matsunobu's MHA agent I found that although the wiki has very complete documentation, it was missing some details. This article intends to close that gap and bring up some issues to keep in mind when you do your own installation. At the end of the article I added a Conclusion section. If you're not interested in the implementation details and only want to read my take on the project, feel free to jump straight to the end from here.

My Test Case

Most of our production environments can be simplified to match the MHA agent's simplest use case: one master with two or more slaves and at least one more slave in an additional tier:

Master A --> Slave B
         +-> Slave C --> Slave D

As noted in the documentation, in this case the MHA agent will be monitoring A, B & C only. I found that unless you have a dedicated manager node, a slave on the 3rd tier (Slave D above) is suitable for this role. All 4 servers were set up as VMs for my evaluation and tests, which makes it easier to simulate hard failure scenarios in a controlled environment. Once this is in place the fun begins.

1st Step: User Accounts

All the examples in the documentation use root to log in to MySQL and the OS. I prefer to create specific users for each application, so I created a specific MySQL user for the MHA agent and used the Linux mysql user (UID/GID = 27/27 in RedHat / CentOS).

MySQL Credentials

Reviewing the code, I determined that the agent needs to run some privileged commands such as SET GLOBAL variable, CHANGE MASTER TO ..., FLUSH LOGS ..., SHOW SLAVE STATUS, etc., and that it creates internal working tables used during the master failover. The easiest way to set it up was using:

GRANT ALL PRIVILEGES ON *.* TO 'mha_user'@'ip address'
IDENTIFIED BY 'password';

This should be repeated on all 4 servers using the IP addresses of all the potential manager nodes. Yes, it would be possible to use wildcards, but I consider restricting access to specific nodes a safer practice.

The MySQL replication user needs to be set up to connect from any other server in the cluster, since any of the slaves in the group could be promoted to be master, and have the rest of them connecting to it.
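
Generating those statements for every member of the cluster could look like the sketch below. The function name repl_grants, the user name, the password and the IP addresses are all placeholders, and the GRANT ... IDENTIFIED BY form assumes a MySQL version of the same era as this post.

```shell
#!/bin/bash
# Emit one GRANT per cluster member, so that whichever slave gets
# promoted accepts replication connections from all the others.
# (repl_grants and its arguments are hypothetical.)
repl_grants() {
  local user="$1" password="$2"
  shift 2
  local ip
  for ip in "$@"; do
    printf "GRANT REPLICATION SLAVE ON *.* TO '%s'@'%s' IDENTIFIED BY '%s';\n" \
      "$user" "$ip" "$password"
  done
}

# Usage: review the output, then feed it to mysql on every server:
#   repl_grants repl 'change_me' 10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4
```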

Linux User

As I mentioned before, I use the default RedHat / CentOS definition for the mysql user. Keep in mind that if you installed from the official Oracle packages (e.g., RPMs), they may not follow this convention, which could result in mismatched UID/GIDs between servers. The UIDs/GIDs for the mysql user and group have to be identical on all 4 servers. If this is not the case, you may use the following bash sequence/script as root to correct the situation:

#!/bin/bash 
# stop mysql
/etc/init.d/mysql stop
 
# Change ownership for all files / directories
find / -user mysql -exec chown -v 27 {} \;
find / -group mysql -exec chgrp -v 27 {} \;
 
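# Remove the old user / group so they can be recreated with the new IDs.
# userdel goes first: groupdel refuses to delete a user's primary group.
# groupdel might complain if userdel already removed the group.
userdel mysql
groupdel mysql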

# Add the new user / group
groupadd -g 27 mysql
useradd -c "MySQL User" -g 27 -u 27 -r -d /var/lib/mysql mysql
 
# restart MySQL
/etc/init.d/mysql start
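
To see whether a fix like the one above is needed at all, the IDs can be compared across servers first. This is a minimal sketch: check_mysql_ids is a hypothetical helper, and the host names in the usage comment are placeholders.

```shell
#!/bin/bash
# Read "host uid gid" lines on stdin and print every host whose mysql
# UID/GID differs from the expected pair. Returns 1 if any host differs.
# (check_mysql_ids is a hypothetical helper name.)
check_mysql_ids() {
  local want_uid="$1" want_gid="$2" host uid gid bad=0
  while read -r host uid gid; do
    if [ "$uid" != "$want_uid" ] || [ "$gid" != "$want_gid" ]; then
      echo "MISMATCH $host uid=$uid gid=$gid"
      bad=1
    fi
  done
  return "$bad"
}

# Collecting the input over SSH:
#   for h in hostA hostB hostC hostD; do
#     echo "$h $(ssh "$h" 'id -u mysql; id -g mysql' | xargs)"
#   done | check_mysql_ids 27 27
```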

Once the mysql user is properly set up, you'll have to create a passwordless SSH key pair and authorize it on all the servers. The easiest way to do it is to create the key pair on one of them, copy the public key to the authorized_keys file under the /var/lib/mysql/.ssh directory and then copy the whole directory to the other servers.
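
A sketch of the key creation step, under the assumption that OpenSSH's ssh-keygen is installed. The function name make_shared_key is made up, and how the directory gets copied to the other servers (and chowned to mysql there) is left to whatever tool you normally use.

```shell
#!/bin/bash
# Create a key pair without a passphrase in the given .ssh directory and
# authorize it for logins as the same user. Running it again appends the
# same public key a second time, so it is meant to be run once.
# (make_shared_key is a hypothetical helper name.)
make_shared_key() {
  local ssh_dir="$1"
  mkdir -p "$ssh_dir" || return 1
  chmod 700 "$ssh_dir" || return 1
  if [ ! -f "$ssh_dir/id_rsa" ]; then
    ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
  fi
  cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys" || return 1
  chmod 600 "$ssh_dir/authorized_keys"
}

# Usage, as the mysql user on the first server:
#   make_shared_key /var/lib/mysql/.ssh
```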

I use the mysql user to run the scripts since on most distributions it can't be used to log in directly and there is no need to worry about file permissions, which makes it a safe and convenient user.

2nd Step: Follow The Documentation to Install and Configure

Once all the users have been properly set up, this step is straightforward. Check the Installation and Configuration sections of the wiki for more details.

For the placement of the configuration files I deviated a little bit from documentation, but not much:

I used a defaults file, /etc/masterha_default.cnf, readable only by the mysql user since it includes the MHA agent password:

-rw------- 1 mysql mysql 145 Aug 11 16:36 masterha_default.cnf

The application settings were placed under /etc/masterha.d/. This way they're easy to locate and won't clutter the /etc directory.
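
For orientation, a layout like this could be produced with the sketch below. It is not the configuration used in these tests: every value (host names, passwords, paths) is a placeholder, write_mha_conf is a hypothetical helper, and only a handful of the parameters described in the MHA wiki are shown. Slave D is left out because the agent only monitors A, B and C.

```shell
#!/bin/bash
# Write a global defaults file and one application file under the given
# directory. All values are placeholders to be replaced with your own.
# (write_mha_conf is a hypothetical helper name.)
write_mha_conf() {
  local etc_dir="$1"
  mkdir -p "$etc_dir/masterha.d" || return 1
  (
    umask 077   # the defaults file holds passwords: owner access only
    cat > "$etc_dir/masterha_default.cnf" <<'EOF'
[server default]
user=mha_user
password=change_me
ssh_user=mysql
repl_user=repl
repl_password=change_me_too
EOF
  ) || return 1
  cat > "$etc_dir/masterha.d/test.cnf" <<'EOF'
[server default]
manager_workdir=/var/lib/mysql/masterha/test
manager_log=/var/lib/mysql/masterha/test/manager.log

[server1]
hostname=hostA

[server2]
hostname=hostB
candidate_master=1

[server3]
hostname=hostC
EOF
}

# Usage, as root, followed by a chown to mysql:
#   write_mha_conf /etc
```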

For simplicity, I didn't include any of the optional scripts and checks (e.g., the secondary check) in the configuration. You may want to check the documentation and source code of these scripts. Some of them are not even code complete (e.g., master_ip_failover). Unless you are implementing some of the more complicated use cases, you won't even need them. If you do, you'll need to write your own following the examples provided with the source code.

Once you have everything in place, run the following checks as the mysql user (e.g., sudo su - mysql):

masterha_check_ssh: Using my configuration files, the command line looks like:

masterha_check_ssh --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf

masterha_check_repl: This test determines whether the agent can identify all the servers in the group and the replication topology. The command line parameters are identical to the previous step.

Both should show an OK status at the end. All utilities have verbose output, so if something goes wrong it's easy to identify the issue and correct it.
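
Since both checks take the same arguments, they can be chained in a small wrapper. This sketch assumes the two tools exit with a non-zero status when a check fails; mha_preflight is a hypothetical name.

```shell
#!/bin/bash
# Run both pre-flight checks with the same --conf arguments and stop at
# the first one that fails. (mha_preflight is a hypothetical helper.)
mha_preflight() {
  masterha_check_ssh "$@" || { echo "SSH check failed" >&2; return 1; }
  masterha_check_repl "$@" || { echo "replication check failed" >&2; return 1; }
  echo "both checks passed"
}

# Usage, as the mysql user:
#   mha_preflight --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf
```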

3rd Step: Run the Manager Script

If everything is OK, on the MHA node (Server D in my tests) run the following command as user mysql (e.g., sudo su - mysql):

masterha_manager --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf

Keep in mind that should the master fail, the agent will fail over to one of the slaves and then stop running, which avoids split-brain situations. You will either have to build the intelligence into the application to connect to the right master after a failover, or use a virtual IP. In both cases you might need customized IP failover scripts. The documentation provides more details.

Read the section about running the script in the background to choose the method that best fits your practice.
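
One of the simpler options is a plain nohup launch. The wrapper below is a sketch only: start_manager and the MHA_MANAGER override variable are hypothetical, and the log path in the usage comment is a placeholder.

```shell
#!/bin/bash
# Start the manager detached from the terminal, append its output to a
# log file and print the PID of the background process.
# (start_manager and MHA_MANAGER are hypothetical names; MHA_MANAGER only
# exists so another binary can be substituted for a dry run.)
start_manager() {
  local log="$1"
  shift
  nohup "${MHA_MANAGER:-masterha_manager}" "$@" < /dev/null >> "$log" 2>&1 &
  echo "$!"
}

# Usage, as the mysql user:
#   start_manager /var/lib/mysql/masterha/test/nohup.log \
#     --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf
# masterha_check_status and masterha_stop, given the same --conf
# arguments, report on and stop a running manager.
```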

You will have to configure the notification script to get notified of the master failure. The failed server will have to be removed from the configuration file before re-launching the manager script, otherwise it will fail to start.
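
Removing the failed server can be done by hand in an editor. For those who prefer to script it, the sketch below drops the whole [serverN] block whose hostname matches. remove_server_block is a hypothetical helper, not part of MHA, and it assumes one hostname= line per block as in the wiki's examples.

```shell
#!/bin/bash
# Copy an MHA config from stdin to stdout, leaving out the section whose
# hostname= value equals the given host.
# (remove_server_block is a hypothetical helper name.)
remove_server_block() {
  awk -v host="$1" '
    # Print the buffered section unless it was marked for removal.
    function emit() { if (!drop) printf "%s", block; block = ""; drop = 0 }
    /^\[/ { emit() }                 # a new section header closes the previous one
    { block = block $0 "\n" }        # buffer every line of the current section
    /^[ \t]*hostname[ \t]*=/ {
      value = $0
      sub(/^[^=]*=[ \t]*/, "", value)
      sub(/[ \t]+$/, "", value)
      if (value == host) drop = 1
    }
    END { emit() }
  '
}

# Usage (review the new file before replacing the old one):
#   remove_server_block hostA < /etc/masterha.d/test.cnf > /tmp/test.cnf.new
```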

You can restart the failed server, set it up as a slave connected to the new master, and reincorporate it into the replication group using masterha_conf_host.

Conclusion

This tool solves a very specific (and painful) problem: making sure all the slaves are in sync, promoting one of them, and reconfiguring the remaining slaves to replicate off the new master. And it does it fairly quickly. The tool is simple and reliable and requires very little overhead. It's easy to see that it is production ready.

The log files are pretty verbose, which makes it really easy to follow in great detail all the actions the agent took when failing over to a slave.

I recommend that potential users start with a simple configuration and add the additional elements gradually until it fits their infrastructure needs.

Although the documentation is complete and detailed, it takes some time to navigate and to put all the pieces of the puzzle together.

I would like the agent to support master-master configurations. This way it would minimize the work to re-incorporate the failed server into the pool. Yoshinori, if you're reading this, know that I'll volunteer to test master-master if you decide to implement it.

MySQL Mate & Maté

Wednesday, October 12, 2011

TIL: Lookout For DEFINER

Wednesday, August 17, 2011

MySQL HA Agent Mini HowTo

Tuesday, December 8, 2009

sar-sql New Alpha Release
