Sunday, January 24, 2016

high availability - Get IP of node running a specific resource when demoting master nodes to slaves

I am setting up an HA cluster for a web application with 2 nodes (2 physical servers):




  • node1 (current master node)

  • node2 (current slave node)




Using Corosync & Pacemaker, I was able to create the cluster and some resource agents, including an IP failover and a web server (Apache).



Resources




  • Failover resource exists on only one node at a time





Uses a Python script to make API calls to my hosting provider to update the IP failover destination





  • WebServer resource exists (as a clone) on every available node




Standard OCF resource using Apache's server-status handler





Constraints




  • A constraint requires Failover and WebServer to be running at the same time on a server for that server to be considered available.






Now I would like to create a custom resource agent (like I did for the IP failover) that will:




  • Switch the mysql instance of the current slave node into a master node

  • Switch the mysql instance of the current master node into a slave node of the new master node

  • Basically do the same for the Redis instance



Ideally, the resource would be started on only one node (master), and stopped on all other nodes (slaves). Therefore, starting the resource would put the current node in master mode, and stopping it would put it in slave mode.




I made a script that can easily achieve all of these operations. Here's how it works.



Switch the local node to master mode:



# /usr/local/bin/db_failover_switch.sh master


Switch the local node to slave mode (the argument is the IP address of the master):



# /usr/local/bin/db_failover_switch.sh slave 123.45.67.89



The synopsis is pretty straightforward.
The problem I am facing is that I need to pass the master's IP address so that the slave can configure its MySQL and Redis services accordingly.
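
As a rough sketch of the intended mapping between resource actions and the script, the start and stop actions could look like the following. The function names `db_start` and `db_stop` and the `SWITCH` variable are hypothetical, and a real OCF agent would also need `monitor` and `meta-data` actions, which are omitted here.

```shell
#!/bin/sh
# Hypothetical skeleton: only the start/stop mapping is shown.
# SWITCH can be overridden; it defaults to the script described above.
SWITCH=${SWITCH:-/usr/local/bin/db_failover_switch.sh}

# start: put the local node in master mode
db_start() {
    "$SWITCH" master
}

# stop: put the local node in slave mode.
# $1 is the address of the new master; where this value should
# come from is exactly the open question.
db_stop() {
    if [ -z "$1" ]; then
        echo "db_stop: master address required" >&2
        return 1
    fi
    "$SWITCH" slave "$1"
}
```

The stop action is the awkward one: it cannot do its job without an argument that the agent does not know by itself.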





In case of failover, I want:





  • Resource starts on node2 which becomes master node

  • Resource stops on node1 which becomes slave node



To stop the resource (i.e. to put the node into slave mode), I need to know the IP address (a hostname will do) of the node on which the resource is now running.



Is there a way to have Pacemaker pass a dynamic parameter to my resource agent? Or can I retrieve the cluster's information directly from my resource agent script to find out which node is running a specific resource?
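
To illustrate the second option, here is a minimal sketch that asks the cluster where a resource is running by calling `crm_resource --locate`. It assumes the command prints a line of the form `resource Failover is running on: node2`; the exact wording may differ between Pacemaker versions, and the helper names `parse_running_node` and `running_node` are made up for this example.

```shell
#!/bin/sh
# Extract the node name from a line such as
#   "resource Failover is running on: node2"
# (assumed output format). Prints nothing if no such line is seen.
parse_running_node() {
    sed -n 's/^resource .* is running on: *\([^ ]*\).*$/\1/p' | head -n 1
}

# Print the name of the node currently running resource $1.
running_node() {
    crm_resource --resource "$1" --locate 2>/dev/null | parse_running_node
}
```

The node name returned by something like `running_node Failover` could then be given as the second argument to `db_failover_switch.sh slave`. One caveat: when the stop action runs, the resource may not have started on the other node yet, so the lookup can come back empty and needs to be handled.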
