For AtScale Clusters Only: Returning a Failed AtScale Database Instance to an AtScale Cluster
If an AtScale database instance in an AtScale cluster fails, you can bring that instance back into the cluster as a standby instance after resolving the issues that caused it to fail.
If the master database instance fails, the cluster fails over to one of the standby database instances on another node. Use the instructions in this topic to detect when such a failover has happened and to bring the failed database instance back into the cluster as a standby database instance.
If a standby instance fails, you can use the instructions in this topic for detecting the failure and for bringing the database instance back into the cluster as a standby instance.
Important: The master database instance might not be running on the node that was designated as the master node during the installation of Clustered AtScale. Such a situation does not constitute a problem. The master database instance is initially created on the node that is brought up first during the installation; however, one of the other database instances in the cluster can become the master during the operation of the cluster, so long as there is only one master database instance at a time.
Checking the status of the AtScale database instances
An instance of the AtScale database runs on each node in an AtScale cluster. Only the master database instance is active. The others are on warm standby, with the master instance continuously replicating changes to the standby instances.
To check the status of all database instances in an AtScale cluster, and to find out which node is currently running the master (Leader) database instance, run the following command on any of the AtScale nodes:
/opt/atscale/current/bin/database/postgres_nodes
The output is a table that looks like this:
| Cluster                  | Member     | Host             | Role   | State   | TL | Lag in MB |
| atscale_postgres_cluster | atscale-01 | atscale-01:10520 | Leader | running | 1  |           |
| atscale_postgres_cluster | atscale-02 | atscale-02:10520 |        | running | 1  |           |
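The check above can be scripted. The following is a minimal sketch, not an AtScale-provided tool: it assumes that postgres_nodes prints a pipe-delimited table with the columns shown above, which may not match the exact output of your installation. The function names leader_members and has_single_leader and the POSTGRES_NODES variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch: find the member that currently holds the Leader role.
# ASSUMPTION: postgres_nodes prints a pipe-delimited table with the columns
# Cluster | Member | Host | Role | State | TL | Lag in MB.

# Command path taken from this topic; the variable itself is hypothetical.
POSTGRES_NODES="${POSTGRES_NODES:-/opt/atscale/current/bin/database/postgres_nodes}"

# Read the table on stdin and print the Member column of every Leader row.
leader_members() {
  awk -F'|' '
    NF >= 8 {
      member = $3; role = $5
      gsub(/^[ \t]+|[ \t]+$/, "", member)   # trim surrounding blanks
      gsub(/^[ \t]+|[ \t]+$/, "", role)
      if (role == "Leader") print member
    }'
}

# Read the table on stdin and succeed only if exactly one Leader row exists.
has_single_leader() {
  [ "$(leader_members | wc -l | tr -d ' ')" = "1" ]
}
```

For example, running "$POSTGRES_NODES" | leader_members on any node would print the name of the current master member, and a change in that name between two runs would indicate that a failover has happened.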
Bringing a failed database instance back into an AtScale cluster as a standby database instance
If a database instance has failed and the problem has been resolved, do the following to bring the instance back up as a standby database instance in the cluster.
- Make sure that the database instance is not a running member of the database cluster:
/opt/atscale/current/bin/database/postgres_nodes
The output is a table that looks like this:
| Cluster                  | Member     | Host             | Role   | State   | TL | Lag in MB |
| atscale_postgres_cluster | atscale-01 | atscale-01:10520 |        | stopped |    | unknown   |
| atscale_postgres_cluster | atscale-02 | atscale-02:10520 | Leader | running | 2  |           |
- On the failed node, move or remove the /opt/atscale/data/database directory.
- On the failed node, restart the database service:
/opt/atscale/bin/atscale_service_control start database
- Make sure that the database instance is a running member of the database cluster:
/opt/atscale/current/bin/database/postgres_nodes
The output is a table that looks like this:
| Cluster                  | Member     | Host             | Role   | State   | TL | Lag in MB |
| atscale_postgres_cluster | atscale-01 | atscale-01:10520 |        | running | 2  |           |
| atscale_postgres_cluster | atscale-02 | atscale-02:10520 | Leader | running | 2  |           |
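The steps above can be combined into a script that is run on the failed node. The following is a rough sketch under stated assumptions, not an AtScale-provided tool: it assumes that postgres_nodes prints a pipe-delimited table with the columns shown above, and it moves the data directory aside instead of deleting it. The function names, the environment variables, and the retry count and polling interval are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: bring a failed database instance back as a standby.
# ASSUMPTION: postgres_nodes prints a pipe-delimited table with the columns
# Cluster | Member | Host | Role | State | TL | Lag in MB.

# Paths taken from this topic; the variable names are hypothetical.
POSTGRES_NODES="${POSTGRES_NODES:-/opt/atscale/current/bin/database/postgres_nodes}"
SERVICE_CONTROL="${SERVICE_CONTROL:-/opt/atscale/bin/atscale_service_control}"
DATA_DIR="${DATA_DIR:-/opt/atscale/data/database}"

# Read the table on stdin and print the State column for member $1.
member_state() {
  awk -F'|' -v want="$1" '
    NF >= 8 {
      member = $3; state = $6
      gsub(/^[ \t]+|[ \t]+$/, "", member)   # trim surrounding blanks
      gsub(/^[ \t]+|[ \t]+$/, "", state)
      if (member == want) print state
    }'
}

# Run the procedure for the member named in $1 (for example, atscale-01).
rejoin_as_standby() {
  local member="$1" state tries="${RETRIES:-60}" i=0

  # Step 1: stop if the instance is still a running member of the cluster.
  state="$("$POSTGRES_NODES" | member_state "$member")"
  if [ "$state" = "running" ]; then
    echo "$member is already running; not touching its data" >&2
    return 1
  fi

  # Step 2: move the old data directory aside with a timestamp suffix.
  if [ -d "$DATA_DIR" ]; then
    mv "$DATA_DIR" "${DATA_DIR}.failed.$(date +%Y%m%d%H%M%S)" || return 1
  fi

  # Step 3: start the database service on this node.
  "$SERVICE_CONTROL" start database || return 1

  # Step 4: poll until the member is reported as running, or give up.
  while [ "$i" -lt "$tries" ]; do
    state="$("$POSTGRES_NODES" | member_state "$member")"
    [ "$state" = "running" ] && return 0
    sleep "${INTERVAL:-5}"
    i=$((i + 1))
  done
  echo "$member did not reach the running state" >&2
  return 1
}
```

Calling rejoin_as_standby atscale-01 on the failed node would run the steps in order and return a nonzero status if the instance does not report the running state within the polling window. Moving the directory rather than removing it keeps the old data available for diagnosis.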