Shut Down and Start Up Procedures for Multi-Node AtScale Environments

The following is the shutdown and restart sequence for a Clustered AtScale Multi-Node Environment containing Clustered AtScale Engine nodes and Query Engine nodes (if present).

Note: If there are no Query Engine nodes in the environment, skip the steps that apply to them.

Shut Down Order - AtScale Nodes should be shut down in the following sequence.

  1. Query Engine node(s) (if present)
  2. Engine Node running Postgres in Replica / Standby mode
  3. Engine Node running Postgres in Leader role
  4. Coordinator node

Restart Order - AtScale Nodes should be restarted in the following sequence (the reverse of the shut down order).

  1. Coordinator node
  2. Engine Node running Postgres in Leader role
  3. Engine Node running Postgres in Replica / Standby role
  4. Query Engine node(s) (if present)
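The two sequences above can be written down as data so that a script never gets the order wrong. The following is only a sketch: the function names `shutdown_order` and `restart_order` are hypothetical and not part of AtScale, and the host names passed in are whatever you recorded for your own environment.

```shell
# Hypothetical helper: print hosts in shut down order, one per line.
# Arguments: coordinator, leader, replica, then zero or more Query Engine hosts.
shutdown_order() {
  so_coordinator=$1; so_leader=$2; so_replica=$3
  shift 3
  # 1. Query Engine node(s), if present
  for so_qe in "$@"; do printf '%s\n' "$so_qe"; done
  # 2. Replica / Standby, 3. Leader, 4. Coordinator
  printf '%s\n' "$so_replica" "$so_leader" "$so_coordinator"
}

# Hypothetical helper: the restart order is the shut down order reversed.
restart_order() {
  shutdown_order "$@" |
    awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}
```

For example, `restart_order coord-host leader-host replica-host qe-host` prints the Coordinator first and the Query Engine last.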

Procedure to Shut Down and Restart all Services - Clustered AtScale

S1. Log into an ssh session on one of the Engine nodes of the clustered AtScale environment and change to the atscale user (if needed). Execute the command /opt/atscale/current/bin/database/postgres_nodes

The following is what the output will look like

$ /opt/atscale/current/bin/database/postgres_nodes
2023-01-13 14:59:59,465 - WARNING - Using atscale-postgres14-cluster as consul service name instead of scope name atscale_postgres14_cluster
+---------------------------------------------+---------------------------------------------------+---------+---------+----+-----------+
| Member                                      | Host                                              | Role    | State   | TL | Lag in MB |
+ Cluster: atscale_postgres14_cluster (7121824910042921340) --------------------------------------+---------+---------+----+-----------+
| atscale-ha-node-01.docker.infra.atscale.com | atscale-ha-node-01.docker.infra.atscale.com:10520 | Leader  | running |  3 |           |
| atscale-ha-node-02.docker.infra.atscale.com | atscale-ha-node-02.docker.infra.atscale.com:10520 | Replica | running |  3 |         0 |
+---------------------------------------------+---------------------------------------------------+---------+---------+----+-----------+

S2. From the output of the /opt/atscale/current/bin/database/postgres_nodes command, determine which AtScale host is the “Engine Node running Postgres in Leader role” and which is the “Engine Node running Postgres in Replica / Standby role”. In the output above:

  • Engine Node running Postgres in Leader role = atscale-ha-node-01.docker.infra.atscale.com

  • Engine Node running Postgres in Replica / Standby role = atscale-ha-node-02.docker.infra.atscale.com

Record this information for later use in this procedure

Warning: The remaining steps in this procedure shut down AtScale services. While services are down, all queries executed against AtScale will fail and the AtScale Design Center User Interface will be unavailable. Make sure you are prepared for this, and if needed schedule an appropriate downtime window and notify users that the service will be unavailable.

S3. If Query Engine nodes are present in the environment, shut down services on the Query Engine nodes, and only on the Query Engine nodes. If there are no Query Engine nodes in the environment, skip this step. Execute the following command to stop all AtScale services on a Query Engine node:

/opt/atscale/bin/atscale_stop

This command shuts down all services on the Query Engine node. Verify all services have stopped before proceeding to the next Query Engine node or the next step.

If there are multiple Query Engine nodes, repeat Step S3. for each Query Engine node.

S4. Shut down all AtScale Services on the Engine Node running Postgres in Replica / Standby role. Run the following command on the Replica / Standby postgres node:

/opt/atscale/bin/atscale_stop

Verify all services have stopped before proceeding to the next step

S5. Shut down all AtScale Services on the Engine Node running Postgres in Leader role. Run the following command on the Leader postgres node:

/opt/atscale/bin/atscale_stop

Verify all services have stopped before proceeding to the next step

S6. Shut down all AtScale Services on the Coordinator node. Run the following command on the Coordinator node:

/opt/atscale/bin/atscale_stop

Verify all services have stopped.
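Steps S3 through S6 run the same command on each node in a fixed order, which can be sketched as a loop. This sketch assumes you have SSH access from an admin host to every node as a user allowed to run atscale_stop; the function name `stop_in_order` and the `REMOTE_SH` override are hypothetical. It stops at the first failure and does not replace the per-node verification described in each step.

```shell
# Hypothetical helper: run atscale_stop on each host, in the order given.
# REMOTE_SH defaults to ssh; set REMOTE_SH=echo to print the commands
# instead of running them (dry run).
stop_in_order() {
  for host in "$@"; do
    echo "Stopping AtScale services on $host" >&2
    "${REMOTE_SH:-ssh}" "$host" /opt/atscale/bin/atscale_stop || {
      echo "atscale_stop failed on $host; not continuing" >&2
      return 1
    }
  done
}
```

Pass the hosts in shut down order: Query Engine node(s) first, then the Replica / Standby node, the Leader node, and the Coordinator node last.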

At this point, all AtScale services have been stopped. If additional verification is desired, see the procedure “How to verify all AtScale processes have stopped”. That procedure is typically only necessary in environments where a failure has occurred (such as running out of disk space), or when extra confirmation that all processes have stopped is needed.

To restart all services, proceed to the next step

S7. Restart the Coordinator node. Run the following command on the Coordinator node:

/opt/atscale/bin/atscale_start

Wait 1 to 2 minutes, then run the following command and verify all services are operational before proceeding:

/opt/atscale/bin/atscale_service_control status

$ /opt/atscale/bin/atscale_service_control status
agent                RUNNING   pid 650197, uptime 0:01:38
coordinator          RUNNING   pid 650198, uptime 0:01:38
egress               RUNNING   pid 650200, uptime 0:01:38
ingress              RUNNING   pid 650202, uptime 0:01:38
service_registry     RUNNING   pid 650196, uptime 0:01:38
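The status check can be turned into a pass/fail test. The following is a sketch under the assumption that the status command prints one service per line with the service name in the first field and its state in the second, as in the sample output; the function name `all_services_running` is hypothetical.

```shell
# Hypothetical helper: succeed only if at least one service is listed
# and every listed service is in the RUNNING state.
# Reads atscale_service_control status output on stdin.
all_services_running() {
  awk '
    NF >= 2 { seen++; if ($2 != "RUNNING") bad++ }
    END { exit ((seen > 0 && bad == 0) ? 0 : 1) }'
}
```

Usage: `/opt/atscale/bin/atscale_service_control status | all_services_running && echo "all services running"`.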
