How to Scale Out an AtScale Single Instance

This document describes how to scale out a single-instance AtScale deployment by moving the AtScale metadata database onto its own host.

The goal is to minimize query performance degradation caused by a heavy query history in the metadata database (Postgres).
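If you want to confirm that query history is what is consuming space in the metadata database, you can list its largest tables with standard Postgres catalog queries. This is only a sketch; the connection details below are placeholders, so adjust them to your environment.

    # List the ten largest tables in the metadata database
    # (host, port, user, and database name are placeholders)
    psql -h <database-host> -p <port> -U <user> -d <atscale-db> -c "
      SELECT c.relname,
             pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE c.relkind = 'r'
        AND n.nspname NOT IN ('pg_catalog', 'information_schema')
      ORDER BY pg_total_relation_size(c.oid) DESC
      LIMIT 10;"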

Prerequisites

  • AtScale version 2021.x or later
  • Two cluster nodes (EC2 instances or VMs in the same subnet)
  • Identical hardware is preferred (16 CPUs and 64 GB RAM); a quick check is shown below
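To verify the hardware on each node, you can use standard Linux commands:

    # Verify CPU count and memory on each node
    nproc     # expect 16
    free -g   # expect roughly 64 GB of total memory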

Here is the atscale.yaml file that splits the components across two hosts.

atscale.yaml

engine:
  memory: 31G
installation_location: "/opt/atscale"
service_account: "atscale"
loadbalancer_dns_name: "atscaleengine.us-west1-b.c.atscale-sales-demo.internal"
tls:
  enabled: false
  certificate: "/opt/atscale/conf/server.cert"
  key: "/opt/atscale/conf/server.key"
kerberos:
  enabled: false
  keytab: "/opt/atscale/conf/atscale.keytab"
  principal: "atscale/atscaleengine.us-west1-b.c.atscale-sales-demo.internal@REALM"
hosts:
  - dnsname: database.us-west1-b.c.atscale-sales-demo.internal
    services:
      - database
      - coordinator
    override:
      coordinator:
        id: 10
  - dnsname: atscaleengine.us-west1-b.c.atscale-sales-demo.internal
    services:
      - engine
      - modeler
      - directory
      - egress
      - agent
      - ingres
      - service_registry
      - gov_enforce
      - gov_rules
      - servicecontrol

 

Postgres tuning can be applied by modifying patroni.yml.tpl before you run configuration.sh:

/opt/atscale/version/<version>/bin/configurator/roles/database/templates/patroni.yml.tpl

      shared_buffers: 16GB
      effective_cache_size: 48GB
      maintenance_work_mem: 2GB
      checkpoint_completion_target: 0.9
      wal_buffers: 16MB
      default_statistics_target: 100
      random_page_cost: 4
      effective_io_concurrency: 2
      work_mem: 13981kB
      min_wal_size: 2GB
      max_wal_size: 8GB
      max_worker_processes: 16
      max_parallel_workers_per_gather: 4
      hot_standby: "on"
      checkpoint_timeout: 30

 

Note: These settings assume the host has 16 CPUs and 64 GB of RAM.
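For reference, the main values appear to follow common pgtune-style sizing heuristics. The short shell sketch below is an illustration of that assumption, not part of the AtScale installer; recompute the values if your hardware differs.

    # Rough sizing heuristics behind the values above (assumed, pgtune-style):
    #   shared_buffers       ~ 25% of RAM
    #   effective_cache_size ~ 75% of RAM
    #   max_worker_processes = CPU count
    RAM_GB=64
    CPUS=16
    echo "shared_buffers:        $((RAM_GB / 4))GB"       # 16GB on a 64 GB host
    echo "effective_cache_size:  $((RAM_GB * 3 / 4))GB"   # 48GB on a 64 GB host
    echo "max_worker_processes:  ${CPUS}"                 # 16 on a 16-CPU host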

 

Configuration

Make sure atscale.yaml is identical on both hosts.
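One quick way to confirm this from the engine host is to compare checksums of the file on both hosts. This is only a sketch; the file path and host name are placeholders for your environment.

    # Compare checksums of atscale.yaml on both hosts
    md5sum /path/to/atscale.yaml
    ssh atscale@database.us-west1-b.c.atscale-sales-demo.internal md5sum /path/to/atscale.yaml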

The database host must be configured first, and its database instance must be up and running before you configure the engine host.

On the database host, run configuration.sh --first-time.

Once the configuration completes on the database host, run the second configuration on the engine host with configuration.sh --activate.
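Put together, the sequence looks like this (a sketch; run configuration.sh from wherever it lives in your installation):

    # On the database host (run first):
    ./configuration.sh --first-time

    # On the engine host, after the database host finishes:
    ./configuration.sh --activate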

You can verify that the hosts are communicating with each other by opening the Consul UI at http://<host>:10555.

Note: You can ignore warnings for services such as egress and auth. Because you are running a single engine core component, Consul expects more than one host for the authentication and messaging services.
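If you prefer the command line, Consul's standard HTTP API can list the cluster members. This assumes the API is served on the same port as the UI (10555 here), which may differ in your environment.

    # List the nodes Consul knows about
    # (adjust host and port as needed)
    curl http://<host>:10555/v1/catalog/nodes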

 

Reason

Splitting the deployment across two hosts allows Postgres to serve metadata to AtScale without contending with the engine for CPU and disk I/O during peak hours. This works well for customers with heavy workloads and high query volumes.

You will immediately notice the performance gain in UI operations such as reviewing queries, aggregates, and large models. You will also see improved performance and reduced query latency under high concurrency at peak hours.

 

Why not split the engine and modeler onto their own hosts as well? You can split the components across three hosts, but doing so requires a load balancer (LB) or a fourth host to act as a proxy.

Scenario 1 (diagram)

Scenario 2 (diagram)

In theory, you could add this complexity to a single-node environment and split the components as shown in the diagrams above, but Scenario 2 provides no extra benefit and is not recommended.
