Deploying AtScale in Production
This document provides recommendations for running the AtScale Helm chart in production.
Requirements
To run AtScale in production you need a Kubernetes cluster with the following specifications.
- The cluster should consist of at least 3 nodes with AMD64 architecture in a Multi-AZ setup. ARM64 is not recommended for production.
- Resources per node
  - 16 CPU
  - 64 GB RAM
  - 256 GB Storage
  - These may vary depending on your workflows and user count.
High Availability
When running AtScale in production, you need to run it in high availability mode. The Helm chart is set up to accommodate this. Every service needs to be distributed across the 3 nodes.
To do this, specify the following configuration in your override values file.

    atscale-entitlement:
      replicaCount: 3
    atscale-engine:
      replicaCount: 3
    gateway:
      replicaCount: 3
    atscale-sml:
      replicaCount: 3
    atscale-api:
      replicaCount: 3
    postgres:
      pgpool:
        replicaCount: 3
      postgresql:
        replicaCount: 3
    # Redis requires node tolerations and affinity to be set up.
    # You need to run a single master and a replica on each node.
    # internalTrafficPolicy should be set to Local.
    # More details on this here https://atscale.atlassian.net/wiki/spaces/DOPS/pages/3961454608/Configuring+Redis+in+High+Availability+Mode?atlOrigin=eyJpIjoiOTkxODFiZWY4NGQ4NDdkODg2MTc3YzU1YmUxYmYwOTUiLCJwIjoiYyJ9
    redis:
      master:
        count: 3
      replica:
        replicaCount: 3
    keycloak:
      replicaCount: 3
    telemetry:
      replicaCount: 3

This will deploy AtScale and, depending on your cluster's pod distribution policy, every node in every AZ will have a single pod of every service.
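To check how the pods actually landed after a deployment, one option is to count pods per node. The following is a minimal sketch and not part of the chart: `pods_per_node` is a hypothetical helper name, and `<NAMESPACE>` stands for the namespace AtScale is installed in.

```shell
# Count how many pods run on each node.
# Reads "POD_NAME NODE_NAME" pairs on stdin and prints "NODE_NAME COUNT" lines.
# pods_per_node is a hypothetical helper, not part of AtScale or kubectl.
pods_per_node() {
  awk 'NF >= 2 { count[$2]++ } END { for (node in count) print node, count[node] }' | sort
}

# Usage (requires cluster access; <NAMESPACE> is a placeholder):
#   kubectl get pods -n <NAMESPACE> --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | pods_per_node
```

On a 3-node cluster, roughly equal counts per node suggest an even spread. A node with a noticeably lower count may point to missing affinity or topology spread settings.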
Recommendations
We ship MinIO in our chart for Loki’s object storage. When running in high availability, it is preferable to use an external S3 / object storage provider to store logs.

    loki:
      loki:
        externalStorage:
          s3:
            bucketnames: ""
            access_key_id: ""
            endpoint: ""
            secret_access_key: ""
            region: "us-west-2"

If you are not running on AWS, you can set up Loki with any storage provider that Loki supports. More examples are available in Loki’s docs.
You can completely override our Loki configuration by setting loki.loki.existingConfigmap: "loki-config".
Replace the name with that of your predefined ConfigMap.
Backing up Postgres
One of the most important things to remember when running AtScale in production is that it is highly recommended to keep backups of your database. This ensures you can recover from a disaster without losing your data. A backup can be created by running the following commands.

    export PGPASSWORD=$(kubectl get secret -n <NAMESPACE> postgres-default-user -o json | jq -r '.data.password' | base64 -d)
    # port-forward blocks until interrupted; run it in a separate terminal
    kubectl port-forward -n <NAMESPACE> svc/postgres-pgpool <DESTINATION_PORT>:10518
    pg_dumpall -h 127.0.0.1 -p <DESTINATION_PORT> -U postgres --no-password > backup.dump

This can be set up in a pipeline, cron job, or systemd timer and run daily.
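For unattended runs, the same commands can be wrapped so that the port-forward runs in the background and is cleaned up afterwards. This is a sketch under assumptions rather than a supported script: it assumes bash, and the names `run_backup`, `backup_filename`, `NAMESPACE`, `LOCAL_PORT`, and `BACKUP_DIR`, as well as the default port 15432, are illustrative.

```shell
# Build a timestamped file name, e.g. backup-20240101-020000.dump
backup_filename() {
  printf 'backup-%s.dump' "$1"
}

# The body runs in a subshell ( ... ) so that "set -e" and the EXIT trap
# stay local to a single backup run.
run_backup() (
  set -euo pipefail
  namespace="${NAMESPACE:?set NAMESPACE to the AtScale namespace}"
  local_port="${LOCAL_PORT:-15432}"   # assumed free local port
  backup_dir="${BACKUP_DIR:-.}"

  PGPASSWORD=$(kubectl get secret -n "$namespace" postgres-default-user -o json \
    | jq -r '.data.password' | base64 -d)
  export PGPASSWORD

  # port-forward blocks, so run it in the background and stop it on exit.
  kubectl port-forward -n "$namespace" svc/postgres-pgpool "${local_port}:10518" >/dev/null &
  pf_pid=$!
  trap 'kill "$pf_pid" 2>/dev/null || true' EXIT
  sleep 3   # crude wait for the tunnel; a connection check would be more robust

  pg_dumpall -h 127.0.0.1 -p "$local_port" -U postgres --no-password \
    > "${backup_dir}/$(backup_filename "$(date +%Y%m%d-%H%M%S)")"
)
```

A daily job would then call something like `NAMESPACE=<NAMESPACE> run_backup` from the pipeline step or timer unit. Timestamped file names keep earlier backups from being overwritten.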
Volume Backups
Postgres in AtScale keeps data on a Persistent Volume. If you are running on a cloud provider that allows you to create backups of your PVs, you should enable them. Restoring your volumes is a quick way to recover your database data in case of disaster.
Managed Services
If you don’t have the capacity to host and maintain your own Postgres or Redis instances, we advise you to look into the managed solutions your cloud provider offers.
When running Postgres in a managed solution, it is imperative to make sure the managed service runs the same major and minor version as the one deployed in AtScale. Patch versions can differ.
- AWS has
  - RDS - managed Postgres.
    - You should set up Postgres in clustered mode with a proxy in front.
  - ElastiCache for Redis
- Azure has
  - Azure Database for PostgreSQL
  - Azure Cache for Redis
- Google Cloud has
  - Cloud SQL for PostgreSQL
  - Memorystore for Redis
- Redis and Postgres are core dependencies; they need to be highly available and accessible by AtScale.
Overriding Configs
When using a managed service you can override the default secrets AtScale creates with your own.
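The secrets referenced by the override values need to exist in the AtScale namespace. As a hedged sketch, they could be created with `kubectl create secret generic`, using the secret and key names that appear in the values for this section. The helper names `create_redis_secret` and `create_postgres_secret` and the argument order are hypothetical, and every value passed in is a placeholder for your managed service's details.

```shell
# Create the Redis secret with the keys the overrides reference
# (host, port, password, sslEnabled).
# Arguments: namespace host port password ssl_enabled
create_redis_secret() {
  kubectl create secret generic redis-secret \
    --namespace "$1" \
    --from-literal=host="$2" \
    --from-literal=port="$3" \
    --from-literal=password="$4" \
    --from-literal=sslEnabled="$5"
}

# Create a Postgres secret; pass "atscale-postgres" or "keycloak-postgres" as the name.
# Arguments: secret_name namespace host port user database password ssl_enabled
create_postgres_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-literal=host="$3" \
    --from-literal=port="$4" \
    --from-literal=user="$5" \
    --from-literal=database="$6" \
    --from-literal=password="$7" \
    --from-literal=sslEnabled="$8"
}

# Note: literals passed on the command line can end up in shell history;
# --from-file or --from-env-file are alternatives for sensitive values.
```

With the secrets in place, the values that point each service at them are as follows.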

    atscale-entitlement:
      externalRedis:
        existingSecret: "redis-secret"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretPasswordKey: "password"
        existingSecretSslEnabledKey: "sslEnabled"
        existingSecretSslCertKey: ""
        existingSecretSslPrivateKey: ""
        existingSecretSkipVerifyKey: ""
      externalDatabase:
        waitForDB: 60
        existingSecret: "atscale-postgres"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretUserKey: "user"
        existingSecretDatabaseKey: "database"
        existingSecretPasswordKey: "password"
        existingSecretSslModeKey: ""
        existingSecretSslCertKey: ""
        existingSecretSslPrivateKey: ""
    atscale-engine:
      externalRedis:
        existingSecret: "redis-secret"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretSslEnabledKey: "sslEnabled"
      externalDatabase:
        existingSecret: "atscale-postgres"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretUserKey: "user"
        existingSecretPasswordKey: "password"
        existingSecretDatabaseKey: "database"
        existingSecretSslEnabledKey: "sslEnabled"
    atscale-sml:
      externalRedis:
        existingSecret: "redis-secret"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretSslEnabledKey: "sslEnabled"
    atscale-api:
      externalDatabase:
        existingSecret: "atscale-postgres"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretUserKey: "user"
        existingSecretPasswordKey: "password"
        existingSecretDatabaseKey: "database"
        existingSecretSslEnabledKey: "sslEnabled"
    keycloak:
      externalDatabase:
        existingSecret: "keycloak-postgres"
        existingSecretHostKey: "host"
        existingSecretPortKey: "port"
        existingSecretUserKey: "user"
        existingSecretDatabaseKey: "database"
        existingSecretPasswordKey: "password"

Upgrading
Currently, when upgrading, it is advised to follow the AtScale documentation (Release Notes | AtScale Documentation) and to upgrade versions in order, as some upgrades of AtScale may require manual steps.