Introduction
Critical systems are typically required to go through change management, a structured approach to implementing change in an organization. This article briefly describes change management as a process for understanding, planning, implementing, and communicating changes to a production system.
After the initial deployment of a subsystem, unknown factors can cause the system to crash or become unusable. In the immediate effort to stabilize the system, changes are made rapidly to correct the glaring issues. Over time, however, rapid-fire changes stop being the solution, and a calmer, more deliberate approach must be taken so that stability is not upset.
A change management plan is designed to reduce the risk of introducing unwanted variables into a production environment. A strict implementation would require a standing mirror of each production environment in which to validate any changes made. A permanent mirror is unnecessary in a virtualized/cloud environment, where an equivalent environment can be provisioned on demand, but the lessons still apply.
To implement a proper change management system in a virtual environment, a production mirror must be created. That environment must go through an automated upgrade step, followed by a scalability test, and must then be approved. This is why DevOps is a major consideration within change management in a cloud environment.
With a proper change management process, you can validate the following (one such check is sketched after this list):
- Scalability of existing production environments holds across (a) software upgrades and (b) model or configuration upgrades.
- Pressure from incremental changes continues to follow expected scale curves.
- New software upgrades meet security or stability requirements.
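As a minimal sketch of the scale-curve check above, assuming a hypothetical baseline curve and load-test harness (neither is an AtScale API), a post-upgrade validation might look like this in Python:

```python
# Sketch of a post-upgrade scalability check. The baseline curve and the
# load-test harness are illustrative stand-ins, not AtScale APIs; wire
# measure_latency() to whatever load-testing tool you actually use.

TOLERANCE = 1.25  # allow 25% drift from the baseline before failing

# Baseline mean latencies (seconds) captured on the production mirror
# before the change, keyed by concurrent-user count.
EXPECTED_CURVE = {1: 0.8, 10: 1.1, 50: 2.4, 100: 4.0}

def measure_latency(concurrency: int) -> float:
    """Placeholder: run the load test at this concurrency, return mean latency."""
    return EXPECTED_CURVE[concurrency] * 1.05  # simulated measurement

def scale_curve_holds() -> bool:
    for users, baseline in sorted(EXPECTED_CURVE.items()):
        observed = measure_latency(users)
        limit = baseline * TOLERANCE
        if observed > limit:
            print(f"FAIL at {users} users: {observed:.2f}s exceeds {limit:.2f}s")
            return False
        print(f"OK at {users} users: {observed:.2f}s within {limit:.2f}s")
    return True

if __name__ == "__main__":
    raise SystemExit(0 if scale_curve_holds() else 1)
```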
Change management within the AtScale community typically includes, at a minimum:
- Validating that model changes meet user requirements
- Verifying that model changes do not degrade the performance of the underlying data source
- Confirming that software upgrades continue to meet the customer’s expectations (net-new functionality can change expectations)
Roles within the Organization
While a full treatment is out of the scope of this document, roles and responsibilities play a large part in change management. A few roles will be referred to throughout this document. Please ask your AtScale representative for more information about these and additional roles.
- End User - the Analyst or set of individuals who will eventually consume the AtScale subsystem. Within an organization, this can be a self-service individual or a targeted data scientist responsible for mining data. The only distinction between this and the “Data Architect” role is that the data architect will modify the AtScale software in lower environments.
- Data Architect - the individual or set of individuals responsible for modifying the AtScale software. The term is loosely defined to include those who do performance tuning, since some tuning is modeling work while the rest is technical. This document treats ALL performance tuning and settings changes as belonging to the Data Architect role. This becomes clear later when discussing the separation of roles for promotion (the same DBA who performs a change in a lower environment for tuning purposes should not be the person who promotes that change to upper environments).
- Change Management Administrator - ultimately, the person or persons responsible for promoting changes from one environment to another. It is strongly suggested that the person performing these changes have no affiliation with the Data Architect role, to preserve good hygiene.
- IT Administrator - This role specifically highlights the distinction between hardware and software changes in an environment. This person plays a peripheral role in supporting the AtScale subsystem and will have change management protocols of their own that must be adhered to.
Development Environment
In the AtScale context, Development Environments are just that: sandboxes or playgrounds for data architects to test new features, modify models, and validate functionality. By no means should they be considered candidates for direct promotion.
Deployment Architecture
Figure: Development environment deployment options and usage.
In most modern systems, development is done on the individual developer’s laptop or desktop. This is perfectly fine, as long as proper version control systems are in place. Developers will need a complete, if truncated, environment: all the system pieces must be part of the overall process so that branches can be converged.
Because development leads production, it is acceptable for development to run newer versions than production. However, because development is distributed across multiple machines, all development must be done on the same versions of all tools. Developers should be encouraged to update to a new version only once the entire development group is ready to update.
Your development server may also be your source code control repository. If so, it will be where unit tests are run either upon check-in or at pre-determined times.
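As one illustration of running unit tests on check-in, the sketch below assumes a Python project tested with pytest; the hook mechanics of your version control system and toolchain will vary:

```python
#!/usr/bin/env python3
# Sketch of a check-in hook that refuses code whose unit tests fail.
# Assumes a Python project with a tests/ directory and pytest installed;
# adapt the command to your own toolchain.
import subprocess
import sys

def run_unit_tests() -> int:
    # -q keeps hook output short; a non-zero return code means a failing test.
    result = subprocess.run([sys.executable, "-m", "pytest", "-q", "tests/"])
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_unit_tests())
```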
Careful attention should be paid to
Testing
Once the code passes its unit tests, the testing server is where integration testing occurs. On a pre-determined schedule, your testing server should (ideally automatically) check out all the code, refresh the database, and then execute a bottom-up testing script. All unit tests are run first; then integration and regression tests are performed to ensure all the pieces fit together and that nothing previously working has broken.
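A minimal sketch of such a scheduled run, where every command is a placeholder for your own tooling, might look like this; the point is the ordering (checkout, database refresh, unit tests, then integration and regression):

```python
# Sketch of a nightly integration run on the testing server. The checkout
# command, the database refresh script, and the test-suite paths are all
# hypothetical placeholders; substitute your own tooling.
import subprocess
import sys

STEPS = [
    ["git", "pull", "--ff-only"],                # refresh the working copy
    ["./scripts/refresh_test_database.sh"],      # hypothetical DB refresh
    [sys.executable, "-m", "pytest", "tests/unit"],
    [sys.executable, "-m", "pytest", "tests/integration"],
    [sys.executable, "-m", "pytest", "tests/regression"],
]

def run_bottom_up() -> int:
    for step in STEPS:
        print("running:", " ".join(step))
        code = subprocess.run(step).returncode
        if code != 0:
            print("stopping: step failed with exit code", code)
            return code
    return 0

if __name__ == "__main__":
    sys.exit(run_bottom_up())
```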
Acceptance
The acceptance testing server should always mirror your production server in services and revision levels. When the code is ready for release, it is checked out onto the Acceptance server. Some teams also re-run the integration/regression testing suite at this point as one final sanity check before the client sees the code. Acceptance testing for web-based applications should be where your client sees and tests the code using their own testing plan.
Acceptance is also where you test your version-migration scripts.
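For instance, migration scripts can be rehearsed in order against a disposable copy of the acceptance database. The sketch below assumes numbered SQL files in a migrations/ directory and uses sqlite3 purely as a stand-in for whatever database the subsystem actually uses:

```python
# Sketch of rehearsing version-migration scripts on the acceptance copy.
# Assumes migrations are plain SQL files named 001_*.sql, 002_*.sql, ...
# sqlite3 is only a stand-in for the real database.
import sqlite3
from pathlib import Path

def apply_migrations(db_path: str, migrations_dir: str = "migrations") -> None:
    conn = sqlite3.connect(db_path)
    try:
        for script in sorted(Path(migrations_dir).glob("*.sql")):
            print("applying", script.name)
            conn.executescript(script.read_text())  # raises on bad SQL
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    # Rehearse against a disposable copy, never the live acceptance data.
    apply_migrations("acceptance_copy.db")
```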
Production
The final phase of the development process is to stage your code into production. If all goes well, this should be a fully automated process, or at least a highly automated one.
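One way to automate the staging step while preserving the separation of duties described earlier is to gate the deploy on a recorded, independent approval. The approval file and deploy script below are hypothetical:

```python
# Sketch of a gated, automated production staging step. The approvals.json
# file and the deploy script are hypothetical; the point is that automation
# still enforces a recorded sign-off from someone other than the change author.
import json
import subprocess
import sys

def approved(change_id: str, author: str, path: str = "approvals.json") -> bool:
    with open(path) as f:
        approvals = json.load(f)  # e.g. {"CHG-1042": "cm.admin"}
    approver = approvals.get(change_id)
    return approver is not None and approver != author

def deploy(change_id: str, author: str) -> int:
    if not approved(change_id, author):
        print(f"{change_id}: no independent approval on record; refusing")
        return 1
    return subprocess.run(["./scripts/deploy_to_production.sh", change_id]).returncode

if __name__ == "__main__":
    sys.exit(deploy(sys.argv[1], sys.argv[2]))
```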
Forms of Data for Migration
In the generic sense of Change Management, software systems typically classify data into three categories: environmental data, metadata, and fact data. AtScale, as a subsystem, is no different.
Promotion Process
Figure: Overview of the promotion process.
Full Promotion (DevOps Promotion)
A Full Promotion process involves a complete setup of a new environment. From AtScale’s perspective, this is independent of the hardware; it could be an “overwrite” of an existing instance of AtScale or a net-new setup of an environment.
New-environment setup is becoming more prevalent with virtualized hardware such as Kubernetes and cloud environments. Customers often opt to learn only the full promotion method and use it in place of partial promotion.
A full promotion includes:
- Moving all Environmental, Metadata, and Fact Data to a new environment.
- Altering Environmental Data to suit the new environment (a sketch follows this list).
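A minimal sketch of the environmental-data step, where the setting keys and override values are purely illustrative and not actual AtScale settings:

```python
# Sketch of rewriting environmental data during a full promotion. The
# settings are modeled as a flat dict read from the source environment;
# every key name and value here is illustrative, not an AtScale setting.
import copy

ENVIRONMENTAL_OVERRIDES = {
    "engine.host": "engine.prod.example.com",
    "engine.port": "10502",
    "kerberos.keytab": "/etc/security/keytabs/atscale.prod.keytab",
    "loadbalancer.url": "https://atscale.prod.example.com",
}

def promote_settings(source_settings: dict) -> dict:
    """Copy all settings, then overwrite the environment-specific ones."""
    promoted = copy.deepcopy(source_settings)
    promoted.update(ENVIRONMENTAL_OVERRIDES)
    return promoted

if __name__ == "__main__":
    dev = {"engine.host": "engine.dev.example.com", "engine.port": "10502",
           "aggregates.enabled": "true"}
    print(promote_settings(dev))
```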
Partial Promotion
A partial promotion moves some, but not all, of the data. It is typically performed when the target environment already exists.
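A partial promotion might, for example, export a single project’s metadata from the source environment and import it into the existing target. The REST endpoints and token handling below are hypothetical placeholders, not the documented AtScale API; consult your AtScale representative for the supported mechanism:

```python
# Sketch of a partial promotion: export one project's metadata and import
# it into an existing target environment. All endpoints are hypothetical.
import urllib.request

def move_project(project_id: str, source: str, target: str, token: str) -> None:
    # Export from the source environment (hypothetical endpoint).
    export_req = urllib.request.Request(
        f"{source}/projects/{project_id}/export",
        headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(export_req) as resp:
        payload = resp.read()

    # Import into the existing target environment (hypothetical endpoint).
    import_req = urllib.request.Request(
        f"{target}/projects/import", data=payload, method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"})
    urllib.request.urlopen(import_req).close()
```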
User Acceptance
Acceptance of an intelligence subsystem implies two concepts:
- The system provides valuable information.
- Usage of this subsystem enhances the user’s understanding or reduces the complexity of finding the answer in another way.
These internal measurements come in the form of user surveys. While the scores are low, we still see large adoption. This implies that AtScale has achieved the first, valuable information, but needs to improve on the second, complexity reduction.
From the feedback, this is likely attributable to historical outages and to performance issues that continue today. Outages are unacceptable and ruin the perception of the subsystem’s value. Performance problems, while merely nagging at first, can grow to be just as prohibitive over time.
Training
The internal efficiency of any software in use should concern any organization. If a platform has been adopted but efficiency with it is low, the value it delivers may not justify its cost.
A simple countermeasure is to cross-train all levels of users. For AtScale, this means cross-training (or training) AtScale Administrators so that they all share the same expectations of the software and the deployment (see Change Management). Data engineers would benefit from cross-training on best practices for model development (see Performance). End users will benefit from seeing the analytics created for one model and being able to express them across another. Here, AtScale has seen success after success.
To create a synergistic model, the following opportunities exist:
- Formal Training (AtScale)
- Informal (internal) training
- Lunch and Learns
- Project Roadmap Presentations
Disk-Based Settings
YAML Settings
- Environmental
  - TLS Disk Locations
  - Hostnames
- Mixed
  - Kerberos Keystore
  - Cores/Memory
Security
- Certificates
- Keytab
- MapR Keystore
Other
- Third-Party JDBC Drivers
AtScale Application Settings
Security
- Environmental
  - SAML Configuration
- Mixed
  - Directory Setup (LDAP or Okta)
Configuration Settings
- Environmental
  - Agent IP or Hostname and Port
  - Manager IP or Hostname and Port
  - Design Center IP or Hostname and Port
  - Load Balancer URL
  - Vanity URLs
  - Tableau Server / Integrations and Settings
- Mixed
  - Email Server Settings
  - Engine Load Balancer URL
  - CORS Referrer Origins
  - Webhooks
Engine
- Environmental
  - Host and Port
  - Auth URI, host, and port
  - External message bus, schema registry, and search index hosts and ports
  - Load balancer MDX and SQL URLs
  - Modeler Host and Port
- Mixed
  - DFS Account Keys
  - Filesystem install paths
  - Kerberos domain, host, and keytab
  - SSL Keystore Path and Password
Other
- Environmental
  - Licenses
- Mixed
  - Data Warehouse Connection Information
Appendix B: Non-Project-Based Metadata
- Manually created: Users, Groups, Roles
- Manual Assignments of roles
- Organizations
- Global Settings - Options
- Global Settings - Configuration
- Organization Settings - Options
- Engine Settings
  - Any custom engine settings
- Overridden Cube Settings
- Aggregate Settings