The rapid growth of cloud computing outpaced standardization and left the DevOps approach without clear guidance. The “you build it, you run it” approach became the de facto standard in the industry. Unfortunately, it didn’t scale well and required specialized skills. Moreover, finding people with this expertise was challenging and expensive.
According to multiple State of the Cloud surveys, 91% of businesses use public cloud, 76% of companies are already multi-cloud, and the trend is only upward. The amount of information we can retain is reaching its limits. Emerging technologies, tools, frameworks, and new methodologies keep increasing our cognitive load. That means a lot of code is being actively produced and maintained. That’s why the right infrastructure testing strategy is more crucial than ever. We believe we are doing everything we can to provide the best possible standard. But are we?
In this article, we’ll introduce you to infrastructure testing, provide test cases, and offer advice on how to set up your tests to achieve sustainability and resilience in the cloud.
Why is infrastructure testing a new practice?
To assure quality, we write tests alongside all of our application code. So why is writing tests for infrastructure not a standard practice? Thinking about edge cases for infrastructure requires not only extensive knowledge of how to use it properly but also an understanding of the underlying problems of the chosen solutions. As the Cloud Team at VirtusLab, we would like to share a few long-term observations and our solution for infrastructure testing.
They should give you some insights for upgrading your infrastructure testing strategy and show how your organization can benefit from it.
These are our observations on end-to-end infrastructure testing:
- Infrastructure testing takes a long time and includes templating, which introduces more complexity. We do not want random errors during resource provisioning.
- Static code analysis tools find bugs in the configuration before a much longer test deployment, which saves a lot of time in case of a misconfiguration.
- End-to-end tests (E2E) can produce resources that affect consecutive tests, so it’s important to provide necessary sandboxing and reduce potential blast radius.
- New versions of Infrastructure as Code tools (e.g. Terraform providers) and cloud APIs are released regularly, and some contain breaking changes. It’s not always possible to foresee how they will affect dependent resources without prior testing.
- Properly written tests, with design patterns in mind, can be easily reused regardless of the cloud provider.
- Properly structured and verbose tests can reduce the troubleshooting time from hours to minutes.
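The sandboxing observation above can be illustrated with a minimal Python sketch. It assumes each test run gets a unique name prefix and that everything created under that prefix is removed afterwards, even when the test fails. FakeCloud and sandbox are hypothetical names introduced for illustration; FakeCloud is only an in-memory stand-in, not a real provider SDK.

```python
import uuid
from contextlib import contextmanager


class FakeCloud:
    """In-memory stand-in for a cloud API (hypothetical, for illustration only)."""

    def __init__(self):
        self.resources = set()

    def create(self, name):
        self.resources.add(name)

    def delete(self, name):
        self.resources.discard(name)


@contextmanager
def sandbox(cloud, test_name):
    """Give one test a unique name prefix and clean up what it created."""
    # A random suffix keeps parallel or repeated runs from colliding.
    prefix = f"e2e-{test_name}-{uuid.uuid4().hex[:8]}"
    created = []

    def create(resource):
        name = f"{prefix}-{resource}"
        cloud.create(name)
        created.append(name)
        return name

    try:
        yield create
    finally:
        # Delete in reverse order so dependents go before their dependencies.
        for name in reversed(created):
            cloud.delete(name)
```

A test would wrap its body in "with sandbox(cloud, "bucket") as create:" and call create("bucket") for each resource it needs. Resources that existed before the test are never touched, which limits the blast radius to the names the test itself generated.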
The right way to start the infrastructure testing flow
Infrastructure is defined in configuration files, which come in many different formats such as YAML and JSON. Each format has tools for checking syntax and indentation.
The infrastructure testing process can be as simple as generating actual files from templates and evaluating the correctness of the outcome in terms of values and formatting. It can also be as complicated as dry-running configuration files with their dependencies.
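The simple end of that spectrum might look like the following Python sketch: render a configuration file from a template, then check that the result parses and that its values are acceptable. The template, the field names, and the allowed regions are all assumptions made up for this example; real projects would more likely render Terraform, Helm, or Jinja2 templates.

```python
import json
import re
from string import Template

# Hypothetical template and allow-list, invented for this example.
BUCKET_TEMPLATE = Template(
    '{"name": "${env}-artifacts", "region": "${region}", "versioning": ${versioning}}'
)
ALLOWED_REGIONS = {"eu-west-1", "us-east-1"}


def render_bucket_config(env: str, region: str, versioning: bool) -> str:
    """Generate the actual file content from the template."""
    return BUCKET_TEMPLATE.substitute(
        env=env, region=region, versioning=json.dumps(versioning)
    )


def check_bucket_config(rendered: str) -> list[str]:
    """Return a list of problems; an empty list means the file passed."""
    # Formatting check: the rendered output must be valid JSON.
    try:
        config = json.loads(rendered)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be an object"]

    # Value checks: each field must match what we expect to deploy.
    problems = []
    name = config.get("name")
    if not isinstance(name, str) or not re.fullmatch(r"[a-z0-9-]{3,63}", name):
        problems.append(f"invalid name: {name!r}")
    if config.get("region") not in ALLOWED_REGIONS:
        problems.append(f"region not allowed: {config.get('region')!r}")
    if not isinstance(config.get("versioning"), bool):
        problems.append("versioning must be true or false")
    return problems
```

Returning a list of problems instead of raising on the first one keeps the output verbose, so a single run reports every misconfigured value at once.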
We can choose from a variety of linters and code snippets that take care of different aspects of code quality:
- Simple linters, whether standalone or IDE-integrated, provide basic syntax error detection and ensure readability. Examples are the built-in terraform fmt and terraform validate commands.
- More advanced static code analysis tools check for security vulnerabilities that result from misconfiguration, e.g. checkov or tfsec for Terraform configuration files.