By David Mattia
January 6, 2021 · 9 min read
It’s 2am. You finally have a Lambda function running in a VPC. And all it took was 40 Terraform resource blocks and 8 hours of your life. You breathe a sigh of relief, knowing that you are the embodiment of DevOps.
The next day, after getting a good night of sleep, you realize that maybe your hello-world-test Lambda function isn’t as unique as you thought it was, and maybe somebody else has created a Terraform module that could have saved you a few hours with a more production-ready result.
One year later, you look back fondly on your days of not knowing what the heck you were doing, as you now publish new Lambda functions in minutes with a trusty Terraform module. But is this module something you wrote, or is it open-source? And what difference does that make?
This post makes an argument that you should try to use open-source Terraform modules when you can, and that you’re likely able to much more often than you might think.
At Transcend, we use open-source Terraform modules as much as we can to ensure we are always building our software on secure, tested infrastructure built with best practices. While we still maintain a few small private modules for internal use, over 60% of the 500+ modules we use are open-source.
Terraform has taken over as a dominant force in the infrastructure-as-code world. It’s easy to get started with, open-sourced, and is incredibly powerful across dozens of cloud providers.
Terraform lets you create resource blocks that define a cloud resource, such as a server, IAM user, or database. As an extension, you can also group resources together into reusable modules.
To use a module, you create a new block in any Terraform file, specify the block type as module, and point the source field at any module in the open-source registry.
module "vault" {
  source  = "hashicorp/vault/aws"
  version = "0.14.0"

  # insert the 4 required variables here
}
In this example, the module points at v0.14.0 of HashiCorp’s official Vault module. Pinning to a specific version is crucial with external modules: it freezes the module at a specific point in time, protecting you from upstream changes until you choose to upgrade.
I like to think of a module as a function. The source is the name of the function you want to call, the variables you pass in are its parameters, and the output blocks it defines are its return values. The function body is updating cloud resources to match your defined resource configuration.
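To make the analogy concrete, here is a minimal sketch of a hypothetical local module (the file path, bucket name, and identifiers are illustrative, not from any published module):

```hcl
# ./modules/bucket/main.tf -- the "function definition"

variable "bucket_name" { # parameter
  type = string
}

resource "aws_s3_bucket" "this" { # function body: the cloud resource to manage
  bucket = var.bucket_name
}

output "bucket_arn" { # return value
  value = aws_s3_bucket.this.arn
}
```

Calling it is a module block with `source = "./modules/bucket"` and `bucket_name = "my-logs"`, after which `module.logs.bucket_arn` can be referenced elsewhere in your configuration.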
While coding your own Terraform modules is tempting for those who like to get into the weeds and learn, there are some key reasons why leveraging existing open-source work is beneficial in the long run.
If each company could just copy-paste each other’s infrastructure and we could all have well-secured, easily maintainable, highly performant infrastructure at the press of a button, we wouldn’t need DevOps or Operations engineers. In the real world, things aren’t that easy, as each company has its own requirements.
But even so, we often choose to rely on the same underlying technologies. Depending on your requirements, you might deploy some apps directly onto servers like EC2 instances, or containerize your app and deploy it to ECS or a Kubernetes cluster, or run your app on a serverless platform like Lambda. And in each of these cases, you still have many options to customize exactly how you want your infrastructure.
With nearly infinite ways to deploy and secure your applications, it may seem like pre-built modules won’t work for your exact use case. But once you accept that you’ll still likely use services like VPCs, Lambda, Auto Scaling groups, and ECS/EKS, you must admit that the level of “pre-built”-ness you need exists on a spectrum.
You wouldn’t use “we’re all unique” as justification for not using Terraform, or for not using Kubernetes, so we need to carefully evaluate if that argument applies for Terraform modules. And that leads us to…
Use (and re-use) building blocks. (photo by Xavi Cabrera on Unsplash).
Maybe you don’t want a single “web-app” module like this one that manages all of your infrastructure for an app for you. For new applications, it might be a good option, but maybe you don’t want your logs going to CloudWatch, as you use something else. Maybe you want to use Kubernetes because you have an existing cluster. Maybe you have an existing CI/CD pipeline and don’t want to use Codepipeline. Maybe you want to use Istio for service registration instead of AWS App Mesh.
But the all-in approach is just one end of the spectrum. The other end is not using modules at all and tediously defining each resource you need, which can require hundreds or thousands of lines of code per application.
At Transcend, we’ve found the happy medium of using modules that aim to handle all resources within a specific AWS service (or sometimes a small handful). This means that we don’t search for full web-app solutions, but for modules related to VPCs, Lambda, IAM, etc.
As an example, here is a usage of the VPC module from terraform-aws-modules:
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "2.62.0"

  name = "demo-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]

  # Have one NAT Gateway per AZ to give private subnets access to the external internet
  enable_nat_gateway     = true
  single_nat_gateway     = false
  one_nat_gateway_per_az = true

  # Support VPC Flow Logs
  enable_flow_log                      = true
  create_flow_log_cloudwatch_log_group = true
  create_flow_log_cloudwatch_iam_role  = true
  flow_log_max_aggregation_interval    = 60
}
If you are familiar with VPCs on AWS, this likely reads easily regardless of whether you’ve seen this module before. It creates a private cloud named demo-vpc with public and private subnets in the eu-west-1 region. The private subnets get external internet access through one NAT gateway per AZ, for high availability. All network traffic will be logged to a new Flow Logs log group.
This short module declaration creates 25 unique resources in the AWS cloud and is highly customizable. Not only is writing this declaration far quicker than writing out all 25 resources manually, it is also far less error-prone.
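Those resources are also easy to build on: other resources can consume the module’s outputs directly. As a sketch (the subnet group resource here is hypothetical, though `private_subnets` is a real output of this VPC module):

```hcl
# Place a database subnet group inside the subnets the module created
resource "aws_db_subnet_group" "app" {
  name       = "app-db-subnets"
  subnet_ids = module.vpc.private_subnets
}
```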
An excellent example of a module that makes use of other open-source modules is the terraform-aws-atlantis module used to create a CI/CD pipeline for your infrastructure code.
It creates a VPC, load balancer, ACM certificate, security groups, and an ECS service all using open-source modules. That likely made writing and maintaining that module much easier than using all custom resources, but the composition of all of those modules is what makes for the biggest time saver of all: You can directly use their pre-built solution if you want to, while also customizing it extensively.
For larger applications like Atlantis, Hashicorp Vault, or Transcend Sombra, the initial setup can be daunting. But the Terraform community made a reusable module for creating Atlantis clusters, Hashicorp paid Gruntwork to create a secure implementation of Vault, and at Transcend we created a module for our Sombra service that can be self-hosted if our customers want complete control over the encryption of their data.
In particular I love the HashiCorp Vault module, as it let us scaffold up a working Vault cluster in a day or two with high confidence in its security. For a server whose main job is to hold our most sensitive secrets, that is a massive confidence boost.
When Vault came out with support for DynamoDB as a storage backend for secrets, we were immediately interested due to the global redundancy and high availability offered by Dynamo. Within a few weeks, the Vault module was updated to support the new backend, and migrating was easy.
If we had written our own Vault module, we would have had to change over the backend configuration ourselves, instead of just changing what flags/variables we passed to the module.
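As a sketch of what that migration looked like from our side, switching backends was roughly a variable change in our module call (the variable names below are illustrative only, not the module’s actual interface):

```hcl
module "vault" {
  source  = "hashicorp/vault/aws"
  version = "0.14.0"

  # Before: the previous storage backend
  # storage_backend = "s3"

  # After: flip to the newly supported DynamoDB backend
  storage_backend     = "dynamodb"
  dynamodb_table_name = "vault-storage"
}
```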
In the above example, you’ll notice that we enabled VPC Flow Logs. At one point, we were having trouble with one of our VPCs’ security groups and weren’t sure why some traffic wasn’t reaching its endpoint.
After researching flow logs and recognizing that they could help us debug, we added…
enable_flow_log = true
create_flow_log_cloudwatch_log_group = true
create_flow_log_cloudwatch_iam_role = true
…to our module, which creates a log group, adds a new IAM role that grants the VPC permission to publish logs, and configures the VPC to write Flow Logs to that log group.
These three lines create five resources in Terraform per VPC, and with dozens of VPCs throughout Transcend, it was wonderful to already have this functionality built into the modules before we needed it. A single PR with a small diff added Flow Logs support to all of our VPCs, all without needing to look up any documentation on specific resources.
The high amount of customization many open-source modules offer makes them great for production-readiness. Instead of searching for one module that does everything, you can compose the smaller modules to quickly scaffold almost any infrastructure you want.
Whether you use open-source modules or homemade, you’ll likely want to add features to the module at some point.
As an example, at Transcend we were using a Cloudposse module to manage our frontend CloudFront distributions. We noticed that the S3 buckets it stored the website asset files in were not encrypted, but as a best-practice at Transcend we encrypt all of our buckets.
We forked the repo, opened a pull request that added support for encrypting the bucket, and within a few days the change was merged. Our experience with submitting PRs has generally been great, with quick turnaround times. Against that module alone, we have created 10 pull requests, all of which were merged fairly quickly.
But sometimes, open-source maintainers might not operate on the same schedule as you do, and you’ll want to use your changes right away.
In this case, pointing your module to a forked repo is very easy:
# TODO - https://github.com/terraform-aws-modules/terraform-aws-rds-aurora/pull/157
# source = "git::git@github.com:terraform-aws-modules/terraform-aws-rds-aurora?ref=v2.29.0"
source = "git::git@github.com:callaingit/terraform-aws-rds-aurora?ref=feature/157"
In this case, the Aurora database module we were using had an open issue that required some discussion on how it should be implemented. An existing open pull request worked for our needs, so we updated our code to point at the PR instead of the official repo.
Once that PR merges, we can very easily switch back to using the official module by uncommenting the line and updating the ref query parameter, which specifies the git tag we want to point at.
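The switch back is just the source line again, pointing at whichever upstream tag includes the merged change (the tag name below is illustrative):

```hcl
source = "git::git@github.com:terraform-aws-modules/terraform-aws-rds-aurora?ref=v2.30.0"
```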
In the VPC example above, notice how it only took one line to declare that we wanted private subnets. If we wanted an even more private subnet for something like a database, it would be as easy as adding database_subnets = ["10.0.21.0/24", "10.0.22.0/24"].
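Here is a sketch of that addition in context, trimmed to the subnet configuration (the database subnet CIDRs are illustrative):

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "2.62.0"

  name = "demo-vpc"
  cidr = "10.0.0.0/16"
  azs  = ["eu-west-1a", "eu-west-1b"]

  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]

  # One new line adds isolated subnets suitable for a database
  database_subnets = ["10.0.21.0/24", "10.0.22.0/24"]
}
```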
By making this so easy, I’ve noticed that the “testing-the-waters” applications at Transcend end up being very close to what we want in production most of the time.
When the difference between a publicly exposed database and a properly hidden one is one line of code, it becomes far more likely that your developers will follow best practices right away, rather than testing everything in the default VPC and adding a TODO to “add security” at some point.
Don’t close your modules for maintenance. (photo by Coby Shimabukuro on Unsplash)
It’s a Monday morning, I don’t have too many meetings this week, and I’m feeling good about wrapping up a project I’ve been working on for a few weeks. And then a peer asks for some quick help updating an old Terraform module that I created, to add support for a new feature. But wait: that module still uses Terraform 0.12, and Terraform 0.13 has some features that would help an awful lot. So I get to work on the upgrade…
All of a sudden it’s Friday. I successfully upgraded a few modules and added some features, but I never got around to the big project I was hoping to finish. I wonder where all my time went.
When Transcend quadrupled in size a few months ago, we went from having a small group of people touching the infrastructure to dozens. Having a majority of our infrastructure modules be open-sourced was a lifesaver — when people had problems, they could take their requests to someone who wasn’t me. And that makes me very happy, as I have time to do things other than maintaining modules all day long!
Maintaining modules isn’t just adding new resources, but should also include updating documentation and automated testing of all changes. Just like application code, infrastructure code needs tests to ensure it works in a variety of situations and that it is secure. Many open-source projects will come with automated tests, or at least examples that are manually tested before each release.
One of the core tenets of DevOps is that you shouldn’t silo work into being the responsibility of specific organizations or individuals. When you choose to create modules, you should think very carefully about whether the module you create will be easily modifiable by others or will add a burden of ownership to a specific group of employees.
If you have a policy that doesn’t allow using any open-source modules, then your developers are limited to only using the pre-existing modules you have (or creating new ones from scratch). That will place additional burdens on you, as those modules will need to work for all of your company’s use cases, or you will constantly need to be creating new modules. At that point, Terraform will feel a lot more like “Dev and Ops” than “DevOps”.
If I’ve convinced you that open-source modules are worth checking out, the next step is to filter out the good repositories.
The first step is to head on over to the module registry. From there, you can search for any keyword of your choice, and click through some of the top results.
Some names to look out for that I’d highly recommend are Cloudposse and Terraform-AWS-Modules, both of which put great effort into having high-quality modules that are easily extensible. They are so non-opinionated that I often end up with a mishmash of those two organizations’ modules glued together, without needing to care that they are not made by the same company.
I would also like to give a shout out to the modules created by Gruntwork.io, as we’ve had great experiences with them, but some of them require a subscription to access (which we have found to be highly worth it).
Open-source modules will save you time, improve your security, and give you greater confidence in your infrastructure. At Transcend, they have saved us hundreds of hours, and we highly recommend them when the use case is correct.
By using open-source modules, it’s very likely that you’ll end up contributing back to them at some point, helping us all build more secure systems.