I have a large AWS deployment that comprises three main "Terraform projects" each with multiple modules and a ton of variables.
The three projects build on top of one another, and we rarely use them independently. I would like to improve their modularity and encapsulation so that we can wire the components together in more ad-hoc ways to meet different customers' requirements.
An example might be wiring a customer's data lake solution into the system instead of instantiating a Redshift cluster in AWS.
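To make this concrete, here is a sketch of the kind of wiring I have in mind (all names, paths, and outputs here are hypothetical, just to illustrate the idea of swapping a managed component for a customer-supplied one):

```hcl
# Illustrative: choose between our managed Redshift cluster and a
# customer-supplied data lake endpoint with a single variable.
variable "use_customer_data_lake" {
  type    = bool
  default = false
}

variable "customer_data_lake_endpoint" {
  type    = string
  default = ""
}

module "redshift" {
  count  = var.use_customer_data_lake ? 0 : 1
  source = "./modules/redshift" # hypothetical module path
}

locals {
  # one() safely collapses the zero-or-one module instance list.
  warehouse_endpoint = var.use_customer_data_lake ? var.customer_data_lake_endpoint : one(module.redshift[*].endpoint)
}
```

Today our variables and module boundaries don't make this kind of substitution easy, which is why I'm asking about general approaches.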
Do you have any general guidelines and approaches that you have found work well in the field?
It's interesting that you mention this, because I am dealing with a very similar situation at my current employer. We have a massive stack that deploys hundreds of resources with many dependencies between them. It's a nightmare to deploy, and since the codebase has been evolving over the years, it could really do with some refactoring. Of course, any refactoring must be done with production deployments in mind -- we cannot have downtime when changes are made to the infrastructure. Finally, some parts of the code are more static than others: the base infrastructure might change infrequently, but some of the application-centric infrastructure changes frequently.
The solution that we came up with is breaking up the monolithic Terraform repo into six or seven smaller, more manageable modules stored in separate repos. We made the division based on cohesiveness and functionality: for example, we have one module for EC2 instances, one for databases, one for security groups, and so forth. Code that changes frequently tends to be isolated from code that changes infrequently. Then we have a CI/CD pipeline of sorts that deploys each of these modules and passes required outputs around when needed. Even though the code is part of the same project, it is important for us to be able to deploy the modules independently of each other; otherwise, generating an execution plan would involve unnecessary reads and take too long. So each module is deployed using a minimal deployment harness that contains only the following code:
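(I can't paste our internal code verbatim, but a minimal sketch of what such a harness contains looks like this; the bucket, region, module source, and output names are illustrative, and I'm assuming an S3 remote state backend:)

```hcl
# Hypothetical deployment harness for one module: backend config,
# provider, a single module call, and pass-through outputs.

terraform {
  backend "s3" {
    bucket = "my-terraform-state" # illustrative bucket name
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = var.region
}

module "networking" {
  # pin a released version of the module repo
  source   = "git::https://example.com/terraform-networking.git?ref=v1.2.0"
  vpc_cidr = var.vpc_cidr
}

# Re-export the module's outputs so downstream deployments can read them.
output "vpc_id" {
  value = module.networking.vpc_id
}
```

The harness itself carries no resource definitions, so each repo stays small and its plan only touches that module's resources.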
What we essentially did at my company is take this pattern one step further. Instead of having a root module that references the other modules, we simply deploy those modules as independent workspaces. Attached is a picture of the process. We first deploy the module with no dependencies, then the one with the next-fewest dependencies, and so on until all modules are deployed. Since these modules are actually independent workspaces, each requires its own terraform.tfvars and produces its own outputs. The outputs get saved to a global data store and read by the next module. You could also do the same thing with the terraform_remote_state data source: https://www.terraform.io/docs/language/state/remote-state-data.html. If you are using Terraform Cloud, you can use a run trigger to automatically queue the next workspace. This also allows the workspaces to be run independently of each other, which makes generating Terraform execution plans much faster since there are fewer resources to read.
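For instance, a downstream workspace can consume an upstream workspace's outputs with terraform_remote_state like this (bucket, key, module source, and output names are illustrative, again assuming an S3 backend):

```hcl
# Hypothetical consumer workspace reading outputs published by the
# "networking" workspace's state file.
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

module "database" {
  source     = "git::https://example.com/terraform-database.git?ref=v0.9.0"
  # Wire in the upstream workspace's outputs instead of hard-coding IDs.
  vpc_id     = data.terraform_remote_state.networking.outputs.vpc_id
  subnet_ids = data.terraform_remote_state.networking.outputs.private_subnet_ids
}
```

The nice property is that the coupling between workspaces is reduced to a small, explicit set of published outputs rather than shared state.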