Cloud Provisioning
This guide covers provisioning cloud infrastructure for ubTrace on AWS using OpenTofu (or Terraform). It is intended for platform engineers who need to set up a production-grade environment with managed databases, caches, search, and compute resources.
Note
This guide covers infrastructure provisioning only. For deploying the ubTrace application itself, see Installation (Docker Compose) or use the Helm chart for Kubernetes deployments.
Overview
The infra/terraform/ directory contains modular OpenTofu/Terraform
configurations that provision all cloud resources ubTrace needs on AWS:
| Module | Resources | Purpose |
|---|---|---|
| | KMS key, CloudTrail, IAM roles, SSM secrets | Encryption, audit logging, identity |
| | VPC, subnets, NAT, security groups | Network foundation (3-tier: public/private/data) |
| | 2× RDS PostgreSQL 16 | App database + Keycloak database |
| | ElastiCache Redis 7 | Session cache with TLS |
| | OpenSearch (Elasticsearch-compatible) | Full-text search and analytics |
| | EKS cluster or EC2 instance | Container orchestration or VM |
| | EFS (shared) or EBS (single-node) | Persistent storage for the data pipeline |
| | ALB, ACM certificate | HTTPS load balancing with path routing |
Two deployment modes are supported:
EKS mode — managed Kubernetes cluster for enterprise deployments. Use the ubTrace Helm chart to deploy the application workloads.
EC2 mode — single VM with Docker Compose for simpler or air-gapped environments. The offline bundle can be loaded directly onto the instance.
Prerequisites
OpenTofu >= 1.5.7 (or Terraform >= 1.5.7)
AWS CLI configured with credentials that have permission to create VPCs, RDS instances, ElastiCache clusters, EKS clusters, and IAM roles
An S3 bucket and DynamoDB table for remote state (see Bootstrap State Backend)
Quick Start
cd infra/terraform
# 1. Bootstrap the remote state backend (one-time)
./scripts/bootstrap-state.sh my-ubtrace-state eu-central-1
# 2. Initialize
tofu init \
  -backend-config="bucket=my-ubtrace-state" \
  -backend-config="key=staging/terraform.tfstate" \
  -backend-config="region=eu-central-1" \
  -backend-config="dynamodb_table=ubtrace-terraform-locks" \
  -backend-config="encrypt=true"
# 3. Review changes
tofu plan -var-file=environments/staging.tfvars
# 4. Apply
tofu apply -var-file=environments/staging.tfvars
Important
Always review the plan output before applying. Infrastructure changes can be destructive — especially for stateful resources like databases and search indices.
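One way to make the review step concrete is to write the plan to a file and later apply exactly that file, so nothing can change between review and apply. The sketch below is only an illustration: the function names `plan_status` and `plan_for_review` and the default plan file name `tfplan` are assumptions, not part of the repository. It relies on the `-detailed-exitcode` flag, which makes `plan` exit with 0 (no changes), 1 (error), or 2 (changes present).

```shell
# Map the exit code of `tofu plan -detailed-exitcode` to a label.
plan_status() {
  case "$1" in
    0) echo "no-changes" ;;
    2) echo "changes" ;;
    *) echo "error" ;;
  esac
}

# Write a plan file and print the next steps for the operator.
# $1 = var file, $2 = plan file (hypothetical default: tfplan)
plan_for_review() {
  var_file="$1"
  plan_file="${2:-tfplan}"
  rc=0
  # The plan text goes to stderr so that stdout carries only the guidance below.
  tofu plan -detailed-exitcode -var-file="$var_file" -out="$plan_file" >&2 || rc=$?
  case "$(plan_status "$rc")" in
    changes)
      echo "Review with: tofu show $plan_file"
      echo "Apply with:  tofu apply $plan_file"
      ;;
    no-changes)
      echo "Nothing to apply."
      ;;
    *)
      echo "Plan failed (exit code $rc)."
      return 1
      ;;
  esac
}
```

For example, `plan_for_review environments/staging.tfvars` leaves a `tfplan` file behind; `tofu apply tfplan` then applies the reviewed plan without planning again.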
Bootstrap State Backend
Before the first tofu init, create the S3 bucket and DynamoDB table that
store Terraform state and provide locking:
./scripts/bootstrap-state.sh [bucket-name] [region] [dynamodb-table]
Defaults: bucket ubtrace-terraform-state, region eu-central-1,
table ubtrace-terraform-locks.
The script creates:
An S3 bucket with versioning, KMS encryption, and public access blocked
A DynamoDB table (PAY_PER_REQUEST) for state locking
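Because S3 bucket names are globally unique and follow strict naming rules, it can help to check the name before running the bootstrap script. The following is a minimal sketch, not part of the repository: `valid_bucket_name` and `bootstrap_args` are hypothetical helpers, and the check covers only a subset of the S3 rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; alphanumeric first and last character; no adjacent dots). The defaults mirror the ones documented above.

```shell
# Succeed when $1 passes a subset of the S3 bucket naming rules.
valid_bucket_name() {
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$' || return 1
  case "$1" in
    *..*) return 1 ;;   # adjacent dots are not allowed
  esac
  return 0
}

# Print "bucket region table", falling back to the documented defaults.
bootstrap_args() {
  bucket="${1:-ubtrace-terraform-state}"
  region="${2:-eu-central-1}"
  table="${3:-ubtrace-terraform-locks}"
  if ! valid_bucket_name "$bucket"; then
    echo "invalid bucket name: $bucket" >&2
    return 1
  fi
  echo "$bucket $region $table"
}
```

A possible use is `args="$(bootstrap_args my-ubtrace-state)" && ./scripts/bootstrap-state.sh $args`, where `$args` is left unquoted on purpose so that it splits into three arguments.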
Deployment Modes
EKS (Kubernetes)
Set deployment_mode = "eks" in your tfvars file. This provisions:
A managed EKS cluster with a node group
EBS CSI driver and VPC CNI addons
EFS file system with access points for the pipeline folders
Kubernetes Secrets encrypted with a customer-managed KMS key
After tofu apply, configure kubectl and deploy the Helm chart:
# Configure kubectl (runs the command printed by the kubeconfig_command output)
eval "$(tofu output -raw kubeconfig_command)"
# Deploy with Helm (example); set ENV to your environment name first, e.g. ENV=staging
helm install ubtrace deploy/helm/ubtrace \
  --values deploy/helm/ubtrace/values-eks.yaml \
  --set postgresql.external.host="$(tofu output -raw app_db_endpoint | cut -d: -f1)" \
  --set postgresql.external.password="$(aws ssm get-parameter --name "/ubtrace/${ENV}/db/app/password" --with-decryption --query Parameter.Value --output text)" \
  --set redis.external.host="$(tofu output -raw redis_endpoint)" \
  --set elasticsearch.external.host="$(tofu output -raw opensearch_endpoint)"
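The Helm example strips the port from the database endpoint with `cut`. If both parts of a `host:port` endpoint are needed, a small helper based on shell parameter expansion can split them. This is a sketch only; `endpoint_host` and `endpoint_port` are hypothetical names, and the `host:port` shape is inferred from the `cut -d: -f1` call above.

```shell
# Host part of "host:port" (everything before the last colon).
# An endpoint without a colon is returned unchanged.
endpoint_host() {
  printf '%s\n' "${1%:*}"
}

# Port part of "host:port", or an empty line when there is no colon.
endpoint_port() {
  case "$1" in
    *:*) printf '%s\n' "${1##*:}" ;;
    *)   printf '\n' ;;
  esac
}
```

For instance, `endpoint_host "$(tofu output -raw app_db_endpoint)"` yields the value passed to `postgresql.external.host`.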
EC2 (Docker Compose)
Set deployment_mode = "ec2" in your tfvars file. This provisions:
An EC2 instance (Ubuntu 24.04) with an IAM instance profile
An ALB with path-based routing to API, frontend, and Keycloak
EBS volumes for the pipeline folders
Target group attachments for the EC2 instance
After tofu apply, connect to the instance and deploy:
# Connect via SSM (no SSH required)
aws ssm start-session --target "$(tofu output -raw ec2_instance_id)"
# On the instance: load the offline bundle and start services
tar xzf ubtrace-offline-bundle-*.tar.gz
cd ubtrace-offline-bundle-*
./offline-load.sh ubtrace-images-*.tar.gz
make init
# Edit .env with the RDS/Redis/OpenSearch endpoints from Terraform outputs
make up
Environment Sizing
Three pre-configured environments are provided:
| Environment | Database | Redis | Search | HA | Est. Cost |
|---|---|---|---|---|---|
| | db.t3.micro | cache.t3.micro | t3.small.search | No | ~$100/mo |
| | db.t3.medium | cache.t3.small | t3.medium.search | No | ~$300/mo |
| | db.r6g.large | cache.r6g.large | r6g.large.search | Yes (Multi-AZ) | ~$1,500/mo |
Create a custom tfvars file for your environment:
cp environments/staging.tfvars environments/myenv.tfvars
# Edit to match your requirements
tofu plan -var-file=environments/myenv.tfvars
Configuration Reference
General
| Variable | Type | Default | Description |
|---|---|---|---|
| | string | | Prefix for all resource names |
| | string | (required) | Environment name: |
| | string | | AWS region for all resources |
| | string | | |
Networking
| Variable | Type | Default | Description |
|---|---|---|---|
| | string | | CIDR block for the VPC |
| | list | | Availability zones (minimum 2) |
| | bool | | NAT gateway for private subnets. Set |
| | string | | Domain for ALB and ACM certificate |
| | string | | Existing ACM certificate ARN (skips creation) |
Security & Compliance
| Variable | Type | Default | Description |
|---|---|---|---|
| | bool | | CloudTrail API audit logging |
| | bool | | VPC network flow logs |
| | string | | EC2 key pair name (EC2 mode only) |
License
| Variable | Type | Default | Description |
|---|---|---|---|
| | string | | ubTrace license key (stored in SSM as SecureString) |
| | string | | Cryptolens product ID |
| | string | | Cryptolens access token (stored in SSM as SecureString) |
Note
License variables are optional. When provided, they are stored as SSM
Parameter Store entries under /ubtrace/<environment>/license/. For
offline licensing with .skm files, copy the file to the instance and set
UBTRACE_LICENSE_FILE in the application environment instead.
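To confirm which license entries were actually written, the parameters under the license prefix can be listed by path. The sketch below is an assumption-laden illustration: `license_param_path` and `list_license_params` are hypothetical helpers, and only parameter names (not values) are requested, so no decryption is involved.

```shell
# Build the SSM path prefix for license parameters of one environment.
license_param_path() {
  env_name="$1"
  if [ -z "$env_name" ]; then
    echo "usage: license_param_path <environment>" >&2
    return 1
  fi
  printf '/ubtrace/%s/license/\n' "$env_name"
}

# List the names of all parameters stored under that prefix.
list_license_params() {
  path="$(license_param_path "$1")" || return 1
  aws ssm get-parameters-by-path --path "$path" --recursive \
    --query 'Parameters[].Name' --output text
}
```

Calling `list_license_params staging` prints the parameter names found under `/ubtrace/staging/license/`, or nothing if no license variables were set.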
Security
The infrastructure is provisioned with TISAX and ISO 27001 compliance in mind:
Encryption at rest — All data stores (RDS, ElastiCache, OpenSearch, EFS, EBS, S3) are encrypted with a customer-managed KMS key
Encryption in transit — TLS enforced on Redis, OpenSearch, and the ALB
Audit logging — CloudTrail records all API calls; VPC flow logs capture network traffic; ALB access logs record HTTP requests
Least privilege — IAM roles follow the principle of least privilege; EC2 instances use SSM instead of SSH (no open ports)
Secret management — Database passwords, Redis auth tokens, and license credentials are stored in SSM Parameter Store (KMS-encrypted)
Network isolation — Three-tier subnet design: public (ALB only), private (compute), data (databases). Security groups restrict traffic between tiers
EKS hardening — Kubernetes Secrets encrypted with customer KMS key; cluster manages its own security group; API audit logging enabled
Data protection — Stateful resources (RDS, OpenSearch, EFS) have
prevent_destroy lifecycle guards to prevent accidental deletion
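The lifecycle guards cover only the stateful resources listed above; a plan can still replace or destroy other resources. A quick scan of the plan text can surface those before anyone applies. The helper below is a sketch under the assumption that the plan is rendered with `-no-color` and uses the usual `# <address> must be replaced` and `# <address> will be destroyed` header lines; `destructive_changes` is a hypothetical name.

```shell
# Read plan text on stdin and print the header lines of resources that
# would be replaced or destroyed. Exits non-zero when none are found.
destructive_changes() {
  grep -E '^[[:space:]]*# .* (must be replaced|will be destroyed)'
}
```

A possible invocation is `tofu plan -no-color -var-file=environments/production.tfvars | destructive_changes`, which prints nothing when the plan contains only creations and in-place updates.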
Outputs
After tofu apply, retrieve connection details for application configuration:
# Database endpoints
tofu output app_db_endpoint
tofu output keycloak_db_endpoint
# Cache
tofu output redis_endpoint
tofu output redis_port
# Search
tofu output opensearch_endpoint
# Compute (EKS)
tofu output kubeconfig_command
tofu output eks_cluster_name
# Compute (EC2)
tofu output ec2_instance_id
tofu output ec2_private_ip
# Load balancer
tofu output alb_dns_name
# Sensitive values (retrieve from SSM Parameter Store)
aws ssm get-parameter --name "/ubtrace/<env>/db/app/password" --with-decryption --query Parameter.Value --output text
aws ssm get-parameter --name "/ubtrace/<env>/db/keycloak/password" --with-decryption --query Parameter.Value --output text
aws ssm get-parameter --name "/ubtrace/<env>/redis/auth_token" --with-decryption --query Parameter.Value --output text
# License SSM parameter ARNs
tofu output license_ssm_arns
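In EC2 mode the instance itself has no access to the state, so the endpoints usually have to be collected on the workstation and carried over when editing `.env`. A minimal sketch of that collection step follows; `print_outputs` is a hypothetical helper, and the `name=value` format is just a convenient scratch format, not the application's `.env` syntax.

```shell
# Print one "name=value" line per requested output.
# Stops with an error as soon as an output cannot be read.
print_outputs() {
  for name in "$@"; do
    value="$(tofu output -raw "$name")" || {
      echo "missing output: $name" >&2
      return 1
    }
    printf '%s=%s\n' "$name" "$value"
  done
}
```

For example, `print_outputs app_db_endpoint keycloak_db_endpoint redis_endpoint redis_port opensearch_endpoint > endpoints.txt` captures the non-sensitive connection details in one file. Secrets should still be read from SSM Parameter Store rather than written to disk.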
Day-2 Operations
Rotate Database Passwords
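# -replace plans and applies the recreation of both passwords in one step,
# so the replacement is visible in the plan before you confirm
tofu apply -var-file=environments/production.tfvars \
  -replace='module.security.random_password.db_app' \
  -replace='module.security.random_password.db_keycloak'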
Warning
After rotating passwords, restart the ubTrace application to pick up the new credentials from SSM Parameter Store.
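Before restarting the application, it can be useful to confirm that the parameter really changed. SSM Parameter Store increments a parameter's version each time its value is overwritten, so comparing the version before and after the apply is one possible check. The sketch assumes the parameter is updated in place rather than deleted and recreated (a recreated parameter starts again at version 1); `param_version` and `rotated` are hypothetical helpers.

```shell
# Current version number of an SSM parameter.
param_version() {
  aws ssm get-parameter --name "$1" --query Parameter.Version --output text
}

# Succeed when the version after the apply ($2) is higher than before ($1).
rotated() {
  [ "$2" -gt "$1" ]
}
```

A typical sequence would be to store `before="$(param_version /ubtrace/production/db/app/password)"`, run the rotation, read the version again into `after`, and restart the application only if `rotated "$before" "$after"` succeeds.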
Scale EKS Nodes
tofu apply -var-file=environments/production.tfvars -var="eks_node_count=5"
Upgrade Kubernetes Version
tofu apply -var-file=environments/production.tfvars -var="eks_version=1.32"
Important
Upgrade one minor version at a time. Review the EKS version calendar and test in staging first.
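The one-minor-version rule is easy to check mechanically before passing a new `eks_version`. The following sketch assumes versions are written as `major.minor` (for example `1.31`); the helper names are hypothetical.

```shell
# Major and minor components of a "major.minor" version string.
major_of() { printf '%s\n' "${1%%.*}"; }
minor_of() { printf '%s\n' "${1#*.}"; }

# Succeed only when $2 is exactly one minor version above $1.
is_single_minor_step() {
  [ "$(major_of "$1")" = "$(major_of "$2")" ] || return 1
  [ "$(( $(minor_of "$1") + 1 ))" -eq "$(minor_of "$2")" ]
}
```

The current version can be read with `aws eks describe-cluster --name "$(tofu output -raw eks_cluster_name)" --query cluster.version --output text` and compared against the target, for example `is_single_minor_step "$current" 1.32`, before running the apply.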
Troubleshooting
- tofu init fails with “bucket does not exist”
  Run ./scripts/bootstrap-state.sh to create the state backend first.
- tofu plan shows replacement of a database
  This typically means an engine version or parameter change that requires recreation. The prevent_destroy lifecycle guard will block the apply. Review the change carefully, take a manual snapshot, then remove the guard temporarily if the replacement is intentional.
- EKS nodes fail to join the cluster
  Check that the node IAM role has the required policies (AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, AmazonEC2ContainerRegistryReadOnly). These are provisioned automatically by the compute module.
- OpenSearch returns 403 errors
  The OpenSearch domain is VPC-internal with a permissive resource policy for HTTP operations. Ensure the application is running in the private subnets and the app security group allows outbound traffic on port 443.
- ALB returns 503 for all requests (EC2 mode)
  Check that the target group health checks are passing. The API health check endpoint is /api/v1/health/liveness. Verify the application is running and listening on the expected ports (API: 3000, Frontend: 3000, Keycloak: 8080).
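For the 503 case, probing the liveness endpoint directly on the instance separates application problems from load balancer problems. The sketch below assumes the API answers plain HTTP on the instance and that a healthy response is HTTP 200; both are assumptions, and `api_is_live` is a hypothetical helper.

```shell
# Succeed when the API liveness endpoint answers with HTTP 200.
# $1 = host (default: localhost), $2 = port (default: 3000, the API port)
api_is_live() {
  host="${1:-localhost}"
  port="${2:-3000}"
  # curl prints 000 as the status code when the connection fails.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    "http://${host}:${port}/api/v1/health/liveness")" || true
  [ "$code" = "200" ]
}
```

Run from an SSM session, `api_is_live && echo "API is up"` shows whether the application responds locally; if it does while the ALB still returns 503, the target group and security group configuration are the next places to look.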