Operations Guide
EKS Cluster Access
# Configure kubectl
aws eks update-kubeconfig --name ubtrace-<env> --region <region>
# Verify access
kubectl get nodes
kubectl get pods -l app.kubernetes.io/instance=ubtrace
Helm Upgrades
# Update values if needed
vim values-customer.yaml
# Upgrade
helm upgrade ubtrace deploy/helm/ubtrace \
  -f deploy/helm/ubtrace/values-eks.yaml \
  -f values-customer.yaml \
  --wait --timeout 10m
# Roll back to the previous revision if needed
helm rollback ubtrace
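If several upgrades have gone out since the last good state, rolling back one revision may not be enough. The following is a minimal sketch, assuming the release is named ubtrace as above; rollback_to is a hypothetical helper name, not something shipped with the chart. It checks that the revision is a positive integer before calling helm.

```shell
# Hypothetical helper: roll back the ubtrace release to a specific revision.
# List the available revisions first with: helm history ubtrace
rollback_to() {
  local rev="$1"
  # Reject empty, non-numeric, or zero-prefixed input before touching the cluster.
  case "$rev" in
    ''|*[!0-9]*|0*) echo "usage: rollback_to <revision>" >&2; return 2 ;;
  esac
  helm rollback ubtrace "$rev" --wait --timeout 10m
}
```

Run helm history ubtrace, pick the revision whose status and description match the last known-good deployment, then call rollback_to with that number.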
Scaling
Horizontal (replicas):
helm upgrade ubtrace deploy/helm/ubtrace \
  --reuse-values --set api.replicaCount=3
Vertical (node size):
Update eks_instance_type in the environment's tfvars file, then run tofu apply -var-file=environments/<env>.tfvars from infra/terraform.
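The vertical-scaling step can be scripted roughly as follows. This is a sketch under assumptions: set_instance_type is a hypothetical helper, the variable is assumed to sit on a single line of the form eks_instance_type = "<type>", and the instance type in the example is illustrative only.

```shell
# Hypothetical helper: set eks_instance_type in a tfvars file.
# Assumes a single-line assignment: eks_instance_type = "<type>"
# The type must not contain '|' or '&' (they are special to sed here).
set_instance_type() {
  local file="$1" type="$2" tmp rc
  grep -q '^[[:space:]]*eks_instance_type[[:space:]]*=' "$file" || {
    echo "eks_instance_type not found in $file" >&2; return 1; }
  tmp="$(mktemp)" || return 1
  # Rewrite only the value; write back in place to keep file permissions.
  sed "s|^\([[:space:]]*eks_instance_type[[:space:]]*=\).*|\1 \"$type\"|" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example (not run here):
#   cd infra/terraform
#   set_instance_type environments/<env>.tfvars m5.xlarge
#   tofu plan  -var-file=environments/<env>.tfvars   # review the change first
#   tofu apply -var-file=environments/<env>.tfvars
```

Reviewing the tofu plan output before applying shows whether the change replaces nodes, which matters for workloads that cannot tolerate rescheduling.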
Backup & Restore
RDS (automatic):
RDS automated backups are configured via db_backup_retention (default: 7 days).
# Manual snapshot
aws rds create-db-snapshot \
  --db-instance-identifier ubtrace-<env>-app \
  --db-snapshot-identifier manual-$(date +%Y%m%d)
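Manual snapshot identifiers must be unique within an account and region, so a date-only suffix fails on the second snapshot taken the same day. A minimal sketch of one way around that, assuming a second-resolution UTC timestamp is acceptable; snapshot_id is a hypothetical helper name.

```shell
# Hypothetical helper: build a snapshot identifier unique to the second (UTC).
snapshot_id() {
  echo "manual-$(date -u +%Y%m%d-%H%M%S)"
}

# Example (not run here; <env> is the placeholder used above):
#   id="$(snapshot_id)"
#   aws rds create-db-snapshot \
#     --db-instance-identifier ubtrace-<env>-app \
#     --db-snapshot-identifier "$id"
#   # Block until the snapshot is usable, e.g. before a risky migration.
#   aws rds wait db-snapshot-available --db-snapshot-identifier "$id"
```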
EFS:
# Start an on-demand AWS Backup job for EFS
aws backup start-backup-job \
  --backup-vault-name <backup-vault> \
  --resource-arn <efs-arn> \
  --iam-role-arn <backup-role>
Teardown
Warning
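Production resources have prevent_destroy enabled, so tofu destroy fails until
that setting is removed. Remove the lifecycle block (or set prevent_destroy to
false) in the Terraform configuration before destroying.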
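To see which resources are protected before editing anything, the configuration can be searched for the setting. A small sketch, assuming the Terraform code lives under infra/terraform as in the steps below; find_prevent_destroy is a hypothetical helper name.

```shell
# Hypothetical helper: list every prevent_destroy setting in *.tf files
# under a directory (defaults to infra/terraform). Prints file:line:content
# and returns non-zero when nothing is found.
find_prevent_destroy() {
  grep -rn --include='*.tf' 'prevent_destroy' "${1:-infra/terraform}"
}
```

Each hit points at a lifecycle block that has to be changed before the destroy step can succeed.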
# 1. Remove Helm release
helm uninstall ubtrace
# 2. Delete PVCs (if no longer needed)
kubectl delete pvc -l app.kubernetes.io/instance=ubtrace
# 3. Destroy infrastructure
cd infra/terraform
tofu destroy -var-file=environments/<env>.tfvars
Log Access
EKS cluster logs are sent to CloudWatch. Pod logs can be read directly with kubectl:
# View pod logs
kubectl logs -l app.kubernetes.io/component=api --tail=100 -f
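# View recent logs from all ubTrace containers
# (with -l, kubectl shows only the last 10 lines per container unless --tail is set)
kubectl logs -l app.kubernetes.io/instance=ubtrace --all-containers --prefix --tail=100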
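The CloudWatch side can be read from the command line as well. This sketch assumes that the cluster logs mentioned above are the EKS control plane logs, which EKS writes to the log group /aws/eks/<cluster-name>/cluster when control plane logging is enabled, and that the cluster is named ubtrace-<env> as in the kubeconfig step. eks_log_group is a hypothetical helper name, and aws logs tail requires AWS CLI v2.

```shell
# Hypothetical helper: CloudWatch log group for the cluster's control plane logs.
# Assumes the cluster name follows the ubtrace-<env> pattern.
eks_log_group() {
  echo "/aws/eks/ubtrace-$1/cluster"
}

# Example (not run here):
#   aws logs tail "$(eks_log_group <env>)" --since 1h --follow
```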