GitOps has graduated from conference talk concept to production standard. The core idea — Git as the single source of truth for both infrastructure and application state, with automated reconciliation to make reality match the declared state — is sound. The implementation details determine whether it works or creates a new category of operational complexity.
The pattern I’ll describe is the one we’ve refined across multiple production environments. It’s opinionated, and deliberately so — the most common GitOps failure mode is the flexibility trap, where teams try to accommodate every edge case and end up with something that’s harder to reason about than what they started with.
The Ownership Boundary: Terraform and ArgoCD
The most important design decision in a GitOps setup is the boundary between Terraform and ArgoCD. Get this wrong and the same resources end up managed in both places, producing conflicts, drift, and debugging sessions where it is unclear which tool last changed a resource.
The principle: Terraform owns infrastructure. ArgoCD owns applications.
Terraform manages everything outside Kubernetes: cloud provider resources (VPCs, subnets, IAM, RDS, S3), DNS, and the Kubernetes cluster itself. Inside the cluster, Terraform provisions only the namespaces and cluster-level resources that application teams deploy into, never the application workloads themselves.
ArgoCD owns everything in Kubernetes that represents application state: Deployments, Services, ConfigMaps, HorizontalPodAutoscalers, and the application-specific resources that get created as part of application lifecycle management. ArgoCD is also where the GitOps sync loop lives — it watches the application manifests in Git and continuously reconciles the cluster state toward the declared state.
The explicit handoff point: Terraform creates the cluster and provisions the ArgoCD installation. From that point, ArgoCD manages itself through a self-referential Application definition (ArgoCD managing ArgoCD). This gives you GitOps for the GitOps tooling.
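A minimal sketch of that self-referential Application, assuming the ArgoCD install manifests are kept as a directory in Git. The repository URL, the argocd/install path, and the default project are hypothetical names chosen for illustration; they are not part of the layout described in this article.

```yaml
# Hypothetical self-managing Application: ArgoCD syncs its own install manifests.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/gitops   # hypothetical repository
    targetRevision: main
    path: argocd/install                          # hypothetical path to the install manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: false     # conservative assumption: do not let ArgoCD delete parts of itself
      selfHeal: true
```

With something like this in place, Terraform performs the first install once, and later ArgoCD upgrades arrive as ordinary Git changes.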
Repository Structure
The repository structure determines how well the system scales as you add applications and environments. The structure that works:
infrastructure/                  # Terraform root
  modules/                       # Reusable Terraform modules
    eks-cluster/
    rds-postgres/
    s3-bucket/
  environments/
    production/
      main.tf                    # Refs modules, env-specific vars
      variables.tf
      terraform.tfvars
    staging/
      main.tf
apps/                            # ArgoCD manifests
  base/                          # Kustomize base configurations
    api-service/
      deployment.yaml
      service.yaml
      kustomization.yaml
  overlays/                      # Environment-specific overlays
    production/
      api-service/
        kustomization.yaml       # Patches base with prod settings
    staging/
      api-service/
        kustomization.yaml
argocd/                          # ArgoCD Application resources
  applications/
    production/
      # ArgoCD App -> apps/overlays/production/api-service
      api-service.yaml
    staging/
      api-service.yaml
Kustomize for Kubernetes manifests, Terraform modules for infrastructure — this is the combination that handles environment promotion cleanly. The base application configuration lives once; environment-specific patches override what needs to differ.
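To make the base-plus-patch idea concrete, here is a sketch of what the production overlay's kustomization.yaml might contain. The Deployment and container name (api-service), the image name, the tag, the replica count, and the resource figures are all assumptions for illustration; the base manifests themselves are not shown in this article.

```yaml
# apps/overlays/production/api-service/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: api-service
resources:
  - ../../../base/api-service     # the shared base, relative to this overlay
replicas:
  - name: api-service             # assumed Deployment name in the base
    count: 3                      # assumed production replica count
images:
  - name: your-org/api-service    # assumed image name used in the base
    newTag: "1.4.2"               # hypothetical release tag
patches:
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: api-service
      spec:
        template:
          spec:
            containers:
              - name: api-service
                resources:
                  requests:
                    cpu: 500m
                    memory: 512Mi
```

Under this layout, promoting a release from staging to production is a one-line change to newTag in the production overlay, which keeps the pull request diff small and reviewable.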
The ArgoCD Application Definition Pattern
Each application in ArgoCD is represented by an Application resource. A well-structured Application definition:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-service-production
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: production
  source:
    repoURL: https://github.com/your-org/apps
    targetRevision: main
    path: overlays/production/api-service
  destination:
    server: https://kubernetes.default.svc
    namespace: api-service
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=false
      - PrunePropagationPolicy=foreground
Two settings are worth noting.
prune: true means resources removed from Git are deleted from the cluster. This is GitOps working correctly, but it surprises teams that are used to creating or keeping resources by hand: anything that should persist must be declared in Git.
selfHeal: true means ArgoCD corrects drift automatically. If someone manually edits a managed resource in the cluster, ArgoCD reverts the edit. This is the behavior you want, but make sure the team understands it before it surprises them during a production incident, when a manual hotfix quietly disappears.
Secrets Management: The GitOps Complication
The one place GitOps gets complicated is secrets. You cannot commit plaintext secrets to Git. The two patterns that work in production:
External Secrets Operator (ESO) syncs secrets from external sources (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager) into Kubernetes Secrets. You commit an ExternalSecret resource to Git — which references the secret by name but doesn’t contain the value — and ESO handles retrieval and sync. This keeps the GitOps model intact (everything is in Git) while secrets values live in a managed secret store.
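Sealed Secrets encrypts secrets with a cluster-specific public key. You commit the encrypted SealedSecret resource to Git; only the cluster can decrypt it. It is simpler to operate than ESO, but fully rotating the sealing key (for example, after a suspected compromise) requires re-encrypting all sealed secrets, and the private key must be backed up if you ever need to restore into a new cluster. It works well for smaller environments.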
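Avoid kubectl create secret for anything beyond initial bootstrap. Imperatively created secrets exist outside Git, so they are not reproduced when a cluster is rebuilt, and they can be overwritten or pruned once a Git-managed resource with the same name comes under ArgoCD's control.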
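For the External Secrets Operator pattern, the resource committed to Git might look like the sketch below. The store name, the remote secret path, and the key names are hypothetical, and the apiVersion depends on the ESO release you run (external-secrets.io/v1beta1 is shown here; newer releases also serve external-secrets.io/v1).

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-service-db
  namespace: api-service
spec:
  refreshInterval: 1h                      # how often ESO re-reads the external value
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager              # hypothetical store configured separately
  target:
    name: api-service-db                   # Kubernetes Secret that ESO creates and owns
  data:
    - secretKey: DATABASE_PASSWORD         # key inside the Kubernetes Secret
      remoteRef:
        key: production/api-service/db     # hypothetical secret name in AWS Secrets Manager
        property: password                 # assumed JSON field within that secret
```

Nothing in this manifest is sensitive: it names where the value lives, not the value itself, so it can go through normal pull request review.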
Terraform State and the CI/CD Integration
Terraform state needs to live somewhere that every CI/CD run can reach and that is durable. S3 with DynamoDB locking is the standard for AWS-based infrastructure:
terraform {
  backend "s3" {
    bucket         = "your-org-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
The CI/CD integration for Terraform: pull requests trigger terraform plan, the plan output is posted as a PR comment, and merges to main trigger terraform apply. The plan-before-apply model prevents surprises and creates a review checkpoint.
At Dell, we managed CI/CD platforms serving hundreds of engineers with this model. The plan-as-PR-comment pattern is particularly valuable at scale — reviewers can see exactly what infrastructure changes are proposed before code merges.
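As one possible shape for the plan-on-PR, apply-on-merge flow, here is a sketch written as a GitHub Actions workflow. The choice of GitHub Actions is an assumption (the article does not name a CI system), cloud credential setup is omitted, and the workflow file name is hypothetical. Note that this simplified version re-plans at apply time rather than applying the exact plan that was reviewed, and that very large plans can exceed the size limit for a PR comment.

```yaml
# .github/workflows/terraform.yml (illustrative; cloud credentials omitted)
name: terraform-production
on:
  pull_request:
    paths: ["infrastructure/**"]
  push:
    branches: [main]
    paths: ["infrastructure/**"]

permissions:
  contents: read
  pull-requests: write        # needed to post the plan as a PR comment

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infrastructure/environments/production
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false
      - run: terraform init -input=false

      # Pull requests: plan only, then post the result for reviewers.
      - name: Plan and comment
        if: github.event_name == 'pull_request'
        env:
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          terraform plan -input=false -no-color -out=tfplan
          terraform show -no-color tfplan > plan.txt
          gh pr comment "$PR_NUMBER" --body-file plan.txt

      # Merges to main: apply the change.
      - name: Apply
        if: github.event_name == 'push'
        run: terraform apply -input=false -auto-approve
```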
Drift Detection and Reconciliation
The value of GitOps is that drift is automatically corrected. In practice, drift also needs to be visible, not just silently corrected.
ArgoCD’s UI shows sync status for every application. Out-of-sync applications appear immediately. Set up alerting on out-of-sync status: a Prometheus alert that fires when any application has been out of sync for more than 10 minutes is the right starting point. Route this to your on-call channel.
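A sketch of such an alert as a Prometheus rule, assuming ArgoCD's metrics endpoint is already being scraped. It relies on the argocd_app_info metric and its sync_status label; the alert name, group name, and severity label are illustrative choices, and routing to the on-call channel is left to your Alertmanager configuration.

```yaml
groups:
  - name: argocd-sync
    rules:
      - alert: ArgoCDAppOutOfSync
        # argocd_app_info is a gauge with one series per application and state
        expr: argocd_app_info{sync_status="OutOfSync"} == 1
        for: 10m                      # the 10-minute threshold suggested above
        labels:
          severity: warning           # assumed label used for routing
        annotations:
          summary: "ArgoCD application {{ $labels.name }} has been out of sync for more than 10 minutes"
```

The for: clause matters here: a brief OutOfSync state during a normal rollout is expected, so only a sustained one should page anyone.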
For Terraform-managed infrastructure, drift detection requires running terraform plan and checking whether it reports any changes. The -detailed-exitcode flag makes this scriptable: the command exits with 0 when there are no changes, 1 on error, and 2 when the plan contains changes. Schedule this as a periodic CI/CD job; daily is usually sufficient. Alert when drift is detected. This catches cases where someone has made manual changes to cloud infrastructure that Terraform doesn't know about.
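The scheduled drift check could be sketched as follows, again assuming GitHub Actions as the CI system and omitting cloud credential setup. The schedule time is arbitrary. The job relies on terraform plan -detailed-exitcode, which exits with 2 when the plan contains changes, and fails the run in that case so that your existing failure notifications act as the drift alert.

```yaml
# .github/workflows/terraform-drift.yml (illustrative; cloud credentials omitted)
name: terraform-drift
on:
  schedule:
    - cron: "0 6 * * *"       # daily at 06:00 UTC; the time is an arbitrary choice

jobs:
  drift:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infrastructure/environments/production
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false    # keep terraform's own exit code
      - run: terraform init -input=false
      - name: Detect drift
        run: |
          set +e
          terraform plan -input=false -no-color -detailed-exitcode
          code=$?
          set -e
          # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present
          if [ "$code" -eq 2 ]; then
            echo "::error::Drift detected in production infrastructure"
            exit 1
          fi
          exit "$code"
```

Separating exit code 2 from exit code 1 keeps "the infrastructure drifted" distinguishable from "the check itself broke" in the job log.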
The Rollback Story
One of the strongest arguments for GitOps is clean rollback: a bad deployment is reverted by reverting the Git commit, and ArgoCD reconciles to the previous state within minutes. This is true, and it works.
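The complication: database schema migrations. Application rollback is clean. Database schema rollback is painful. The operational guidance: design migrations to be backward-compatible wherever possible. Additive migrations (adding columns, adding tables) are safe to roll back from, because the previous application version simply ignores the new schema. Destructive migrations (dropping columns, renaming tables) require more careful sequencing: first ship the application version that no longer depends on the old schema, and remove the old column or table only in a later release, once rolling back past that version is no longer expected.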
Test the rollback story before you need it in production. Revert a commit in staging. Verify ArgoCD reconciles. Verify the application works. This should be a recurring exercise, not a theoretical claim.
Our DevOps and automation practice implements this pattern across multiple client environments. The specific tools are almost secondary: the discipline of treating infrastructure as code, the clean ownership boundary between Terraform and ArgoCD, and the commitment to Git as the source of truth are what make it work.