The 30-40% cloud overpayment figure sounds like marketing from a FinOps vendor. It isn’t. Every structured cloud cost audit I’ve run on an organization that hasn’t actively managed its cloud spend has surfaced savings in that range. The specific sources vary, but the aggregate consistently reaches that threshold.
The reason isn’t negligence. Cloud pricing models are genuinely complex, cost visibility tools are designed to show you what you spent rather than where you’re wasting, and cloud resources are significantly easier to create than to delete. Waste accumulates as a natural byproduct of normal development and operations activity.
Here’s where the waste consistently lives and how to find it.
The Idle Resources Problem
The single largest source of cloud waste in most environments: resources that are running but serving no purpose. The common categories:
Forgotten development and staging instances. Engineers spin up instances for a project, finish the work, and shut everything down, except the one they forgot. Or the staging environment for a feature that shipped 8 months ago, which nobody uses anymore but nobody deleted. A mid-sized EC2 instance left running indefinitely costs $700-1,200/year. Ten of those is roughly $10,000/year in waste.
The fix: Tag every resource with Environment, Owner, and Project. Schedule an audit that lists all resources missing these tags, and add lifecycle policies that automatically stop (not terminate; stopped instances preserve their data) resources in non-production environments during nights and weekends.
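A minimal sketch of that tag audit, assuming the hypothetical resource records below; in practice you would build them from the Resource Groups Tagging API or `describe-instances` output:

```python
# Flag resources missing any of the required tags.
REQUIRED_TAGS = {"Environment", "Owner", "Project"}

def untagged(resources):
    """Return resources missing at least one required tag."""
    return [r for r in resources
            if not REQUIRED_TAGS.issubset(r.get("Tags", {}))]

# Hypothetical records standing in for real API output.
resources = [
    {"Id": "i-0a1", "Tags": {"Environment": "prod", "Owner": "data-team", "Project": "etl"}},
    {"Id": "i-0b2", "Tags": {"Environment": "staging"}},  # missing Owner, Project
    {"Id": "i-0c3", "Tags": {}},                          # untagged entirely
]
print([r["Id"] for r in untagged(resources)])  # → ['i-0b2', 'i-0c3']
```

Anything this surfaces has no owner on record, which in practice means nobody will notice when it should be shut off.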
Unattached storage volumes. EBS volumes created alongside an EC2 instance but not deleted when the instance terminates. Root volumes default to delete-on-termination, but additional data volumes do not, so infrastructure provisioned without that flag set can leave volumes detaching on instance termination and persisting indefinitely.
The fix: A monthly check for unattached EBS volumes takes two minutes: `aws ec2 describe-volumes --filters Name=status,Values=available`. Add it to a monthly operational checklist. The AWS console also surfaces these in the EBS volume list filtered by “available” state.
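The same check in code, sketched over records shaped like `describe-volumes` output; the gp3 rate used here is a representative assumption, not a quote:

```python
GP3_RATE = 0.08  # USD per GB-month, assumed representative gp3 rate

def unattached_cost(volumes):
    """Return unattached volumes and their estimated monthly cost."""
    orphans = [v for v in volumes if v["State"] == "available"]
    monthly = sum(v["Size"] * GP3_RATE for v in orphans)
    return orphans, monthly

# Hypothetical records mirroring describe-volumes output.
volumes = [
    {"VolumeId": "vol-1", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-2", "State": "available", "Size": 500},
    {"VolumeId": "vol-3", "State": "available", "Size": 50},
]
orphans, monthly = unattached_cost(volumes)
print(len(orphans), round(monthly, 2))  # → 2 44.0
```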
Idle load balancers. Application Load Balancers have a base charge regardless of traffic. An ALB with no registered targets, or with near-zero request count, is waste. This pattern appears after application migrations where the application behind the load balancer was decommissioned but the load balancer itself was never deleted.
The fix: AWS Cost Explorer can show ALB costs by resource. Cross-reference against the CloudWatch RequestCount metric to identify ALBs with low or no usage.
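That cross-reference as a sketch, using hypothetical 30-day request counts standing in for the CloudWatch RequestCount metric; the ~$16/month base charge is approximate and excludes LCU usage:

```python
ALB_BASE_MONTHLY = 16.0  # approximate ALB base charge, excludes LCU usage

def idle_albs(albs, max_requests=1000):
    """Flag ALBs whose 30-day request count is below the threshold."""
    return [a for a in albs if a["requests_30d"] <= max_requests]

# Hypothetical counts standing in for CloudWatch metrics.
albs = [
    {"name": "prod-api", "requests_30d": 8_400_000},
    {"name": "legacy-app", "requests_30d": 12},  # migrated away, never deleted
]
for alb in idle_albs(albs):
    print(f"{alb['name']}: ~${ALB_BASE_MONTHLY:.2f}/month of pure waste")
```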
The Overprovisioning Problem
The second largest source: instances that are significantly larger than the workload requires. Provisioning decisions are made under uncertainty (what will the load be?), with appropriate conservatism (don’t undersize and cause a performance incident), and are then rarely revisited.
The consequence: many production instances run at 10-30% average CPU utilization on instances sized for peak loads that may occur a few times per month.
The specific pattern I find most often: RDS instances provisioned during initial setup at db.r5.2xlarge “to be safe,” running a medium-traffic application that consistently uses 15% CPU and 40% memory. Downsizing to db.r5.large saves $200-400/month with no performance impact.
AWS Compute Optimizer and the equivalent tools on other cloud providers analyze historical utilization and recommend right-sizing. Run these tools, review the recommendations, and test right-sizing in staging before applying to production. The recommendations are conservative — they maintain headroom — and are generally safe to apply.
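The conservative shape of that logic can be sketched as follows; the threshold is illustrative, not Compute Optimizer's actual criterion:

```python
def rightsizing_candidate(cpu_samples, peak_threshold=40.0):
    """True if even the worst observed CPU sample stays under the threshold,
    so a one-size-smaller instance would still have roughly 50% headroom.
    Keying on the peak, not the average, is what keeps this conservative."""
    return max(cpu_samples) < peak_threshold

# Hypothetical utilization histories (percent CPU).
steady = [12, 15, 18, 14, 22, 31]  # peaks at 31% -> safe to downsize
spiky = [10, 11, 13, 12, 88, 14]   # monthly batch job hits 88% -> keep as-is
print(rightsizing_candidate(steady))  # → True
print(rightsizing_candidate(spiky))   # → False
```

The spiky case is why averages alone mislead: an instance at 15% average CPU may still genuinely need its size for a few hours a month.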
The RDS-specific consideration: Downsizing an RDS instance requires a maintenance window and causes a brief interruption (typically 1-5 minutes). Plan it during a low-traffic period. Multi-AZ instances see a shorter interruption because AWS resizes the standby first and then fails over to it.
The Reserved Instance Coverage Gap
On-demand pricing is the most expensive way to pay for compute you know you’ll run continuously. Reserved Instances and Savings Plans provide discounts of roughly 30-40% for one-year commitments, and substantially more for three-year terms.
Most organizations are running a significant fraction of their compute on on-demand pricing for resources that have been running continuously for months or years. This is pure waste — the commitment they’re avoiding by using on-demand is a commitment they’ve already effectively made.
The coverage calculation: Pull Cost Explorer data filtered to EC2 instance costs, broken out by on-demand vs. reserved vs. spot. For any instance that’s been running continuously for the past 3 months, on-demand is the wrong pricing model.
The approach that works without over-committing: Compute Savings Plans (not Reserved Instances specifically) provide ~25-30% discount with more flexibility about which instance types and regions the discount applies to. Start with Savings Plans covering the base load that’s been consistent for 6+ months, then layer Reserved Instances for specific large instances where the commitment is clear.
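One way to size that base-load commitment, sketched with hypothetical spend figures: commit to the minimum of recent monthly on-demand spend, converted to the hourly rate Savings Plans are denominated in, so the commitment is always fully utilized:

```python
def sp_hourly_commitment(monthly_on_demand_spend, hours_per_month=730):
    """Convert the floor of recent monthly on-demand spend into an
    hourly Savings Plan commitment that will never sit unused."""
    base = min(monthly_on_demand_spend)  # spend you've never dropped below
    return round(base / hours_per_month, 2)

# Hypothetical last six months of on-demand EC2 spend, in USD.
last_six_months = [4200, 4350, 4100, 4600, 4250, 4400]
print(sp_hourly_commitment(last_six_months))  # → 5.62 ($/hour commitment)
```

Committing at the floor rather than the average is the "without over-committing" part: months above the floor simply pay on-demand for the excess, and the commitment can be topped up later once the higher baseline proves durable.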
The Data Transfer Tax
Cloud providers charge for outbound data transfer at rates that surprise organizations once they scale. AWS charges approximately $0.09/GB for data transferred to the internet. For applications serving significant data — video, large file downloads, data exports — this becomes a major cost center.
The fix: A CDN fronting anything that serves repeated content. CloudFront’s effective per-GB cost can fall to roughly $0.01-0.02/GB with volume tiers or committed-use pricing, 4-9x cheaper than origin transfer. For a site serving 10TB/month, that is the difference between about $900/month and $100-200/month.
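The 10TB/month arithmetic, taking $0.09/GB at the origin and an assumed effective $0.015/GB through the CDN (the mid-point of the quoted range):

```python
gb = 10_000                 # 10TB/month of outbound transfer
origin = gb * 0.09          # direct-to-internet from the origin
cdn = gb * 0.015            # assumed effective CDN rate (mid-range)
print(round(origin), round(cdn), round(origin - cdn))  # → 900 150 750
```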
The second data transfer problem: Cross-AZ traffic within a region. AWS charges $0.01/GB for data transferred between availability zones, billed in each direction, so a request/response exchange between AZs effectively costs $0.02/GB. For applications with services in different AZs that call each other frequently, this adds up. Examine cross-AZ traffic in the AWS Cost Explorer data transfer view and evaluate whether services that communicate heavily should be co-located in the same AZ (accepting the reduced resilience) or whether the traffic can be reduced.
The Snapshot Accumulation Problem
Automated snapshots accumulate without automatic deletion unless lifecycle policies are explicitly configured. RDS automated backups are managed with retention periods; manual snapshots are not.
A 500GB database that has been running for 2 years with weekly manual snapshots has 104 of them. Snapshots are incremental, so each is billed only for the unique data it retains; with steady write churn retaining on the order of 100GB per snapshot, at $0.095/GB/month that works out to roughly $10/month per snapshot, or about $1,000/month across the accumulated set, for backups that almost nobody would ever restore from once they are more than 30 days old.
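That arithmetic as a check; the ~104GB of unique data per snapshot is an assumption chosen to make the math concrete, not a measured figure:

```python
rate = 0.095                  # USD per GB-month of snapshot storage
unique_gb_per_snapshot = 104  # assumed write churn retained per snapshot
snapshots = 2 * 52            # two years of weekly manual snapshots

per_snapshot = unique_gb_per_snapshot * rate
total = per_snapshot * snapshots
print(round(per_snapshot, 2), round(total, 2))  # → 9.88 1027.52
```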
The fix: retention policies. AWS Backup can enforce a retention period on RDS snapshots, and Amazon Data Lifecycle Manager automates the same for EBS snapshots. Review both in any cost audit.
The Audit That Surfaces the Rest
The manual audit above catches the high-frequency waste categories. For a comprehensive picture:
- Export Cost Explorer data for the past 90 days by service
- Identify the 10 largest cost drivers
- For each: is current utilization justified? Are there optimization options?
- Run AWS Trusted Advisor (Business support tier) or equivalent for provider-generated recommendations
- Run Compute Optimizer for right-sizing recommendations
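The first two audit steps can be sketched as a few lines over the exported data; the rows here are hypothetical, and a real export would come from the Cost Explorer CSV download or the GetCostAndUsage API:

```python
from collections import defaultdict

def top_drivers(rows, n=10):
    """Sum cost rows by service and return the n largest drivers."""
    totals = defaultdict(float)
    for service, cost in rows:
        totals[service] += cost
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical (service, cost) rows from a 90-day export.
rows = [("EC2", 12000.0), ("RDS", 7400.0), ("EC2", 11800.0),
        ("S3", 2100.0), ("CloudFront", 900.0), ("DataTransfer", 3100.0)]
print(top_drivers(rows, n=3))  # EC2 leads at 23800.0
```

Each entry in the ranked output then gets the utilization and optimization questions from the checklist above.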
This audit, done thoroughly, will surface 20-40% savings in most environments that haven’t actively managed cloud costs. The first audit is the highest-yield; subsequent quarterly audits surface incremental waste from new resources.
Our cloud migration and cost optimization practice runs structured cloud cost audits as a standard engagement. The deliverable is a prioritized list of savings opportunities with implementation complexity and risk ratings — not just a list of what’s wasteful, but a plan for acting on it safely. Related: the FinOps practices that prevent ongoing waste accumulation are closely linked to DevOps and automation tooling — tagging policies enforced at deploy time, automated resource cleanup, and cost anomaly alerting.