Most AWS bills are higher than they need to be, not because AWS is trying to squeeze you, but because the wrong pricing model is running the wrong workload.
AWS gives you three primary ways to pay for compute: On-Demand, Reserved Instances, and Spot Instances. Each one exists for a reason. Each one can save you money when used correctly and waste money when it is not. This breakdown goes through the real numbers, the real trade-offs, and the practical decision-making that sits behind smart cloud spending.
On-Demand is the simplest pricing model. You spin up an EC2 instance, AWS charges you by the second (or by the hour for some OS types), and when you shut it down, the billing stops. There is no commitment, no contract, and no upfront cost. It sounds perfect, and for certain workloads, it genuinely is. For everything else, it is the most expensive option on the menu.
On-Demand makes sense when you cannot predict your workload, when a project is short-lived, or when your team is in the middle of experimenting with a new architecture. Development and testing environments that get created and destroyed frequently belong here. So do proof-of-concept projects and anything experiencing sudden, unpredictable traffic spikes where scaling instantly matters more than saving money.
The flip side is that running production workloads or steady databases on pure On-Demand is like renting a hotel room every night of the year instead of signing a lease. The flexibility is real, but you pay a premium for it every single hour.
On-Demand also acts as your safety net in a blended strategy. When Reserved capacity runs out or Spot instances get reclaimed, On-Demand absorbs the overflow. It is not the enemy. It just should not be the whole answer.
Reserved Instances (RIs) are not actually a different type of server. Under the hood, the instance is identical to an On-Demand one. What you are purchasing is a billing commitment. You agree to use a specific instance type for one or three years, and in exchange AWS gives you a significantly lower rate. That is the entire trade-off: predictability for savings.
There are two flavours to know. Standard RIs lock you into a specific instance family, operating system, and region for the full term. They offer the deepest discounts, up to 72 percent off On-Demand pricing, and you can list unused ones on the Reserved Instance Marketplace if your needs change. Convertible RIs give you a bit of breathing room. You can exchange them mid-term for a different instance family, operating system, or tenancy, though the discount sits closer to 54 to 66 percent depending on the payment option.
Within either RI type, you choose how to pay. All Upfront means paying the entire term cost at purchase, which gives you the maximum discount of up to 72 percent and removes any month-to-month charges. It suits teams whose cash position allows it and whose workload is genuinely stable. Partial Upfront requires roughly half the cost at purchase, with the rest spread across monthly charges, and lands savings around 65 percent. It is a sensible middle ground for teams that want meaningful discounts without tying up all their budget. No Upfront asks for nothing at purchase and instead charges a discounted monthly rate for the duration, with savings around 36 percent. It is the most financially flexible option, though it offers the smallest discount and still locks you into the term.
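To make the payment options concrete, here is a minimal sketch that turns the discounts quoted above into an effective hourly rate. The $0.192 On-Demand rate is only an illustrative input, and real discounts vary by instance type, region, and term.

```python
# Discounts as quoted in the text above; treat them as illustrative,
# since actual RI pricing depends on instance type, region, and term.
PAYMENT_OPTION_DISCOUNT = {
    "all_upfront": 0.72,
    "partial_upfront": 0.65,
    "no_upfront": 0.36,
}


def effective_hourly_rate(on_demand_rate: float, payment_option: str) -> float:
    """Return the hourly rate after applying the quoted discount."""
    discount = PAYMENT_OPTION_DISCOUNT[payment_option]
    return on_demand_rate * (1.0 - discount)


if __name__ == "__main__":
    on_demand = 0.192  # illustrative On-Demand rate in USD per hour
    for option in PAYMENT_OPTION_DISCOUNT:
        rate = effective_hourly_rate(on_demand, option)
        print(f"{option}: ${rate:.4f}/hr")
```

The gap between the options is the price of keeping cash in hand, which is why the choice is as much a finance decision as an engineering one.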
One thing to keep in mind: a 3-year commitment always saves more than a 1-year one. But committing to three years on an instance type that your architecture may not even use in year two is a real risk. Over-committing is the single most common RI mistake, and it costs companies real money.
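The over-commitment risk can be put into numbers with a simple break-even calculation. The sketch below assumes the commitment is paid in full whether or not the instance runs, and compares it only against paying On-Demand for the hours actually used. The 64 percent discount in the example is an assumed 3-year figure, not a quoted AWS price.

```python
def break_even_fraction(discount: float) -> float:
    """Fraction of the term the instance must run to match On-Demand cost.

    A commitment costs (1 - discount) of the full-term On-Demand price
    regardless of usage, while On-Demand only bills the hours used.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    return 1.0 - discount


def break_even_months(discount: float, term_months: int) -> float:
    """Months of actual use needed before the commitment pays for itself."""
    return break_even_fraction(discount) * term_months


if __name__ == "__main__":
    # Assumed example: a 3-year commitment at a 64 percent discount.
    print(f"{break_even_months(0.64, 36):.1f} months")
```

Under those assumptions, a workload retired before roughly month 13 would have been cheaper on On-Demand, which is the arithmetic behind the warning above.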
Spot Instances are AWS's way of selling spare compute capacity. When demand is lower than supply in a given Availability Zone, that idle capacity is offered at a steep discount, typically 60 to 90 percent below On-Demand rates. A workload that costs $0.192 per hour On-Demand can run at around $0.020 to $0.038 per hour on Spot. The catch is that AWS can reclaim that capacity at any time with just a two-minute warning.
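As a quick check on those figures, the short sketch below computes the percentage saved for the two Spot rates quoted above. The function name is just a label for this example.

```python
def savings_percent(on_demand_rate: float, spot_rate: float) -> float:
    """Percentage saved by paying spot_rate instead of on_demand_rate."""
    return (1.0 - spot_rate / on_demand_rate) * 100.0


if __name__ == "__main__":
    for spot in (0.038, 0.020):
        print(f"${spot:.3f}/hr saves {savings_percent(0.192, spot):.1f}%")
```

Both quoted rates land at roughly 80 to 90 percent off, so the example sits at the generous end of the typical 60 to 90 percent range.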
That two-minute notice is not a technicality. AWS sends the interruption notice through Amazon EventBridge (formerly CloudWatch Events) and the instance metadata service, and your application has 120 seconds to save state, flush logs, drain connections, or hand off work to another instance. If your architecture cannot handle that, Spot is the wrong tool.
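One way to react to the notice from inside the instance is to poll the instance metadata service. The sketch below assumes IMDSv2 and the `spot/instance-action` metadata path, which returns a 404 until an interruption is scheduled. The function names are illustrative, and a real worker would call `fetch_instance_action` every few seconds and begin draining as soon as it returns a document.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw interruption notice, or None if nothing is scheduled."""
    # IMDSv2 requires a short-lived session token obtained with a PUT request.
    token_request = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    action_request = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(action_request, timeout=timeout) as response:
            return response.read().decode()
    except urllib.error.HTTPError as error:
        if error.code == 404:  # no interruption notice has been issued
            return None
        raise


def seconds_until_action(body: str, now: datetime) -> float:
    """Parse a notice and return the seconds left before the action time."""
    notice = json.loads(body)
    action_time = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (action_time - now).total_seconds()
```

Keeping the parsing separate from the network call makes the shutdown logic easy to test off-instance with a canned notice.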
Batch processing and ETL jobs are the natural fit. A job that can be interrupted, checkpointed, and resumed is essentially risk-free on Spot, and the savings compound quickly at scale. Machine learning training jobs, large-scale simulations, media rendering, CI/CD pipeline workers, and data analytics pipelines all belong in this category. Containerised applications running on Kubernetes or ECS handle Spot interruptions particularly well because the orchestrator simply reschedules the affected pods to another available node.
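The "interrupted, checkpointed, and resumed" pattern can be sketched in a few lines. This is a minimal illustration rather than a production design: the `interrupt_after` parameter exists only to simulate a reclaim, and a real job would write its checkpoint to durable storage outside the instance (for example S3 or EFS) instead of a local file.

```python
import json
import os
import tempfile


def run_job(items, checkpoint_path, process, interrupt_after=None):
    """Process items in order and record progress after each one.

    Returns True when every item has been processed, or False when the
    simulated interruption fires first. A later call with the same
    checkpoint_path resumes from the first unprocessed item.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            start = json.load(handle)["next_index"]

    processed_this_run = 0
    for index in range(start, len(items)):
        if interrupt_after is not None and processed_this_run >= interrupt_after:
            return False  # simulated Spot reclaim
        process(items[index])
        # Write to a temporary file and rename it, so a crash mid-write
        # never leaves a half-written checkpoint behind.
        temp_path = checkpoint_path + ".tmp"
        with open(temp_path, "w") as handle:
            json.dump({"next_index": index + 1}, handle)
        os.replace(temp_path, checkpoint_path)
        processed_this_run += 1
    return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    seen = []
    run_job(list(range(5)), path, seen.append, interrupt_after=2)
    run_job(list(range(5)), path, seen.append)
    print(seen)
```

The second call picks up at the third item, so no work is repeated after the simulated interruption.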
What does not belong on Spot is your production database, your payment processing service, or any stateful application where a two-minute interruption translates directly into customer impact or data loss.
Think of Spot like a standby flight seat. Deeply discounted, but the airline can bump you if a full-fare passenger needs your seat. If your job can survive a restart, Spot is almost always the smartest compute spend on AWS.
The historical interruption rate across most instance types is typically less than 5 percent, meaning 95 percent of Spot requests run to completion. But that number depends heavily on how you request capacity. Diversifying across multiple instance families, such as requesting m6i, m5, and m7i simultaneously, gives AWS more capacity pools to draw from and significantly reduces your interruption probability. Spot Fleet and Auto Scaling groups with a mixed instances policy do this automatically. Enabling checkpointing on long-running jobs ensures that even an interrupted job resumes from where it left off rather than starting over.
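Diversification of this kind can be expressed as a mixed instances policy on an Auto Scaling group. The sketch below only builds the policy document; the launch template name and the `.large` sizes are hypothetical, and the allocation strategy shown is one reasonable choice rather than the only one.

```python
def mixed_instances_policy(
    launch_template_name,
    instance_types,
    on_demand_base=0,
    on_demand_percent_above_base=0,
):
    """Build a MixedInstancesPolicy document for an Auto Scaling group.

    With both On-Demand settings at 0, all capacity is requested as Spot
    across the listed instance types.
    """
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # One override per instance type widens the Spot capacity pools.
            "Overrides": [{"InstanceType": name} for name in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


if __name__ == "__main__":
    policy = mixed_instances_policy(
        "batch-workers",  # hypothetical launch template name
        ["m6i.large", "m5.large", "m7i.large"],
    )
    print(policy["LaunchTemplate"]["Overrides"])
```

With boto3, a document like this would be passed as the `MixedInstancesPolicy` argument of the Auto Scaling client's `create_auto_scaling_group` call, alongside the group name, size limits, and subnets.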
| Pricing Model | Monthly Cost | Annual Cost | Savings vs On-Demand |
|---|---|---|---|
| On-Demand | $1,402 | $16,824 | Baseline |
| Reserved 1yr All Upfront | $771 | $9,252 | ~45% saved |
| Reserved 3yr All Upfront | $502 | $6,024 | ~64% saved |
| Spot (~$0.038/hr avg) | $278 | $3,336 | ~80% saved |
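The savings column can be reproduced from the monthly costs with a few lines of arithmetic. The table does not state the underlying workload, so the fleet size here is an assumption: the On-Demand figure happens to be consistent with about ten instances at $0.192 per hour over a 730-hour month.

```python
HOURS_PER_MONTH = 730  # common approximation of the hours in an average month


def monthly_cost(hourly_rate, instance_count, hours=HOURS_PER_MONTH):
    """Monthly cost of a fleet billed at a flat hourly rate."""
    return hourly_rate * instance_count * hours


def savings_vs_baseline(cost, baseline):
    """Savings against the baseline, rounded to a whole percentage."""
    return round((1 - cost / baseline) * 100)


# Monthly figures copied from the table above.
MONTHLY = {
    "On-Demand": 1402,
    "Reserved 1yr All Upfront": 771,
    "Reserved 3yr All Upfront": 502,
    "Spot": 278,
}

if __name__ == "__main__":
    # Assumed fleet: ten instances at $0.192 per hour.
    print(f"assumed On-Demand fleet: ${monthly_cost(0.192, 10):,.2f}")
    for model, cost in MONTHLY.items():
        saved = savings_vs_baseline(cost, MONTHLY["On-Demand"])
        print(f"{model}: ${cost * 12:,} per year, ~{saved}% saved")
```

Swapping in your own hourly rate and fleet size gives a first estimate, though actual Reserved and Spot prices should come from current AWS pricing.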
Side by side, the three models compare as follows:

| Attribute | On-Demand | Reserved | Spot |
|---|---|---|---|
| Cost vs On-Demand | Full price | Up to 72% off | Up to 90% off |
| Commitment | None | 1 or 3 years | None |
| Availability | Typically available, capacity not reserved | Capacity reserved (zonal RIs only) | Not guaranteed |
| Interruption risk | None | None | Possible, with a 2-minute warning |
| Best workload fit | Dev/test, spiky, new projects | Production, databases, steady services | Batch jobs, ML training, CI/CD pipelines |
| Billing model | Per second used | Fixed whether used or not | Per second at market rate |
| Flexibility | Maximum | Low to medium | High, with caveats |
The honest answer is that you should not be choosing just one. Every mature cloud infrastructure uses all three. The question is what percentage of your workload belongs in each bucket, and that comes down to a handful of straightforward questions about your workloads.
Teams that have figured out cloud cost optimisation do not sit in one pricing tier. They architect a three-layer coverage model and assign workload classes to their optimal pricing tier. The savings compound across all three layers simultaneously.
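A rough way to see how the layers compound is to compute a blended cost. In the sketch below, the 60/20/20 split and the per-layer discounts are hypothetical inputs chosen for illustration, not recommendations.

```python
def blended_monthly_cost(on_demand_baseline, layers):
    """Monthly cost when the workload is split across pricing layers.

    layers maps a layer name to (share_of_workload, discount_vs_on_demand).
    The shares must add up to 1.0.
    """
    total_share = sum(share for share, _ in layers.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("layer shares must sum to 1.0")
    return sum(
        on_demand_baseline * share * (1.0 - discount)
        for share, discount in layers.values()
    )


# Hypothetical split: 60% on commitments, 20% On-Demand, 20% on Spot.
LAYERS = {
    "commitments": (0.60, 0.45),
    "on_demand": (0.20, 0.00),
    "spot": (0.20, 0.80),
}

if __name__ == "__main__":
    baseline = 1402  # all-On-Demand monthly cost in USD
    blended = blended_monthly_cost(baseline, LAYERS)
    print(f"blended: ${blended:,.2f} vs On-Demand: ${baseline:,}")
```

Adjusting the shares shows quickly where the next percentage point of savings would come from and how much commitment risk it adds.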
One important nuance: AWS now recommends Savings Plans over Reserved Instances for most compute commitments. Compute Savings Plans offer up to 66 percent savings with significantly more flexibility. They cover EC2, Fargate, and Lambda usage, and they apply automatically when you change instance type, shift between regions, or move from EC2 to Fargate. EC2 Instance Savings Plans save up to 72 percent but require commitment to a specific instance family in a specific region. For most teams, the flexibility of Compute Savings Plans is worth the slightly smaller discount.
The most common mistake is over-committing. AWS Cost Explorer and Trusted Advisor generate RI recommendations, but those recommendations are point-in-time snapshots. They do not account for infrastructure that is changing mid-migration, plans to downsize, or seasonal workloads that shift your baseline. A general rule of thumb: start with 70 percent coverage via commitments, run the rest On-Demand for two to three months, then gradually increase coverage as you understand your actual baseline. Never commit 100 percent on day one.
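The 70 percent rule of thumb can be sketched as a sizing calculation. This example makes one loud assumption: it treats the lowest observed hourly instance count as the steady baseline. Other definitions, such as a low percentile over a longer window, are equally defensible.

```python
def initial_commitment(hourly_instance_counts, coverage_percent=70):
    """Suggest how many instances to cover with commitments at the start.

    Assumption: the steady baseline is the minimum observed hourly count.
    Integer arithmetic rounds the commitment down to whole instances.
    """
    if not hourly_instance_counts:
        raise ValueError("need at least one usage sample")
    baseline = min(hourly_instance_counts)
    committed = baseline * coverage_percent // 100
    return {
        "baseline": baseline,
        "committed": committed,
        "peak_on_demand": max(hourly_instance_counts) - committed,
    }


if __name__ == "__main__":
    usage = [10, 12, 11, 10, 14, 10]  # hypothetical hourly instance counts
    print(initial_commitment(usage))
```

Re-running the calculation after two or three months of real usage data gives a defensible basis for raising `coverage_percent`.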
There is no single correct answer to which pricing model is right. The right question is: what is this specific workload doing, and how predictably does it need to do it? Steady, always-on infrastructure belongs on commitments. New, unpredictable, or burst workloads belong on On-Demand. Fault-tolerant, restartable jobs belong on Spot.
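Those three rules are simple enough to write down as a first-pass classifier. The `Workload` fields and the precedence (fault tolerance is checked before steadiness) are assumptions made for this sketch. A real review would also weigh compliance, licensing, and capacity needs.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    steady: bool         # always-on with a predictable baseline
    interruptible: bool  # fault-tolerant and safe to restart


def recommend_pricing(workload: Workload) -> str:
    """Map a workload to a pricing model using the rules of thumb above."""
    if workload.interruptible:
        return "Spot"
    if workload.steady:
        return "Commitment (RI or Savings Plan)"
    return "On-Demand"


if __name__ == "__main__":
    examples = [
        Workload("orders-db", steady=True, interruptible=False),
        Workload("nightly-etl", steady=False, interruptible=True),
        Workload("new-prototype", steady=False, interruptible=False),
    ]
    for workload in examples:
        print(f"{workload.name}: {recommend_pricing(workload)}")
```

Running a check like this over an inventory of services is a cheap way to spot workloads that have outgrown their original pricing model.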
What kills cloud budgets is not using the wrong model once. It is leaving the wrong model in place indefinitely. An On-Demand database that has been running for 18 months of steady traffic is not a flexibility choice anymore. It is money left on the table every single month.
The teams that consistently manage cloud costs well are not doing anything magical. They understand their workload patterns, they commit where the patterns are stable, they build resilience where they want the deep discounts, and they review regularly. The three pricing models are tools. Like any tool, the skill is in knowing which one to reach for.