
One in Five Dollars You Spend on S3 Has Nothing to Do With Storage

Ariel Lichterman
Cloud Researcher·May 13, 2026·4 min read

Most FinOps teams track S3 as a storage cost. They look at how much data they hold, compare it to last month, and move on. That framing misses a significant share of what S3 actually charges for.

Across the dozens of AWS customers we analyzed, representing more than $10M in monthly S3 spend, 20% of the bill went to API requests rather than stored bytes. All of it came from reads, lists, and queries issued by systems that were never designed with S3's cost model in mind.

The data sitting in those buckets is the same. The storage rate is identical. What differs is how each system touches it.

Same data. Very different bills. A well-designed system runs at 1.1x the raw storage rate, while a coordination-heavy system runs at 12.45x.

S3 charges for more than what you store

S3 pricing has two components: storage, and requests. Every time a system reads, lists, or queries data, there is a per-operation charge on top of what you pay per GB. Request costs scale with how your systems were built to interact with S3, not with how much data you have. Most cost dashboards only show you one of those two levers clearly.
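To make the two levers concrete, here is a back-of-the-envelope sketch. The rates below are illustrative S3 Standard list prices (us-east-1; check the current pricing page), and the two request profiles are hypothetical, not drawn from the dataset in this post. The point is the ratio, not the exact figures: the same terabyte produces very different multipliers over the raw storage rate depending on how it is touched.

```python
# Illustrative S3 Standard list prices (us-east-1, subject to change);
# the exact figures matter less than the ratio they produce.
STORAGE_PER_GB = 0.023   # $/GB-month
PUT_PER_1K = 0.005       # $ per 1,000 PUT/LIST requests
GET_PER_1K = 0.0004      # $ per 1,000 GET requests

def monthly_cost(gb_stored, puts, gets):
    """Return (storage dollars, request dollars) for one month."""
    storage = gb_stored * STORAGE_PER_GB
    requests = puts / 1000 * PUT_PER_1K + gets / 1000 * GET_PER_1K
    return storage, requests

# Same 1 TB of data, two hypothetical access patterns.
# Batched writer: a few thousand large objects, modest reads.
s1, r1 = monthly_cost(1024, puts=10_000, gets=1_000_000)
# Per-event writer: one PUT per log line, list-heavy readers.
s2, r2 = monthly_cost(1024, puts=40_000_000, gets=50_000_000)

for label, s, r in [("batched", s1, r1), ("per-event", s2, r2)]:
    print(f"{label}: storage=${s:.2f} requests=${r:.2f} "
          f"multiplier={(s + r) / s:.2f}x")
```

Under these assumptions the batched system lands near 1x the raw storage rate while the per-event system lands above 10x, on identical data.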

The request line is there in your cost data, but it rarely comes with context. A bucket running at 12 times the raw storage rate looks like a number, not a problem, until you know which system is driving it and why.

What this looks like in practice: Spark and Cortex

Apache Spark is a well-known example. As deployments scaled, the way Spark committed job output to S3 became a meaningful cost driver: commit protocols built around rename, a cheap metadata operation on a real filesystem, translate on S3 into copy, delete, and list requests for every output file. It took years of engineering investment in purpose-built committers to address.

Cortex, the CNCF time-series database built to scale Prometheus, took the opposite path. Designed for S3 from the start, it holds data in memory, batches writes, and maintains an internal manifest so downstream components never need to query S3 to discover state. The result is a system that runs IO-intensive workloads on S3 without generating unnecessary request overhead.

The gap between the two is not a performance difference. It is a cost difference, and it is entirely architectural.

Ariel Lichterman, Cloud Researcher at PointFive, covered both cases at KubeCon. Watch the session here.

The pattern is concentrated, but common

In the 3,277 buckets over $100 per month that we analyzed, 14% were running at 5x or more above the raw storage rate. Those buckets account for 27% of total API cost and $645k per month in identifiable overhead. The worst offender: per-event log ingestion, running at 12 times the raw storage rate on average.

Most environments contain at least one system in this category, and it is rarely the one anyone thinks of as a cost problem.

How PointFive handles it

PointFive's CEPM framework pinpoints which buckets are running at disproportionate request cost and quantifies the dollar impact at the workload level. That breakdown is not available in AWS Cost Explorer by default.

Getting it fixed is the other half. The person who finds the problem is rarely the person who can change how a system writes to S3. PointFive connects the two: the dollar story for FinOps and the technical context for engineering, in a single workflow.

If you want to know what this looks like in your environment, talk to us.

For the technical detail, Ariel's KubeCon session is worth your time.

About PointFive

PointFive is a Cloud and AI Efficiency Engine. By combining a real-time cloud and infrastructure data fabric with AI-driven detection and guided remediation, PointFive transforms efficiency from a reporting exercise into an operational discipline. Customers achieve sustained improvements in cost, performance, reliability, and engineering accountability, at scale.

To learn more, book a demo.
