Introduction
While organizations focus on optimizing their Azure compute costs, storage expenses often fly under the radar. Yet, storage can represent a significant portion of your cloud bill, particularly when using inappropriate disk types for your workloads. Azure offers various disk options with different performance characteristics and pricing models, but the complexity makes optimization challenging for most engineering teams.
In this paper, we'll explore how smart disk selection drives big savings without performance trade-offs, and why automation is essential for finding these opportunities at scale. We'll do this through the lens of CEPM, PointFive's always-on discipline for measuring and improving cloud efficiency.
Understanding the Role of Azure Managed Disks
Azure Managed Disks serve as the persistent block storage for virtual machines in Azure, equivalent to AWS's Elastic Block Store (EBS) in the Amazon ecosystem. Unlike ephemeral storage (which disappears when VMs are deallocated) or blob storage (which is object-based and accessed over HTTP), managed disks provide traditional block-level storage that mounts directly to virtual machines with standard file system interfaces.
This positioning in the storage hierarchy is crucial to understand:
- Memory/RAM: Fastest access with microsecond latency, but volatile and expensive; limited by VM size.
- Managed Disks: Persistent block storage with millisecond latency; direct VM attachment.
- Blob Storage: Object storage with HTTP-based access; higher latency but lower cost for large datasets.
For most applications, the choice of disk tier significantly impacts both performance and cost. While RAM is limited by VM size selections and blob storage serves different use cases entirely, disk storage is where most organizations have substantial optimization opportunities without application architecture changes.
Azure provides several disk types, each with its own performance tier and billing model, listed here from lowest to highest performance:
- Standard HDD: Lowest cost option suitable for many production workloads with sequential I/O patterns or modest performance requirements.
- Standard SSD: Balanced option with consumption-based transaction pricing that works well for most production workloads without strict latency SLAs.
- Premium SSD: Traditional fixed-capacity pricing model that Azure defaults to when creating VMs. Ties performance to disk size.
- Premium SSD v2: Alternative to Premium SSD with independently configurable capacity, IOPS, and throughput. Includes 3,000 IOPS and 125 MB/s baseline at no extra cost.
- Ultra Disk: Highest tier with sub-millisecond latency for mission-critical applications with strict performance SLAs.
What makes optimization difficult is that these tiers don't just differ in performance; they follow entirely different billing models. Premium SSD is billed at a fixed rate based on provisioned size, with optional paid on-demand bursting, while Premium SSD v2 uses a provisioned performance model with separate charges for capacity, IOPS, and throughput beyond the baseline. Standard SSD and Standard HDD tiers have consumption-based pricing for actual I/O operations (transactions). While they have a lower base price, high-transaction workloads can cause Standard tiers to become more expensive than Premium SSD v2 in some cases.
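To make the billing-model difference concrete, here is a minimal Python sketch comparing a consumption-billed Standard SSD with a provisioned Premium SSD v2 disk. All unit prices are illustrative round numbers chosen for demonstration, not official Azure rates; check the Azure pricing page for your region before acting on figures like these.

```python
# Illustrative comparison of consumption-based vs. provisioned disk billing.
# All unit prices are ASSUMED for demonstration, not official Azure rates.

def premium_ssd_v2_cost(gib, iops, mbps,
                        price_per_gib=0.081,    # assumed $/GiB-month
                        price_per_iops=0.004,   # assumed, beyond 3,000 free IOPS
                        price_per_mbps=0.065):  # assumed, beyond 125 MB/s free
    """Provisioned model: pay for capacity plus performance above the free baseline."""
    extra_iops = max(0, iops - 3_000)
    extra_mbps = max(0, mbps - 125)
    return gib * price_per_gib + extra_iops * price_per_iops + extra_mbps * price_per_mbps

def standard_ssd_cost(gib, transactions,
                      price_per_gib=0.06,       # assumed $/GiB-month
                      price_per_10k_tx=0.002):  # assumed $ per 10k transactions
    """Consumption model: lower base price, but every I/O transaction is billed."""
    return gib * price_per_gib + transactions / 10_000 * price_per_10k_tx

# A quiet 1 TiB disk: Standard SSD wins comfortably.
print(f"Standard SSD, light I/O:  ${standard_ssd_cost(1024, 10_000_000):,.2f}/mo")
# The same disk under heavy I/O (~5B transactions/month): the ranking flips.
print(f"Standard SSD, heavy I/O:  ${standard_ssd_cost(1024, 5_000_000_000):,.2f}/mo")
print(f"Premium SSD v2 baseline:  ${premium_ssd_v2_cost(1024, 3_000, 125):,.2f}/mo")
```

The crossover point depends entirely on transaction volume, which is why monitoring actual I/O counts matters before choosing a Standard tier.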
This multi-dimensional pricing complexity creates significant challenges for most engineering teams. While familiarity with the key concepts outlined in this post will enable better decision-making, the intricate pricing models and workload-specific considerations often exceed what engineers can manage alongside their primary responsibilities.
That's why PointFive's Cloud Efficiency Posture Management (CEPM) platform is crucial — continuously scanning your Azure estate, analyzing real disk-usage patterns, and marrying them to Azure's complex pricing so you get clear, resource-level recommendations. PointFive precisely calculates potential savings for each disk in your environment and suggests specific actions to optimize costs while maintaining performance, turning abstract knowledge into actionable insights without additional overhead for your team.
Common Misconceptions and Optimization Opportunities
The "Better Disk Equals Better Performance" Fallacy
One of the most prevalent misconceptions in cloud infrastructure is the belief that upgrading to higher-tier storage automatically solves performance issues. This flawed assumption leads organizations to routinely overprovision disk resources, wasting substantial budget without addressing actual bottlenecks.
The reality is far more nuanced:
- Many workloads don't have strict latency SLAs. For applications without specific latency requirements — such as batch processing, background jobs, content repositories, log storage, backups, and many web applications — Standard HDD typically provides more than adequate performance. Azure's marketing positions Standard HDD for "non-critical workloads," but this understates its capabilities. Many production workloads are perfectly suited to these lower-cost tiers.
- The true bottleneck is rarely disk performance. When applications experience slowdowns, disk I/O is often presumed to be the issue. However, performance bottlenecks can originate from multiple sources including network latency, database query efficiency, memory/CPU constraints, or application code design. Proper diagnostics should identify the actual bottleneck before investing in more expensive storage.
- Infrastructure-level latency vs. operational latency. It's essential to distinguish infrastructure-level latency from the operational latency users actually experience, particularly when evaluating tiers like Standard HDD. Infrastructure metrics such as IOPS, throughput, and disk read/write latency may look limited on paper for lower-tier disks, yet often have minimal impact on real application responsiveness. Operational latency depends far more on application logic, database access patterns, caching, network latency, and service orchestration than on raw disk performance. This is especially true in modern stateless architectures, where Standard HDD is an excellent and often underrated fit: containerized microservices and autoscaling web servers load the application into memory at startup and process requests almost entirely in-memory, relying on external storage systems like object stores and databases rather than local disk. As a result, the latency characteristics of the local disk have negligible influence on operational performance.
Many organizations unnecessarily provision higher-tier disks for these scenarios, missing out on the lowest-cost option available. At just pennies per GB, Standard HDD can reduce your storage costs by up to 80% compared to Premium SSD for appropriate workloads.
It's worth noting that modern cloud architecture best practices emphasize stateless design patterns where possible. If you're building stateless microservices and containerized workloads with immutable infrastructure, you may not need expensive disks for many of your compute instances. The design principles that make applications cloud-native often align well with more cost-effective storage choices.
Throwing expensive Ultra Disk or Premium SSD at performance problems is like buying a sports car to navigate a congested city — you're paying a premium for capabilities you'll never utilize. Before upgrading disk tiers, it's essential to profile your application's actual I/O patterns and identify true bottlenecks through proper monitoring and diagnostics.
Premium SSD as the Default Choice: A Costly Mistake
Azure's decision to default to Premium SSD when creating new VMs isn't just a minor inconvenience; it's a systematic billing trap costing organizations significant amounts in wasted cloud spend. From a cost-optimization perspective, this default is hard to justify.
Consider these facts:
- Premium SSD costs 1.5–2× as much as Standard SSD for the same capacity.
- Premium SSD forces you to pay for performance you may never use.
As mentioned, for many workloads, including web servers, API services, batch processing jobs, development environments, test systems, and numerous other application types, Standard tier disks would provide adequate performance at a fraction of the cost. Yet Azure defaults to the premium tier, potentially leading to higher costs for customers.
Even worse, Azure now offers Premium SSD v2, which provides better performance at lower cost than Premium SSD, yet VM creation still defaults to the more expensive, less flexible Premium SSD option.
Premium SSD v2: The Cost-Saving Alternative with Strategic Considerations
Premium SSD v2 offers higher performance than Premium SSD while also generally costing less. You can individually tune the performance (capacity, throughput, and IOPS) of Premium SSD v2 disks at any time, allowing workloads to stay cost-efficient while meeting shifting performance needs.
This is the key point for most organizations: Premium SSD v2 provides a baseline performance of 3,000 IOPS and 125 MB/s for any disk size that is offered at no additional cost. This baseline is included in the base price, and you only pay extra for additional provisioned performance beyond this baseline.
Unlike Premium SSD where performance tiers are fixed based on disk size, Premium SSD v2 allows you to independently configure your capacity, IOPS, and throughput. You provision exactly what you need and pay accordingly, rather than being forced into predefined tiers.
This flexibility also provides a practical solution to balancing performance requirements with cost optimization. When there's uncertainty about performance needs, Premium SSD v2 allows you to start with a reasonable baseline and adjust as necessary based on actual usage patterns. You can implement proper monitoring and make data-driven decisions about right-sizing, rather than overprovisioning "just to be safe."
A significant advantage of Premium SSD v2 over Standard tiers is its predictable pricing. Since you're not charged for actual transactions but rather for provisioned performance, your costs remain stable and predictable even during usage spikes. This makes Premium SSD v2 often less expensive than Standard SSD for transaction-heavy workloads, despite its higher baseline pricing.
The real-world implications of this pricing model are profound (prices referring to the US East region):
- For standard database workloads (1TB storage), Premium SSD v2 costs only $83/month for the same capacity that would cost $135/month with Premium SSD (P30) — that's 38% savings automatically with no performance compromise.
- For high-performance workloads, even when configuring Premium SSD v2 with the same 5,000 IOPS and 200 MB/s as a P30, you'll still pay only $96/month — a 29% reduction in cost for identical performance.
- For small databases with high I/O requirements, the savings are even more dramatic. A 100GB database needing 5,000 IOPS would require a P30 disk (1TB) costing $135/month with Premium SSD. With Premium SSD v2, you pay only for the 100GB you need plus the additional performance, resulting in a total cost of just $21/month — a stunning 84% savings.
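The arithmetic behind these three scenarios follows directly from the decoupled pricing model. The unit rates in this sketch are approximations reverse-engineered from the monthly totals above (US East), used purely for illustration rather than as official prices, and the 100GB case assumes the same 200 MB/s throughput as the P30 comparison:

```python
# Premium SSD v2 pricing model: capacity, IOPS, and throughput are decoupled.
# Unit rates below are ILLUSTRATIVE approximations backed out from the text's
# US East examples, not official Azure prices.
CAPACITY = 0.081   # assumed $/GiB-month
IOPS     = 0.004   # assumed $ per provisioned IOPS-month beyond the free 3,000
MBPS     = 0.065   # assumed $ per provisioned MB/s-month beyond the free 125

def pssd_v2_monthly(gib, iops=3_000, mbps=125):
    """Monthly cost: capacity plus any performance provisioned above baseline."""
    return (gib * CAPACITY
            + max(0, iops - 3_000) * IOPS
            + max(0, mbps - 125) * MBPS)

print(round(pssd_v2_monthly(1024)))              # ~$83 vs. $135 for a P30
print(round(pssd_v2_monthly(1024, 5_000, 200)))  # ~$96: P30-equivalent performance
print(round(pssd_v2_monthly(100, 5_000, 200)))   # ~$21: small disk, high IOPS
```

Because capacity and performance are billed separately, the small-disk/high-IOPS case no longer forces you to buy a full terabyte just to reach 5,000 IOPS.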
With Premium SSD v2, you can customize disk performance to precisely meet your workload requirements or seasonal demands, without the need to provision additional storage capacity. This decoupling of performance from capacity is revolutionary for cloud storage economics.
While Premium SSD v2 offers significant cost advantages, it comes with limitations to consider. Some of the main limitations include the inability to use it as an OS disk, incompatibility with Azure Compute Gallery, and support for only locally redundant storage (LRS) with no Zone-Redundant Storage (ZRS) option available. Additionally, Premium SSD v2 is currently available only in selected Azure regions.
For strategic implementation, use the smallest possible Premium SSD for OS disks, leverage Premium SSD v2 volumes for data disks, and implement ZRS at the snapshot level when necessary. The OS disk limitation for Premium SSD v2 can actually promote better architectural practices. It encourages a separation between OS and data disks, which aligns with cloud best practices. Using smaller standard OS disks with configuration automation tools can create more manageable deployments while leveraging Premium SSD v2 for data disks where performance matters most.
These workarounds allow you to capture the cost benefits while addressing the limitations. The cost differential is so significant that even with a small Premium SSD for the OS, the combined solution still represents substantial savings over using Premium SSD for everything.
Ultra Disk: The Aircraft Carrier for a Rubber Ducky
In the world of Azure storage solutions, Ultra Disk stands out as the performance heavyweight. It's a technological marvel capable of incredible speed and responsiveness. But like using a nuclear weapon to target pigeons, deploying Ultra Disk for workloads that don't truly need its capabilities represents massive overkill at a premium price. Many enterprises are paying exponentially more for marginal performance gains that deliver no measurable business value. Understanding when this premium storage option is justified, and when it's excessive, can lead to significant cost optimization without sacrificing actual application performance.
The Pricing Paradox Explained
The dramatic price difference between Ultra Disk and Premium SSD v2 is often misunderstood. A 1TB Ultra Disk configured for 20,000 IOPS and 1,000 MB/s throughput costs approximately $1,465/month — a staggering 7× more expensive than a similarly sized Premium SSD v2 with identical IOPS and throughput settings.
This price premium is not for IOPS or throughput at all. You're paying primarily for one thing: consistently lower latency.
IOPS (I/O operations per second) and throughput (data transfer rate) are capacity metrics that say nothing about the speed of individual operations. Two disks with identical IOPS and throughput can have vastly different latency profiles governing how quickly each individual I/O operation completes. Ultra Disk delivers sub-millisecond latency that Premium SSD v2 cannot match, even when both are configured with identical IOPS and throughput values.
Latency Guarantees: What You're Really Paying For
Both Premium SSD v2 and Ultra Disks are designed to deliver sub-millisecond latencies. The key difference is in their availability targets: Premium SSD v2 aims to maintain its performance specifications 99.9% of the time, while Ultra Disks are engineered to deliver their promised performance 99.99% of the time.
To put this in perspective:
- Premium SSD v2 (99.9% availability): Potential latency elevation for up to 1 minute and 26 seconds per day.
- Ultra Disk (99.99% availability): Potential latency elevation for only 8.6 seconds per day.
That's the real difference you're paying for: 7× the price primarily buys you roughly 1 minute and 18 seconds fewer potential latency spikes per day. For many workloads, these brief performance variations go completely unnoticed at the application level or occur during non-critical periods.
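The arithmetic behind those figures is straightforward; this sketch simply converts each availability target into the seconds per day a disk is allowed to fall outside its performance spec:

```python
# Convert an availability target into allowed degraded seconds per day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def degraded_seconds_per_day(availability):
    """Seconds per day a disk may fall outside its performance spec."""
    return SECONDS_PER_DAY * (1 - availability)

pssd_v2 = degraded_seconds_per_day(0.999)    # 86.4 s  (~1 min 26 s)
ultra   = degraded_seconds_per_day(0.9999)   # 8.64 s
print(f"Difference: {pssd_v2 - ultra:.1f} s/day")  # ~77.8 s (~1 min 18 s)
```

Framed this way, the premium buys a narrower daily window of potential latency variation, not more IOPS or throughput.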
Important Clarification: No Concrete Latency SLA
It's crucial to understand that while Ultra Disk is marketed for its superior latency performance, Microsoft does not provide a concrete latency SLA for this service. This absence of guaranteed latency metrics makes benchmarking not just important, but essential. Without contractual latency guarantees, the only way to verify Ultra Disk's actual performance in your specific environment is through comprehensive, workload-specific testing. This reality further emphasizes why blindly selecting Ultra Disk without validation can lead to unnecessary expenditure without confirmed performance benefits.
Value Assessment
While Ultra Disk offers theoretical maximums of up to 400,000 IOPS and 10,000 MB/s throughput (for the largest sizes), Premium SSD v2 can deliver up to 80,000 IOPS and 1,200 MB/s of throughput, which is sufficient for the vast majority of real-world applications. But even at comparable configurations, Ultra Disk will consistently deliver lower-latency responses.
The critical question is whether this latency advantage justifies the substantial cost difference. For most workloads, the answer is no. For example, if we compare Ultra Disk's consistent sub-millisecond latency with Premium SSD v2's slightly higher but still impressive latency profile, this difference would rarely translate to measurable business value or improved user experience in most common scenarios. Such performance differences often become insignificant at the application level where other factors typically introduce much larger latencies.
Data-Driven Decision Making
The only way to conclusively determine whether Ultra Disk's lower latency is worth the premium is to benchmark it against your specific application's requirements. Many workloads currently deployed on Ultra Disk could operate with indistinguishable performance on Premium SSD v2 at a fraction of the cost.
Success requires strong alignment between FinOps and engineering teams, ideally through a consolidated platform like PointFive that presents both detailed performance diagnostics and cost analysis in one place, enabling decisions that balance technical requirements with business value.
Real-World Optimization Strategies
Effective disk optimization requires a multi-faceted approach:
- Right-sizing based on actual usage. Many disks are over-provisioned, resulting in wasted capacity and performance. Analyzing actual disk utilization can identify opportunities for optimization.
- Appropriate tier selection. Match disk types to actual performance requirements, not perceived needs. Many applications can operate well within Premium SSD v2's baseline performance of 3,000 IOPS and 125 MB/s.
- Strategic architecture decisions. Design systems to separate OS and data disks to leverage Premium SSD v2. A 32GB Premium SSD OS disk paired with an appropriate Premium SSD v2 data disk can save substantial amounts compared to a single large Premium SSD.
- Regular review and adaptation. Storage needs evolve with your applications. Set up quarterly storage reviews to ensure optimal configuration.
- Monitor transaction rates on Standard tiers. For Standard SSD and HDD, monitor transaction rates closely as high transaction volumes can cause costs to exceed Premium SSD v2. In these cases, migrating to Premium SSD v2 can provide both better performance and lower costs.
Utilizing Azure's monitoring capabilities to examine actual I/O patterns over time is essential for optimization. Rather than guessing at performance needs, analyze metrics data to identify peak IOPS, average throughput, and I/O patterns, then generate right-sizing recommendations. When performance decisions are based on actual data rather than assumptions or vendor defaults, significant cost savings can be achieved.
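As a sketch of what this data-driven approach can look like, the heuristic below maps observed IOPS samples to a candidate tier. The samples, thresholds, and tier labels are hypothetical simplifications; real right-sizing should draw on Azure Monitor metrics over a representative time window and account for throughput and latency requirements as well.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of metric samples."""
    ordered = sorted(samples)
    idx = max(0, round(p / 100 * len(ordered)) - 1)
    return ordered[idx]

def recommend_tier(iops_samples, latency_sensitive):
    """Simplified, illustrative mapping from observed I/O demand to a tier."""
    p95 = percentile(iops_samples, 95)  # size for sustained demand, not one spike
    if not latency_sensitive and p95 <= 500:
        return "Standard HDD"
    if p95 <= 3_000:
        return "Premium SSD v2 (within the free 3,000 IOPS baseline)"
    return f"Premium SSD v2 provisioned for ~{p95} IOPS"

# A mostly idle web-server disk with one brief spike:
samples = [60] * 19 + [3_000]
print(recommend_tier(samples, latency_sensitive=False))  # Standard HDD
```

Sizing to a percentile rather than the absolute peak is the key design choice here: a single transient spike should not force a workload onto a tier it needs for only seconds per month.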
The Need for Automated Disk Optimization
The complexity of Azure disk pricing makes manual optimization impractical. Consider the variables involved:
- Different billing models across disk types (fixed vs. configurable vs. consumption-based).
- Performance characteristics that vary by workload.
- Changing application requirements.
- Regional availability differences.
- Various limitations and constraints per disk type.
This complexity necessitates an automated approach that can:
- Scan your entire environment without agents.
- Analyze actual usage patterns.
- Understand Azure's complex billing model.
- Provide accurate, actionable recommendations.
Storage is consistently one of the most challenging resources to optimize in Azure environments, due to the complex interaction between pricing models, performance tiers, and application requirements. Automation tools can help bridge the gap between cloud platform capabilities and practical implementation, making optimization more accessible and sustainable.
PointFive's Disk Optimization Capabilities
PointFive's Cloud Efficiency Posture Management (CEPM) platform delivers precise, actionable disk optimization insights generated through agentless, read-only scans across your Azure environment. These insights incorporate real disk usage patterns, VM configurations, pricing intricacies, and architectural best practices — empowering teams to make high-impact changes with confidence and control.
Below are the core disk-specific recommendation types delivered by PointFive:
1. Suboptimal Disk Type Detection
PointFive evaluates each disk in the context of:
- Actual IOPS and throughput usage.
- Disk size and usage patterns.
- Attached VM configuration.
- Azure's tier-specific pricing models and regional billing nuances.
It then runs a full pricing simulation across eligible alternatives, ensuring that only options offering net cost savings for the specific workload are surfaced. This includes both modernization (e.g., Premium SSD to Premium SSD v2) and tier downgrades (e.g., SSD to a larger HDD) that preserve performance requirements.
Importantly, PointFive often presents multiple valid disk configuration options for each case, with different latency and cost trade-offs. While infrastructure metrics are analyzed in depth, PointFive assumes the workload owner holds the business and application-layer context. That's why it provides the technical insights and simulated outcomes needed to support informed, context-aware decisions.
For example, consider a production VM running a stateless API service on a Premium SSD (P30). Based on observed I/O metrics, PointFive may recommend:
- Premium SSD v2 with identical IOPS/throughput at ~40% lower cost.
- Standard SSD for further savings, if occasional latency variation is acceptable.
- A larger Standard HDD, if the workload is throughput-heavy but latency-tolerant.
Each option includes a simulated pricing outcome and is designed to preserve or improve efficiency. For stateless services, disk I/O latency often does not translate directly into operational or user-facing latency, making lower-cost tiers a strong fit. In contrast, stateful workloads like databases or queues may require more predictable performance. This recommendation model ensures teams can optimize based on actual workload behavior and architectural requirements.
2. Ultra Disks Used in Non-Production Environments
PointFive uses proprietary environment classification technology that leverages metadata, resource context, and usage signals to accurately determine whether a resource belongs to a production, staging, development, or QA environment. This enables the platform to automatically detect Ultra Disks deployed in non-production contexts — a common source of silent overspend.
When these disks are identified, PointFive simulates alternative configurations such as Premium SSD v2 or Standard SSD, which can cost as little as one-seventh as much while still delivering sufficient performance for non-critical workloads.
3. Inefficient Use of Premium SSDs for OS Volumes
Premium SSD v2 is not supported for OS disks, yet many organizations deploy oversized Premium SSDs that combine OS and data onto a single volume. PointFive identifies these cases and recommends separating the concerns into:
- A small Premium SSD (e.g., 32–64 GB) for the operating system.
- A Premium SSD v2 for performance-critical data workloads.
This split preserves performance while significantly reducing unnecessary premium storage spend. It also aligns with best practices in modern cloud architecture, enabling independent scaling and more maintainable infrastructure.
Conclusion: Take Control of Your Azure Storage Costs
Azure disk optimization represents one of the most overlooked cost-saving opportunities in cloud computing. By choosing appropriate disk types, right-sizing your storage, and implementing strategic architectures, organizations can reduce storage costs significantly without impacting performance.
The math speaks for itself: a typical organization with 500TB of Premium SSD storage could save over $300,000 annually by optimizing their disk selection. Even for smaller environments, the percentage savings remain consistent, making this optimization valuable for companies of all sizes.
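That headline figure checks out against the earlier US East per-TB examples, treating ~$135/TB-month for P30-class Premium SSD and ~$83/TB-month for equivalent Premium SSD v2 capacity as rough monthly rates:

```python
# Back-of-the-envelope annual savings for 500 TB migrated from Premium SSD
# (P30-class, ~$135/TB-month) to Premium SSD v2 (~$83/TB-month).
# Rates are rough US East figures from the examples earlier in this paper.
tb = 500
p30_per_tb, v2_per_tb = 135, 83
annual_savings = tb * (p30_per_tb - v2_per_tb) * 12
print(f"${annual_savings:,}/year")  # $312,000/year
```

Because the savings scale linearly with capacity, a 50TB environment sees the same ~38% reduction, just at one-tenth the absolute dollar amount.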
However, achieving these savings requires visibility and insight that most teams lack. This is where PointFive's agentless cost optimization solution comes in. By scanning your environment and applying our proprietary detection system, we provide precise recommendations tailored to your specific workloads and requirements.
The knowledge gap between what Azure offers and what many engineering teams understand about storage optimization can lead to unnecessary costs. While the platform's default settings may contribute to this issue, understanding the fundamentals of your application's I/O patterns and aligning your storage choices accordingly can lead to significant cost savings.
Don't let Azure's complex disk pricing models drain your cloud budget.