Knowledge Base
Cloud Optimization Framework
Deep dives into cloud inefficiencies: how they arise, how they're billed, how to detect them, and how to fix them.
Excessive Auto-Clustering Costs from High-Churn Tables
Tables with frequent large-scale modifications cause Snowflake to constantly recluster data, resulting in substantial compute consumption for maintenance tasks.
Excessive Snapshot Storage from High-Churn Snowflake Tables
Snowflake automatically retains previous data versions through Time Travel and Fail-safe; on high-churn tables these historical snapshots accumulate quickly and inflate storage costs.
Inactive and Detached Managed Disk
Managed Disks frequently remain detached after Azure VMs are deleted or reconfigured, generating unnecessary costs despite not supporting active workloads.
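As a rough illustration of the detection step, the sketch below filters a disk inventory for disks that have no owning VM and have been detached longer than a threshold. The record shape (`name`, `managed_by`, `last_detached`) and the 30-day default are assumptions made for this example, not an Azure API; a real inventory would be exported from Azure tooling first.

```python
from datetime import datetime, timedelta, timezone


def find_detached_disks(disks, now, min_idle_days=30):
    """Return names of disks with no owning VM that were detached
    at least min_idle_days before `now`.

    Each disk is a dict with hypothetical keys:
      name          - disk name
      managed_by    - ID of the attached VM, or None when detached
      last_detached - timezone-aware datetime of the last detach
    """
    cutoff = now - timedelta(days=min_idle_days)
    return [
        d["name"]
        for d in disks
        if d.get("managed_by") is None and d["last_detached"] <= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    inventory = [
        {"name": "disk-a", "managed_by": None,
         "last_detached": datetime(2024, 1, 15, tzinfo=timezone.utc)},
        {"name": "disk-b", "managed_by": "vm-1",
         "last_detached": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    ]
    print(find_detached_disks(inventory, now))
```

The idle threshold is a judgment call: a longer window lowers the risk of flagging disks that were detached on purpose for a short maintenance task.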
Inefficient Execution of Repeated Queries
Queries that run repeatedly without optimization compound their inefficiency on every execution, consuming excessive warehouse compute in Snowflake.
Inefficient Pipeline Refresh Scheduling
Data refresh operations executed more frequently than downstream business usage requires waste Snowflake credits when schedules don't align with actual data consumption patterns.
Inefficient Snowpipe Usage Due to Small File Ingestion
Ingesting numerous small files through Snowpipe is cost-inefficient because each file incurs the same overhead fee regardless of size, and high file counts also strain metadata infrastructure.
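To make the per-file effect concrete, the sketch below estimates the overhead charge for loading the same data volume at different file sizes. The default rate of 0.06 credits per 1,000 files is an assumption and should be checked against current Snowflake pricing; the function name and parameters are hypothetical, and compute charges for the load itself are not modeled.

```python
import math


def snowpipe_file_overhead_credits(total_mb, file_size_mb,
                                   credits_per_1000_files=0.06):
    """Estimate per-file overhead credits for loading total_mb of data
    split into files of file_size_mb each.

    credits_per_1000_files is an assumed rate, not a confirmed price.
    """
    file_count = math.ceil(total_mb / file_size_mb)
    return file_count * credits_per_1000_files / 1000


if __name__ == "__main__":
    # Same 100 GB of data, three different file sizes.
    for size_mb in (1, 10, 100):
        credits = snowpipe_file_overhead_credits(100_000, size_mb)
        print(f"{size_mb:>4} MB files: {credits:.2f} overhead credits")
```

Because the overhead depends only on file count, batching the same data into files 100 times larger cuts this component by roughly a factor of 100.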
Inefficient Use of On-Demand Capacity in DynamoDB
On-Demand mode is often cost-inefficient for DynamoDB tables with consistent throughput — shifting to Provisioned mode with Auto Scaling can yield substantial cost reductions.
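A simple way to check whether a steady workload favors Provisioned mode is to compare monthly cost under both models. The sketch below does this for writes only; all prices are caller-supplied placeholders rather than real AWS rates, and the 70% target utilization and 730-hour month are assumptions. It also assumes items small enough that one write consumes one capacity unit.

```python
import math


def on_demand_monthly(writes_per_sec, price_per_million_writes, hours=730):
    """Monthly cost when every write request is billed individually."""
    requests = writes_per_sec * 3600 * hours
    return requests / 1_000_000 * price_per_million_writes


def provisioned_monthly(writes_per_sec, price_per_wcu_hour,
                        target_utilization=0.7, hours=730):
    """Monthly cost when capacity is provisioned with headroom so that
    steady traffic sits at target_utilization of provisioned capacity."""
    wcu = math.ceil(writes_per_sec / target_utilization)
    return wcu * hours * price_per_wcu_hour


if __name__ == "__main__":
    # Placeholder prices for illustration only; substitute current rates.
    od = on_demand_monthly(200, price_per_million_writes=1.0)
    pv = provisioned_monthly(200, price_per_wcu_hour=0.0005)
    print(f"on-demand: {od:.2f}  provisioned: {pv:.2f}")
```

The comparison only holds for consistent throughput; spiky or mostly idle tables can come out cheaper on On-Demand because provisioned capacity is billed whether or not it is used.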
Inefficient Workload Distribution Across Warehouses
Separate Snowflake warehouses per team often result in redundant, underutilized resources — consolidating compatible workloads can significantly lower total credit consumption.
Infrequently Accessed Objects Stored in S3 Standard Tier
Keeping large volumes of infrequently accessed data in S3 Standard results in unnecessary expenses — backups, logs, and archives are strong candidates for colder storage tiers.
Missing or Inefficient Use of Materialized Views
Poorly implemented materialized views waste compute on refreshes that few queries benefit from, while missing ones forgo savings on costly repeated queries.
Retention of Unused Data in Snowflake Table
Maintaining stale records in active Snowflake tables without proper lifecycle management inflates both storage and query execution costs as compute scales with data scanned.
Suboptimal Query Routing
Inefficient query-to-warehouse routing, inadequate dynamic scaling, and failure to consolidate workloads during low-usage periods lead to unnecessary Snowflake expenses.
Suboptimal Query Timeout Configuration
Without appropriate query timeout configuration, inefficient or runaway queries can execute for extended periods, keeping warehouses active and accruing unnecessary costs.
Suboptimal Use of Search Optimization Service
Snowflake's Search Optimization Service can yield significant savings when applied selectively to lookup-heavy workloads, but it becomes inefficient when left unused where it would help or enabled on tables that do not benefit.
Suboptimal Warehouse Auto-Suspend Configuration
Overly generous auto-suspend thresholds keep Snowflake warehouses active while idle, accruing unnecessary charges that can be reduced by tightening suspension windows.
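The effect of the suspension window can be estimated with simple arithmetic. The sketch below assumes standard warehouse credit rates (1 credit per hour for X-Small, doubling with each size) and assumes every idle gap is at least as long as the threshold, so the warehouse bills for the full window before suspending. The function name and parameters are hypothetical.

```python
# Assumed credits per hour for standard warehouses, doubling per size.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}


def idle_credits(size, auto_suspend_seconds, idle_periods_per_day, days=30):
    """Credits spent waiting for auto-suspend to trigger.

    Assumes each idle period lasts at least auto_suspend_seconds, so the
    full threshold is billed once per idle period.
    """
    billed_seconds = auto_suspend_seconds * idle_periods_per_day * days
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


if __name__ == "__main__":
    for threshold in (600, 60):
        credits = idle_credits("MEDIUM", threshold, idle_periods_per_day=20)
        print(f"auto-suspend {threshold:>3}s: {credits:.0f} idle credits per 30 days")
```

Under these assumptions, idle spend scales linearly with the threshold, so a tenfold tighter window removes about 90% of it. Very short windows can cost more elsewhere, since each resume is billed for a minimum period and drops the warehouse cache.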
Underutilized GCP VM Instance
GCP VM instances provisioned with excess CPU or memory relative to actual needs represent cost reduction opportunities through rightsizing to smaller machine types.
Underutilized Snowflake Warehouse
Workloads running on oversized warehouses consume excess credits without proportional performance gains; downsizing to an appropriately sized warehouse reduces costs.