Data Engineering & AnalyticsSnowflakeData WarehouseFinOpsQuery OptimisationCloud Data

The 1,000 SQL Query: Why Your Snowflake Bill Is Spiralling

Ankush
Ankush
Chief Technology Officer, GYSP.tech
15 April 20259 min read
The 1,000 SQL Query: Why Your Snowflake Bill Is Spiralling

The Snowflake migration was a success. Query performance improved dramatically. The data team loves the separation of storage and compute. The business intelligence team can now run workloads in parallel without fighting for resources. Then the first full month's invoice arrives and someone from Finance appears at the data team's Slack channel asking what happened to the data budget.

Snowflake's credit-based pricing model is well-designed and genuinely transparent — but it exposes query patterns and configuration decisions that were harmless or invisible in an on-premise environment. A query that ran for three minutes on a dedicated cluster becomes a three-minute charge on a Snowflake warehouse. Run that query a thousand times per day and you have a billing line item that didn't exist before.

The Three Root Causes of Snowflake Cost Spirals

1. Full Table Scans on Clustered Data

Snowflake uses micro-partitioning — automatically dividing tables into compressed, immutable storage partitions — and clustering keys to determine which partitions a query needs to scan. When a query filters on a column that isn't part of the clustering key, Snowflake must scan all partitions to find matching rows. On large tables, this can be hundreds of times more expensive than a well-clustered scan. Teams that migrate queries from on-premise environments without reviewing clustering keys often discover that their most expensive queries are performing full-table scans on billion-row tables.

2. Warehouse Sizing Carried Over from On-Premise

On-premise, warehouse size was a fixed capital expense — you bought what you could afford and used it constantly. In Snowflake, warehouse size determines compute per second, and a larger warehouse runs the same query faster but costs more per second. The optimal sizing depends on the query pattern: for interactive queries where latency matters, a larger warehouse may be justified. For batch ETL jobs running overnight, a smaller warehouse that takes twice as long is often half the cost.

3. Excessive Auto-Suspension Delays

Snowflake warehouses can be configured to auto-suspend after a period of inactivity. The default in many configurations is ten minutes. If your data team runs a query, reads the results for eight minutes, then runs another query, the warehouse may never suspend — and you're billed for the full idle period between queries. Setting aggressive auto-suspension (sixty seconds for interactive workloads, even shorter for batch) and auto-resumption can eliminate a significant fraction of credits consumed by idle compute.

The most common Snowflake cost issue we diagnose: a warehouse set to XL or 2XL for a migration 'just to make sure it's fast enough' that was never resized after the migration. Running production workloads on a 2XL warehouse that a Medium would handle adequately costs four times more per credit.

The Workload Patterns That Destroy Snowflake Economics

Is your data stack slowing down your AI?

48-hour turnaround. No obligation.

Request Data Assessment
  • Exploratory queries on large tables without partition pruning: Data analysts running ad-hoc queries that scan entire fact tables because the filter column isn't in the clustering key
  • High-cardinality GROUP BY without pre-aggregation: Queries that aggregate billions of rows to produce a small summary that could have been pre-materialised in a downstream table
  • Repeated identical queries without result caching: Applications that run the same query dozens of times per minute instead of caching results at the application layer
  • Multi-statement transactions on large tables: Transactional patterns from OLTP systems that are antipatterns in a columnar warehouse
  • Unconstrained third-party BI tool queries: Self-service BI tools generating inefficient SQL against large tables without query governance

The Optimisation Levers That Matter Most

The highest-ROI Snowflake optimisation levers, in order of typical impact: First, review and optimise clustering keys on your largest, most-queried tables — correct clustering can reduce scan costs by 80% or more on tables with clear time or category-based access patterns. Second, right-size warehouses by workload type: create separate, correctly-sized warehouses for interactive, ETL, and data science workloads rather than sharing a single large warehouse across all patterns. Third, set aggressive auto-suspension and use scheduled scaling to pre-warm warehouses before known peak periods rather than leaving them running idle.

Fourth, implement result cache discipline — understand which of your queries are hitting Snowflake's result cache (free) versus computing results (charged) and architect your application and BI tool query patterns to maximise cache hits. Fifth, consider materialised views and pre-aggregated tables for commonly-run expensive queries that always produce the same result on the same data.

The Governance Dimension

Technical optimisation alone is insufficient if the organisation doesn't have visibility into which teams and workloads are consuming credits. Snowflake's Resource Monitors allow you to set credit quotas by warehouse and trigger alerts or suspensions when thresholds are reached. Without this governance layer, a single runaway query or a newly-onboarded team with poor query habits can wipe out a month's optimisation savings in a day.

GYSP's Data Engineering & Analytics practice has optimised Snowflake environments for mid-market and enterprise clients across financial services, e-commerce, and healthcare. The typical outcome of a structured optimisation engagement: 40–60% reduction in credit consumption within ninety days, with sustainable governance controls that prevent regression.

Snowflake is a phenomenal platform for analytics — but it doesn't optimise itself. Teams that migrated from on-premise warehouses without rethinking their query patterns are paying for their old infrastructure's limitations at cloud pricing.

Ankush, Chief Technology Officer — GYSP.tech
ShareLinkedInTwitter / X

Get new Data Engineering & Analytics insights in your inbox

Practical, no-fluff articles for engineers and technology leaders. New pieces delivered as they're published.

No spam. Unsubscribe any time.

Get in TouchFree Technical Brief