gcp · bigquery · cost-optimization · finops · data-engineering

BigQuery Cost Optimization: The Complete Guide for Growing Companies

Most teams scan 10-100x more data than needed in BigQuery. This guide covers partitioning, clustering, scheduled query audits, slot management, and storage optimization to cut your BigQuery bill by 50-90%.

Matias Coca · 11 min read

Here's a story that keeps repeating: a team runs BigQuery on-demand for months, paying almost nothing. Then data volumes grow, a few scheduled queries get added, someone writes a SELECT * against a multi-terabyte table—and suddenly BigQuery is the #2 line item on the GCP bill.

The embarrassing part? Most of these costs come from scanning data you don't actually need. We've seen teams reduce BigQuery spend by 50-90% by fixing a handful of query patterns and table configurations. No pricing model changes. No architecture overhaul. Just understanding how BigQuery actually charges you.

This guide covers everything from quick wins you can implement today to structural decisions about slots and storage.


How BigQuery Pricing Actually Works

Before optimizing anything, you need to understand what you're paying for. BigQuery bills along two dimensions, compute (queries) and storage, and both have details that trip people up:

Query costs (the big one)

On-demand pricing: $6.25 per TB scanned. Not per TB returned—per TB scanned. This is the critical distinction. A query that scans 500 GB but returns 10 rows still costs you $3.13.

BigQuery charges based on the columns and partitions your query touches, not the result size. This means:

| Query Pattern | 1 TB Table | Cost |
|---|---|---|
| SELECT * | Scans full 1 TB | $6.25 |
| SELECT col1, col2 (2 of 50 columns) | Scans ~40 GB | $0.25 |
| SELECT col1 WHERE date = '2026-03-01' (partitioned) | Scans ~1 GB | $0.006 |

Same table. Same data. 1,000x cost difference based on how you query it.

Flat-rate pricing (slots)

Instead of paying per TB scanned, you buy compute capacity (slots). BigQuery Editions offer three tiers:

| Edition | Baseline $/slot/hr | Best For |
|---|---|---|
| Standard | $0.04 | Predictable workloads, basic needs |
| Enterprise | $0.06 | Most production workloads |
| Enterprise Plus | $0.10 | Mission-critical, multi-region |

With Editions, you set a baseline slot count and can burst above it. You pay for the baseline continuously and for burst capacity on demand. As a rough example, a 100-slot Enterprise baseline runs about 100 × $0.06 × 730 hours ≈ $4,380/month.

Rule of thumb: If your on-demand BigQuery bill exceeds $5,000/month consistently, it's worth evaluating flat-rate.

Storage costs

| Storage Type | Price |
|---|---|
| Active storage (modified in last 90 days) | $0.02/GB/month |
| Long-term storage (untouched 90+ days) | $0.01/GB/month |
| Legacy streaming inserts (an ingestion cost, not storage) | $0.01 per 200 MB (~$0.05/GB) |

Storage is usually cheap relative to query costs, but it adds up with large datasets and streaming pipelines.

The free-tier trap

BigQuery gives you 1 TB of free queries and 10 GB of free storage per month. This is generous enough that teams build habits around it—no partitioning, liberal SELECT *, no cost monitoring. Then data grows past the free tier and those habits become expensive fast.


Quick Wins: Do These First

These changes require minimal effort and often deliver the biggest savings.

1. Partition your tables

Partitioning divides a table into segments based on a column (usually a date/timestamp). When you query with a partition filter, BigQuery only scans the relevant segments instead of the entire table.

CREATE TABLE project.dataset.events
PARTITION BY DATE(event_timestamp)
AS SELECT * FROM project.dataset.events_raw;

A 1 TB table partitioned by date, queried for one day's data, might scan 3 GB instead of 1 TB. That's a 99.7% cost reduction on that query.

Require partition filters to prevent accidental full-table scans:

ALTER TABLE project.dataset.events
SET OPTIONS (require_partition_filter = true);

Now anyone running SELECT * FROM events without a date filter gets an error instead of a surprise bill.

2. Cluster your tables

Clustering sorts data within partitions by specified columns. When you filter or aggregate on clustered columns, BigQuery skips irrelevant data blocks.

CREATE TABLE project.dataset.events
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
AS SELECT * FROM project.dataset.events_raw;

Column order matters. BigQuery clusters hierarchically—put the most commonly filtered column first. If most queries filter by user_id, put it before event_type.

Clustering typically reduces bytes scanned by 30-60% on top of partitioning savings.

3. Stop using SELECT *

This is the simplest change with the biggest impact. Every SELECT * scans every column in the table, even if you only need three fields.

Replace:

-- Scans entire table: ~500 GB = $3.13
SELECT * FROM project.dataset.events
WHERE date = '2026-03-01'

With:

-- Scans 3 columns: ~30 GB = $0.19
SELECT event_timestamp, user_id, event_type
FROM project.dataset.events
WHERE date = '2026-03-01'

For exploration, use the BigQuery Console's "Preview" tab instead of running queries. Preview is free—it reads directly from storage without executing a query.

4. Use query dry runs

Before running an expensive query, check how much data it will scan:

-- In the BigQuery Console: "More" → "Query Settings" → check "Dry run"
-- Or use the bq CLI (the query and column names here are illustrative):
-- bq query --use_legacy_sql=false --dry_run 'SELECT user_id FROM dataset.events WHERE event_date = "2026-03-01"'

The Console also shows estimated bytes scanned in the top-right corner before you run any query. Make it a habit to glance at this number.


Audit Your Scheduled Queries

Scheduled queries are the silent killer of BigQuery budgets. Someone sets up a query to refresh a dashboard, it runs every hour, and nobody checks whether anyone actually looks at that dashboard.

Find your most expensive queries

Use INFORMATION_SCHEMA.JOBS to identify the biggest cost drivers:

SELECT
  user_email,
  query,
  COUNT(*) AS execution_count,
  ROUND(SUM(total_bytes_processed) / POW(2, 40), 2) AS total_tb_scanned,
  ROUND(SUM(total_bytes_processed) / POW(2, 40) * 6.25, 2) AS estimated_cost_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
GROUP BY user_email, query
ORDER BY total_tb_scanned DESC
LIMIT 20;

This shows your top 20 most expensive queries over the last 30 days. Common findings:

  • Dashboard refresh queries scanning full tables every 15 minutes when daily would suffice
  • ETL jobs doing full-table rebuilds instead of incremental updates (see the sketch after this list)
  • Ad-hoc exploration queries that got accidentally scheduled
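
For the full-rebuild pattern, converting to an incremental MERGE is usually the fix. Here's a minimal sketch, assuming the partitioned events table from earlier and a hypothetical user_daily_stats rollup table; it reprocesses only yesterday's partition instead of the whole table:

-- Hypothetical incremental rollup: only yesterday's partition is scanned
MERGE project.dataset.user_daily_stats AS target
USING (
  SELECT
    user_id,
    DATE(event_timestamp) AS event_date,
    COUNT(*) AS events
  FROM project.dataset.events
  WHERE DATE(event_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  GROUP BY user_id, event_date
) AS source
ON target.user_id = source.user_id AND target.event_date = source.event_date
WHEN MATCHED THEN
  UPDATE SET events = source.events
WHEN NOT MATCHED THEN
  INSERT (user_id, event_date, events)
  VALUES (source.user_id, source.event_date, source.events);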

Do the math

A single scheduled query can be surprisingly expensive:

| Scan Size | Frequency | Monthly Cost |
|---|---|---|
| 10 GB | Every hour (24/day) | $45 |
| 50 GB | Every hour | $225 |
| 100 GB | Every 15 min (96/day) | $1,800 |
| 500 GB | Every hour | $2,250 |

That 100 GB query running every 15 minutes? It's costing $1,800/month. If nobody's looking at that dashboard more than once a day, switching to a daily refresh (about $19/month) saves roughly $1,780/month.

What to do

  1. List all scheduled queries: BigQuery Console → Scheduled Queries
  2. Cross-reference with dashboard usage: Is anyone actually looking at the output?
  3. Reduce frequency: Most dashboards don't need sub-hourly refresh
  4. Add partitioning: Make the scheduled query scan only new data, not the full table
  5. Replace with materialized views: For aggregations, let BigQuery handle incremental updates automatically

Slot Management: On-Demand vs Editions

If your on-demand BigQuery spend is consistently above $5,000/month, flat-rate pricing through BigQuery Editions might save money.

Evaluate your slot usage

First, understand your current slot consumption:

WITH per_second AS (
  SELECT
    period_start,
    SUM(period_slot_ms) / 1000 AS slots  -- each timeline period covers one second
  FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
  WHERE period_start > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  GROUP BY period_start
)
SELECT
  TIMESTAMP_TRUNC(period_start, HOUR) AS hour,
  MAX(slots) AS max_slots_used,
  AVG(slots) AS avg_slots_used
FROM per_second
GROUP BY hour
ORDER BY hour;

This shows your hourly slot usage pattern. Look for:

  • Steady baseline: Flat-rate makes sense—you'll pay less per slot than on-demand equivalent
  • Spiky usage: On-demand might be cheaper—you only pay when queries run
  • Mix of both: Use Editions with autoscaling—set a low baseline and burst for peaks

BigQuery Editions autoscaling

Editions let you set a baseline slot count and a maximum. You pay for baseline 24/7 and burst capacity only when used:

| Setting | What It Means |
|---|---|
| Baseline | Slots always available, always billed |
| Max reservation | Upper limit for burst capacity |
| Autoscaling | BigQuery adds slots between baseline and max as needed |

Start conservative: set the baseline at your P50 slot usage and the max at your P95. Monitor for a month, then adjust.
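
To put numbers on those percentiles, here's a minimal sketch, reusing the same `region-us` INFORMATION_SCHEMA views as above, that estimates P50 and P95 per-second slot usage over the last 30 days:

WITH per_second AS (
  SELECT
    period_start,
    SUM(period_slot_ms) / 1000 AS slots  -- each timeline period covers one second
  FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
  WHERE period_start > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  GROUP BY period_start
)
SELECT
  APPROX_QUANTILES(slots, 100)[OFFSET(50)] AS p50_slots,  -- baseline candidate
  APPROX_QUANTILES(slots, 100)[OFFSET(95)] AS p95_slots   -- max reservation candidate
FROM per_second;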

Commitment discounts

If your usage is stable, slot commitments reduce costs further:

| Commitment | Discount |
|---|---|
| None (pay-as-you-go slots) | 0% |
| 1-year | ~20-30% off list price |
| 3-year | ~40-50% off list price |

Only commit to your true baseline. Burst capacity should stay uncommitted.

Storage Optimization

Storage costs are usually a fraction of query costs, but they can sneak up on large datasets.

Automatic long-term pricing

BigQuery automatically moves tables untouched for 90 days to long-term storage at half price ($0.01/GB vs $0.02/GB). You don't need to do anything—this happens automatically.
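
To see where your storage dollars actually go, INFORMATION_SCHEMA.TABLE_STORAGE breaks each table into active and long-term bytes. A minimal sketch (the region qualifier is the same one used in the query audits above):

SELECT
  table_schema,
  table_name,
  ROUND(active_logical_bytes / POW(2, 30), 1) AS active_gb,       -- billed at $0.02/GB
  ROUND(long_term_logical_bytes / POW(2, 30), 1) AS long_term_gb  -- billed at $0.01/GB
FROM `region-us`.INFORMATION_SCHEMA.TABLE_STORAGE
ORDER BY active_logical_bytes DESC
LIMIT 20;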

But you should actively manage storage for:

Table and partition expiration

Set expiration on temporary or staging tables:

-- Expire table after 7 days
ALTER TABLE project.dataset.temp_results
SET OPTIONS (expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY));

-- Expire partitions older than 90 days
ALTER TABLE project.dataset.events
SET OPTIONS (partition_expiration_days = 90);

Without expiration, temporary tables sit in storage indefinitely. We've seen teams paying for years of "temporary" data.
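
You can also set a default at the dataset level so every new staging table inherits an expiration automatically (the dataset name here is a placeholder):

-- Every table created in this dataset afterward expires after 7 days by default
ALTER SCHEMA project.staging_dataset
SET OPTIONS (default_table_expiration_days = 7);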

Materialized views vs repeated queries

If you're running the same aggregation query repeatedly (for dashboards, reports), use a materialized view instead:

CREATE MATERIALIZED VIEW project.dataset.daily_stats
AS SELECT
  DATE(event_timestamp) AS event_date,
  COUNT(*) AS event_count,
  -- Materialized views don't support exact COUNT(DISTINCT); use the approximate version
  APPROX_COUNT_DISTINCT(user_id) AS unique_users
FROM project.dataset.events
GROUP BY event_date;

BigQuery maintains materialized views incrementally—only processing new data. This is dramatically cheaper than re-scanning the full table every time.

Streaming insert costs

If you're using the legacy streaming API (insertAll), you're paying $0.01 per 200 MB ingested (about $0.05/GB). For high-volume pipelines, consider:

  • Batch loading instead of streaming (batch loads are free; see the sketch after this list)
  • Storage Write API (significantly cheaper than the legacy streaming API)
  • Only use streaming when you genuinely need real-time data availability
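
If your data lands in Cloud Storage first, the SQL LOAD DATA statement gives you free batch ingestion. A minimal sketch, with the bucket and table names as placeholders:

-- Batch load from GCS: no ingestion charge, unlike streaming inserts
LOAD DATA INTO project.dataset.events
FROM FILES (
  format = 'PARQUET',
  uris = ['gs://my-bucket/events/*.parquet']
);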


Monitoring and Governance

Optimization without monitoring is a one-time fix. Build guardrails to prevent costs from creeping back.

Custom query cost controls

Set per-user or per-project limits to prevent runaway queries. BigQuery doesn't expose maximum_bytes_billed as a project-wide SQL option; project- and user-level guardrails are custom quotas. In the Cloud Console under IAM & Admin → Quotas, cap "Query usage per day" (whole project) and "Query usage per user per day". Queries that would push past the quota fail instead of billing.

Or per-query:

-- maximum_bytes_billed is a job setting, not SQL syntax. Set it in the
-- Console under Query Settings → Advanced options, or via the bq CLI:
-- bq query --use_legacy_sql=false --maximum_bytes_billed=10737418240 \
--   'SELECT user_id FROM project.dataset.large_table WHERE ...'  -- 10 GB limit

If the query would exceed the limit, it fails before scanning anything—no cost incurred.

Labels for cost attribution

Add labels to BigQuery jobs to track costs by team, project, or pipeline:

-- Tag queries in a script or session via the @@query_label system variable
SET @@query_label = "team:analytics,pipeline:daily_report";

SELECT event_type FROM project.dataset.events LIMIT 10;

-- Or attach labels per job with the bq CLI:
-- bq query --label team:analytics --label pipeline:daily_report 'SELECT ...'

Then query your billing export to see costs broken down by label. This is essential for understanding who is driving BigQuery costs and why.
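
Once labels flow into your billing export, a query like this sketch breaks BigQuery spend down by team. The export table name is a placeholder; yours carries your own billing account suffix:

SELECT
  label.value AS team,
  ROUND(SUM(cost), 2) AS total_cost_usd
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX` AS export,
  UNNEST(export.labels) AS label
WHERE export.service.description = 'BigQuery'
  AND label.key = 'team'
GROUP BY team
ORDER BY total_cost_usd DESC;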

Budget alerts

Set BigQuery-specific budget alerts in the GCP Console:

  1. Billing → Budgets & alerts
  2. Filter to BigQuery service only
  3. Set threshold at 80% and 100% of expected monthly spend
  4. Route alerts to your engineering Slack channel, not just finance


Putting It All Together

This week (30 minutes)

  • [ ] Run the INFORMATION_SCHEMA.JOBS query to find your top 20 most expensive queries
  • [ ] Check for SELECT * in any scheduled or production queries
  • [ ] Set maximum_bytes_billed on your project as a safety net
  • [ ] Review scheduled queries—kill or reduce frequency on any feeding unused dashboards

This month (2-3 hours)

  • [ ] Add partitioning and clustering to your top 5 most-queried tables
  • [ ] Enable require_partition_filter on large partitioned tables
  • [ ] Set table expiration on staging/temporary tables
  • [ ] Replace repeated aggregation queries with materialized views

This quarter

  • [ ] Evaluate Editions vs on-demand based on your slot usage analysis
  • [ ] Implement labels on all BigQuery jobs for cost attribution
  • [ ] Set up BigQuery-specific budget alerts
  • [ ] Review storage for datasets that can use partition expiration

Expected results

| Optimization | Typical Savings |
|---|---|
| Partitioning + clustering | 50-90% on query costs |
| Eliminating SELECT * | 60-80% per query |
| Scheduled query audit | $500-5,000+/month |
| Materialized views | 70-90% vs repeated queries |
| On-demand → Editions (if applicable) | 20-40% |
| Storage cleanup | 10-30% on storage costs |

Need help analyzing your BigQuery costs? GCP FinOps helps growing companies identify and eliminate cloud waste without enterprise complexity.



Written by Matias Coca

Building GCP cost optimization tools for growing companies. Questions or feedback? Let's connect.

Ready to optimize your GCP costs?

See exactly where your cloud spend goes with our cost optimization dashboard.