Optimizing Snowflake (5) — Queuing and spilling, your worst enemies

Johan
4 min readSep 11, 2024

--

Photo by Jeremy Yap on Unsplash

When executing queries, two things can happen that you want to avoid at all time. Why: because they will cause totally unpredictable performance:

  • Queuing
  • Spilling

Queuing

This will happen when a Snowflake warehouse is out of resources for all queries to be executed. First, how can we identify queuing ?

  • look at the warehouse utilization chart in Snowsight. Go to Admin-Warehouses:

Choose a warehouse, click on it, and you will see this chart. If everything is well, you will only see the blue bars. If there are blocked queries (very rarely see this within Snowflake) or queued queries, very often see this after initial creation of warehouses, you will see the non-blue colors when more and more people start using the warehouse.

When you do not see the chart, your role lacks monitor privileges on the warehouse:

Make sure this command is executed by someone with SYSADMIN privileges. Even better, make this GRANT part of your WAREHOUSE provisioning scripts:

USE ROLE SYSADMIN;
GRANT MONITOR ON WAREHOUSE www TO ROLE rrr;

Now that we have identified queuing on the warehouse level, let’s implement a fix: a multi-cluster warehouse which will automatically scale (spin up instances) when queuing comes around the corner.

Some background about queuing in Snowflake: When you create a warehouse in Snowflake, some default settings are applied:

CREATE WAREHOUSE WH_REPORTING;
SHOW WAREHOUSES LIKE '%WH_REPORTING%';

Let’s now alter this warehouse, let’s also apply the quick wins from the previous article:

ALTER ACCOUNT XXX999 SET ABORT_DETACHED_QUERY = TRUE;
ALTER WAREHOUSE WH_REPORTING SET AUTO_SUSPEND = 30;
ALTER WAREHOUSE WH_REPORTING SET STATEMENT_TIMEOUT_IN_SECONDS = 600;
ALTER WAREHOUSE WH_REPORTING SET MIN_CLUSTER_COUNT = 1;
ALTER WAREHOUSE WH_REPORTING SET MAX_CLUSTER_COUNT = 4;
ALTER WAREHOUSE WH_REPORTING SET SCALING_POLICY = STANDARD;

How can you identify the account name for the first ALTER statement:

SELECT CURRENT_ACCOUNT();

As you can see, there is one parameter setting that is not applied:

ALTER WAREHOUSE WH_REPORTING SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600;

Reason for that is that we want to fix queuing with the other parameters, thus not forcing queued queries to be aborted.

About the scaling policy: we can choose wither STANDARD or ECONOMY. In my period of working with Snowflake I have never set a scaling policy to ECONOMY, as queuing will not disappear with that setting. STANDARD in my experience will fix queuing in almost all cases.

So, what will the result be:

We will see that clustered warehouses are actually doing their work:

Plus, we can check in the warehouse_events_history table how the cluster utilization over time is functioning:

select timestamp::date datum, * 
from snowflake.account_usage.warehouse_events_history
where 1 = 1
and cluster_number is not null

Resulting in a very nice chart:

This is actually a very nice case, since there are days that the cluster is maxing out to 10 nodes. This will require further investigation into the specific cases which I will not adress in this article but in future articles.

In the next article, we will digg into the other worst enemy, spilling.

--

--

Johan

I am a freelance data engineer. My main focus is Snowflake, but I am always eager to learn new things. Python and AWS are my main side interests