Snowflake caching part 1

Johan
3 min readJan 26, 2023

--

Today I was presenting Snowflake to some smart people, there I got the question: when you run a query twice, and the second run comes from cache, but after that a record is inserted in one of the tables ….. what happens with the next query: will it use the result cache or not.

My answer: I have to figure this out. I can guess what will happen, let’s run a test to see how Snowflake behaves.

First, I created a table with some records:

6 billion rows into the new table

Then, I ran a query (select *) against the new table, running it twice to make sure it is cached in the result cache of Snowflake:

Running !!!!

First run, 16 minutes:

results

Run again, results from cache:

cache FTW
77 ms instead of 16 minutes. Really powerful, this caching.
before the insert

Then, insert some records

after the insert, 11 new records

Run the same query again:

No usage of result cache

To conclude: if, after the first query, the results cache is available for a query, and in the meantime inserts happen in the table, the result cache will not be used. This makes sense, and now we have really tested it as well. The good thing: we do not need to worry about a cache being outdated since Snowflake understands this and ignores the result cache. On the other hand, this makes the usage of the result cache less likely. Perhaps when you have a nightly batch updating tables it will work but with records streaming in during the day, the cache will most like not be used.

This post was called “Snowflake caching part 1”. In part 2 I want to dive into PowerBI filters with regards to the use of the Snowflake cache. In part 3 I will try to summarize some conclusion on how useful this Snowflake caching really is, as it might not be that useful after all. Stay tuned !!

--

--

Johan

I am a freelance data engineer. My main focus is Snowflake, but I am always eager to learn new things. Python and AWS are my main side interests