Snowflake Dynamic Tables
This private preview (PrPr) feature offers a nice Change Data Capture (CDC)-like feature on Snowflake tables, that you would normally solve with Streams&Tasks.
To set this up is really easy (for a first demo). First, create a ‘source’ table:
create or replace table t1
as
select 1 as sleutel
Now, create a dynamic table that ‘receives’ all records from table t1 on a CDC basis, with a lag of 1 minute:
create or replace dynamic table t2
lag = '1 minute'
warehouse = <your warehouse>
as
select sleutel
from t1;
Then, start doing some dummy inserts into table t1:
insert into t1 (select 5);
insert into t1 (select 6);
Initially, no records are added to table t2, only records 1–5 are in t1:
But, after 1 minute, the records are also in table t2:
To conclude this first part of the demo on dynamic tables, it is surely very easy to set up. You only need the PrPr option enabled by your Snowflake Account-team. That might be the largest part of your effort :)
If the option is not enabled, you get this message when executing the script:
After this, I started thinking about more advanced CDC capabilities like entering the current_timestamp() into t2 at the moment of loading the record, using this code to create the dynamic table:
create or replace dynamic table t2
lag = '1 minute'
warehouse = <you warehouse>
as
select sleutel
, current_timestamp() as tijd
from t1;
When I execute this code, I get an error message:
Based on the error above, I tried some additional approaches, where 2 new errormessages showed up:
This makes sense, and is easy to fix: select specific columns instead of *. This is, in general, a very good idea to select specific columns:
- for a columnstore database such as Snowflake, select * is the worst possible query profile
- a select * can break if the underlying table definition changes
This is a more serious issue and causes me to come to a possible halt here as far as dynamic tables is concerned: my end-goal in this exercise was to fill t2 with the timestamp of the insert. Tried 2 approaches:
- the current_time() as part of the dynamic table DDL -> not allowed
- the current_time() as part of a view -> not allowed
Need to figure out how to get this working …… Any ideas ?
The link to the Snowflake blog on dynamic tables: Dynamic Tables: Delivering Declarative Streaming Data Pipelines with Snowflake