Exploring the Transformation Layer: A dbt Primer

John A. Wilson November 2023 Case Study | Analytics Engineering | dbt | Snowflake

What the dbt quickstart with Snowflake taught me about building analytics you can actually trust

Case Study | Analytics Engineering | dbt | Snowflake

November 2023

"dbt helps you solve hard problems once." — Tristan Handy, Founder, DBT

Seeking simplicity and control, many businesses in recent years have migrated to a tool called dbt for analytics engineering, or more specifically what is called orchestration. This enables less technical analysts to manipulate data in a way that allows for version control, problem-solving, and change management. The problem is that the layer between the source system and the dashboard — the transformation layer — in a lot of organizations becomes tangled and hard to update.

In November 2023 I went through the dbt quickstart training with Snowflake, partly out of curiosity and partly because I'd been thinking about better ways to structure the data work I do for operators in senior living and healthcare. The training itself is intentionally simple. Its demo models are almost trivially basic. What made it concrete for me was a set of ideas about how analytics SQL should be organized that I'd been circling for a while without a clean framework for expressing.

What dbt Actually Does

Every transformation in dbt is a .sql file that lives in a models/ directory — which means it lives in version control. The first demo model from the quickstart looks like this:

{{ config(materialized='table') }}

with source_data as (
    select 1 as id
    union all
    select null as id
)

select *
from source_data

The {{ config(...) }} block at the top declares how the model gets materialized in your warehouse. Switching from a table to a view is a one-line change — explicit, versioned, and visible to anyone who opens the file. The second model introduces ref(), which is how dbt manages dependencies between models:

select *
from {{ ref('my_first_dbt_model') }}
where id = 1

By referencing another model this way, dbt builds a dependency graph across your entire project and resolves execution order automatically. And paired with each model is a schema.yml file where you declare tests — unique, not_null — that run every time you build. For operators running analytics across systems like PointClickCare and Yardi, where census numbers and financial KPIs have real operational consequences, that kind of automated validation isn't a luxury.

dbt

Takeaways

What the quickstart made clear is that the three most common ways analytics layers fall apart — broken dependencies, silent data quality regressions, and logic that lives only in someone's head — all have straightforward answers once you have a framework that treats SQL like software. Version control, ref(), and schema tests don't solve every problem, but they address the ones that cause the most pain. The CTE-first style dbt encourages is worth internalizing independently: transformation logic written as a series of named steps is just easier to maintain than the equivalent buried in nested subqueries. It's a small discipline that compounds over time.

As a side note, I have always respected the team at DBT for starting as data consultants like me (they were formerly Fishtown Analytics). They listened to their customers and eventually became a world-class product that revolutionized their industry. May we all be so lucky.

If you're interested in seeing all the code yourself, visit my github: https://github.com/armistead/dbt_demo

If you're working through how to bring more structure to your analytics layer — whether you're on Snowflake, DBT, or something else entirely — I'd be glad to compare notes. Reach out at hello@terlina.com.

← Back to Blog