Staging model

Also: staging layer · stg

The first dbt layer: one staging model per source table that does light cleanup — renaming columns, casting types, basic standardization — with no joins or business logic.

Staging models are the foundation of a tidy dbt project. The rule is one staging model per source object, doing only cosmetic work: consistent column names, correct data types, trivial cleaning (trimming strings, standardizing booleans). No joins, no aggregations, no business logic.
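As a rough illustration of "cosmetic work only", the sketch below builds a staging-style view with Python's built-in `sqlite3` rather than a real dbt project. The table `raw_customers`, its column names, and the view name `stg_shop__customers` are all hypothetical. In dbt, the `SELECT` would live in its own model file and read from a declared source instead of a hard-coded table name.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw table, loaded as-is from a source system.
    CREATE TABLE raw_customers (
        "CustID"     TEXT,
        "FullName"   TEXT,
        "IsActive"   TEXT,
        "SignupDate" TEXT
    );
    INSERT INTO raw_customers VALUES
        ('1', '  Ada Lovelace ', 'Y', '2024-01-05'),
        ('2', 'Grace Hopper',    'n', '2024-02-11');

    -- Staging view: rename, cast, trim, standardize.
    -- No joins, no aggregations, no business logic.
    CREATE VIEW stg_shop__customers AS
    SELECT
        CAST("CustID" AS INTEGER)                            AS customer_id,
        TRIM("FullName")                                     AS full_name,
        -- SQLite has no boolean type, so 1/0 stands in for true/false.
        CASE WHEN UPPER("IsActive") = 'Y' THEN 1 ELSE 0 END  AS is_active,
        DATE("SignupDate")                                   AS signed_up_on
    FROM raw_customers;
""")

rows = conn.execute(
    "SELECT customer_id, full_name, is_active, signed_up_on "
    "FROM stg_shop__customers ORDER BY customer_id"
).fetchall()

for row in rows:
    print(row)
```

Every output column maps to exactly one input column, which is a quick check that a model is still doing staging work and nothing more.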

The payoff is that every downstream model builds on a clean, predictable interface instead of raw source quirks. When a source system renames a column, you fix it in exactly one staging model and everything downstream keeps working, because the staging model's own output column names do not change.
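The one-file fix can be simulated in a few lines. This is a minimal sketch using Python's built-in `sqlite3`, with hypothetical names throughout (`raw_customers`, `stg_customers`, `EmailAddr`, `contact_email`). The source rename is imitated by reloading the raw table with a new column name. Only the view definition is edited; the downstream query is never touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical downstream query: it only ever reads the staging view.
DOWNSTREAM_SQL = "SELECT customer_id, email FROM stg_customers ORDER BY customer_id"

# Before the change: the source exposes the address as "EmailAddr".
conn.executescript("""
    CREATE TABLE raw_customers ("CustID" INTEGER, "EmailAddr" TEXT);
    INSERT INTO raw_customers VALUES (1, ' Ada@Example.com ');

    CREATE VIEW stg_customers AS
    SELECT "CustID" AS customer_id, LOWER(TRIM("EmailAddr")) AS email
    FROM raw_customers;
""")
before = conn.execute(DOWNSTREAM_SQL).fetchall()

# The source system now ships the column as "contact_email" (simulated by a
# reload). The only change on our side is the column reference in the view.
conn.executescript("""
    DROP VIEW stg_customers;
    DROP TABLE raw_customers;
    CREATE TABLE raw_customers ("CustID" INTEGER, "contact_email" TEXT);
    INSERT INTO raw_customers VALUES (1, ' Ada@Example.com ');

    CREATE VIEW stg_customers AS
    SELECT "CustID" AS customer_id, LOWER(TRIM("contact_email")) AS email
    FROM raw_customers;
""")
after = conn.execute(DOWNSTREAM_SQL).fetchall()

print(before == after)
```

Because the view keeps emitting `customer_id` and `email`, the downstream query returns the same rows before and after the rename.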

Convention is to prefix them `stg_`, group them by source, and materialize them as views, which store no data, cost almost nothing to build, and always reflect the current source rows.

Why it matters

Staging is the seam that decouples your warehouse from messy, changing source systems. A disciplined staging layer means a source schema change is a one-file fix, not a project-wide fire drill.

It also makes every downstream model simpler and more reviewable, because each one starts from clean, consistently named inputs.

Common mistakes
  • Doing joins or business logic in staging — that belongs in intermediate/mart models.
  • Creating more than one staging model per source object, which muddies the one-to-one mapping.

FAQ
Should staging models be views or tables?
Usually views. A view stores no data, costs almost nothing to build, and always reflects the current source data. Materialize as tables only if a staging model is queried heavily enough that recomputing the view on every query becomes a real cost.
