Staging model
Also: staging layer · stg
The first dbt layer: one staging model per source table that does light cleanup — renaming columns, casting types, basic standardization — with no joins or business logic.
Staging models are the foundation of a tidy dbt project. The rule is one staging model per source object, doing only cosmetic work: consistent column names, correct data types, trivial cleaning (trimming strings, standardizing booleans). No joins, no aggregations, no business logic.
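To make "cosmetic work only" concrete, here is a minimal sketch of a staging model written as a plain SQL view and run through Python's built-in `sqlite3` so it is self-contained. The table `raw_customers`, its columns, and the sample rows are hypothetical. In an actual dbt project the `SELECT` would live in its own model file and read from the source through dbt's `source()` function rather than a hard-coded table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw source table with typical quirks: inconsistent names,
# everything stored as text, stray whitespace, ad hoc boolean flags.
conn.execute(
    "CREATE TABLE raw_customers (ID TEXT, FullName TEXT, IsActive TEXT, SignupDt TEXT)"
)
conn.executemany(
    "INSERT INTO raw_customers VALUES (?, ?, ?, ?)",
    [
        ("1", "  Ada Lovelace ", "Y", "2024-01-05"),
        ("2", "Alan Turing", "n", "2024-02-11"),
    ],
)

# Staging view: rename, cast, trim, standardize.
# No joins, no aggregations, no business logic.
conn.execute("""
    CREATE VIEW stg_customers AS
    SELECT
        CAST(ID AS INTEGER)                                  AS customer_id,
        TRIM(FullName)                                       AS customer_name,
        UPPER(TRIM(IsActive)) IN ('Y', 'YES', 'TRUE', '1')   AS is_active,
        DATE(SignupDt)                                       AS signed_up_on
    FROM raw_customers
""")

rows = conn.execute("SELECT * FROM stg_customers ORDER BY customer_id").fetchall()
for row in rows:
    print(row)
```

Each output column maps to exactly one input column, which is a quick way to check that a staging model has not drifted into business logic.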
The payoff is that every downstream model builds on a clean, predictable interface instead of raw source quirks. When a source system renames a column, you fix it in exactly one staging model and everything downstream keeps working.
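The one-place fix can be sketched as follows, again using `sqlite3` as a stand-in for the warehouse. The names `raw_customers`, `email_addr`, and `contact_email` are hypothetical, and the `build` helper only simulates editing the single staging model after the source renames a column. The point to notice is that the downstream query is the same string in both cases.

```python
import sqlite3

# The downstream model only ever sees the staging interface.
DOWNSTREAM_QUERY = "SELECT customer_id, email FROM stg_customers ORDER BY customer_id"


def build(conn, source_email_column):
    """Create a hypothetical raw table and the staging view on top of it.

    The raw column name appears only inside the staging view, so it is
    the only definition that changes when the source renames the column.
    """
    conn.execute(
        f"CREATE TABLE raw_customers (id INTEGER, {source_email_column} TEXT)"
    )
    conn.execute("INSERT INTO raw_customers VALUES (1, ' Ada@Example.com ')")
    conn.execute(f"""
        CREATE VIEW stg_customers AS
        SELECT
            id AS customer_id,
            LOWER(TRIM({source_email_column})) AS email
        FROM raw_customers
    """)


# Before the source system's change.
before = sqlite3.connect(":memory:")
build(before, "email_addr")

# After the source renames the column: only the staging view was edited.
after = sqlite3.connect(":memory:")
build(after, "contact_email")

result_before = before.execute(DOWNSTREAM_QUERY).fetchall()
result_after = after.execute(DOWNSTREAM_QUERY).fetchall()
print(result_before == result_after)
```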
The convention is to prefix them `stg_`, group them by source, and materialize them as views: a view costs almost nothing to build and always reflects the current source data.
Staging is the seam that decouples your warehouse from messy, changing source systems. A disciplined staging layer means a source schema change is a one-file fix, not a project-wide fire drill.
It also makes every downstream model simpler and easier to review, because each one starts from clean, consistently named inputs.
Common mistakes:
- Doing joins or business logic in staging. That belongs in intermediate or mart models.
- Creating more than one staging model per source object, which muddies the one-to-one mapping.
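As a rough illustration of where the line falls, the sketch below keeps one staging view per raw table and pushes the join and aggregation into a separate downstream view. All table, column, and view names are hypothetical, and `sqlite3` stands in for the warehouse. In a dbt project each view would be its own model file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw tables, one per source object.
    CREATE TABLE raw_customers (id INTEGER, name TEXT);
    CREATE TABLE raw_orders (id INTEGER, cust INTEGER, amount_cents INTEGER);
    INSERT INTO raw_customers VALUES (1, ' Ada '), (2, 'Alan');
    INSERT INTO raw_orders VALUES (10, 1, 2500), (11, 1, 1500), (12, 2, 900);

    -- Staging: exactly one view per raw table, cosmetic changes only.
    CREATE VIEW stg_customers AS
        SELECT id AS customer_id, TRIM(name) AS customer_name
        FROM raw_customers;

    CREATE VIEW stg_orders AS
        SELECT id AS order_id, cust AS customer_id, amount_cents
        FROM raw_orders;

    -- Downstream (intermediate or mart) model: joins and aggregation go here.
    CREATE VIEW customer_order_totals AS
        SELECT
            c.customer_id,
            c.customer_name,
            SUM(o.amount_cents) AS total_amount_cents
        FROM stg_customers AS c
        JOIN stg_orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.customer_name;
""")

totals = conn.execute(
    "SELECT * FROM customer_order_totals ORDER BY customer_id"
).fetchall()
for row in totals:
    print(row)
```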
- Should staging models be views or tables? Usually views: they're cheap to build and always reflect the latest source data. Materialize as tables only if a staging model is queried heavily enough that view overhead matters.
