
Medallion Architecture

A layered data lakehouse design with Bronze (raw), Silver (cleansed), and Gold (aggregated) tiers.

Also known as

Lakehouse Architecture, Bronze Silver Gold

Medallion Architecture is a data design pattern used in modern data lakehouses that organises data into three progressive quality tiers:

Tier    Also called            Contents
Bronze  Raw / Landing          Ingested data in its original form: no transformations, append-only.
Silver  Cleansed / Enriched    Deduplicated, validated, and joined data ready for analysis.
Gold    Aggregated / Business  Business-level aggregations, KPIs, and ML feature stores.
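
To make the three tiers concrete, here is a minimal sketch in plain Python, with no lakehouse engine involved. The order records, the field names, and the helper functions `to_silver` and `to_gold` are all hypothetical and chosen only for illustration. A real pipeline would typically express the same steps in Spark or dbt.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in Bronze: untouched and
# append-only, including a duplicate (order 1) and an invalid record (order 3).
bronze = [
    {"order_id": 1, "customer": "acme", "amount": "120.50"},
    {"order_id": 1, "customer": "acme", "amount": "120.50"},
    {"order_id": 2, "customer": "acme", "amount": "30.00"},
    {"order_id": 3, "customer": "globex", "amount": None},
    {"order_id": 4, "customer": "globex", "amount": "75.25"},
]


def to_silver(rows):
    """Deduplicate on order_id, drop rows without an amount, cast amount to float."""
    seen = set()
    silver = []
    for row in rows:
        if row["amount"] is None or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append({**row, "amount": float(row["amount"])})
    return silver


def to_gold(rows):
    """Aggregate cleansed rows into a business-level metric: revenue per customer."""
    revenue = defaultdict(float)
    for row in rows:
        revenue[row["customer"]] += row["amount"]
    return dict(revenue)


silver = to_silver(bronze)  # Bronze itself is never modified
gold = to_gold(silver)
print(gold)  # {'acme': 150.5, 'globex': 75.25}
```

Because Bronze is left unchanged, Silver and Gold can be rebuilt from it at any time, for example after a fix to the validation rules.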

Why it matters

  • Auditability — the Bronze layer preserves the raw source of truth for replay and debugging.
  • Incremental quality — each hop adds value without discarding history.
  • Decoupling — downstream consumers query Gold without coupling to upstream raw formats.

Typical implementation

An orchestrator such as Apache Airflow typically runs DAGs that trigger Spark or dbt jobs, each of which transforms data from one tier to the next. The jobs write to Delta Lake or Apache Iceberg tables stored in object storage (for example S3, MinIO, or GCS).
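
The dependency ordering that an orchestrator enforces can be sketched with Python's standard-library `graphlib`. This is not Airflow's API, only a stand-in for the idea that each tier's job runs after the one it depends on. The task names and the `run_task` function are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each maps to the set of tasks it depends on.
dependencies = {
    "ingest_bronze": set(),
    "build_silver": {"ingest_bronze"},
    "build_gold": {"build_silver"},
}

run_log = []


def run_task(name):
    # Stand-in for submitting a Spark or dbt job; here it only records the call.
    run_log.append(name)


# static_order() yields tasks so that every dependency comes before its dependents.
for task in TopologicalSorter(dependencies).static_order():
    run_task(task)

print(run_log)  # ['ingest_bronze', 'build_silver', 'build_gold']
```

A production orchestrator adds scheduling, retries, and monitoring on top of this ordering, which the sketch leaves out.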

See also

Data Lakehouse, Apache Spark, dbt