
Weather Pipeline - Reference Implementation

Reference pipeline implementation for multi-layer architecture

Overview

The Weather Pipeline is a working reference implementation of the DEX framework's medallion architecture. It serves as a template for future projects such as CareerDEX.

Status: Reference implementation (actively refined)

Purpose: Learn DEX patterns, validate architecture, reuse for other projects

📚 Documentation

What You'll Learn

This reference implementation demonstrates:

  • Data Ingestion: Fetching external data (weather API)
  • Bronze Layer: Raw data storage and schema validation
  • Silver Layer: Data cleaning, validation, transformations
  • Gold Layer: Feature engineering and aggregations
  • ML Integration: Training models on processed data
  • Airflow Orchestration: DAG configuration and 3-hour cycles
  • Monitoring: Data quality checks and metrics
  • Testing: Unit and integration test patterns
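To make the Bronze, Silver, and Gold stages listed above concrete, here is a minimal sketch in plain Python. It is not the weatherdex code: the record shape, the field names (city, ts, temp_c), and the functions to_bronze, to_silver, and to_gold are all assumptions made for illustration, and the real package may split the work differently.

```python
from statistics import mean

# Hypothetical raw records, shaped like what a weather API might return.
RAW = [
    {"city": "Oslo", "ts": "2024-01-01T00:00:00", "temp_c": "-3.5"},
    {"city": "Oslo", "ts": "2024-01-01T03:00:00", "temp_c": "-1.5"},
    {"city": "Oslo", "ts": "2024-01-01T06:00:00", "temp_c": None},  # missing reading
    {"city": "Lima", "ts": "2024-01-01T00:00:00", "temp_c": "22.0"},
]

# Assumed minimal schema: every raw record must carry these keys.
REQUIRED_FIELDS = {"city", "ts", "temp_c"}


def to_bronze(records):
    """Bronze: keep records as received, rejecting only those missing required keys."""
    return [r for r in records if REQUIRED_FIELDS.issubset(r)]


def to_silver(bronze):
    """Silver: drop rows with missing readings and cast temperature to float."""
    return [
        {"city": r["city"], "ts": r["ts"], "temp_c": float(r["temp_c"])}
        for r in bronze
        if r["temp_c"] is not None
    ]


def to_gold(silver):
    """Gold: aggregate the cleaned rows into a mean temperature per city."""
    by_city = {}
    for r in silver:
        by_city.setdefault(r["city"], []).append(r["temp_c"])
    return {city: mean(values) for city, values in by_city.items()}


if __name__ == "__main__":
    print(to_gold(to_silver(to_bronze(RAW))))
```

Each stage takes the previous stage's output and nothing else, so a stage can be tested on its own with a handful of hand-written records.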

Quick Start

cd src/weatherdex

# Explore package modules (assumes weatherdex is installed or otherwise importable)
python -c "import weatherdex; print(weatherdex.__all__)"

Key Files

  • core/ - Data fetching and transformation logic
  • ml/ - ML model training and prediction
  • config/ - Configuration and schema definitions
  • notebooks/ - Analysis and exploration notebooks

How to Adapt for Your Project

  1. Copy the structure to src/{project_name}/
  2. Update the configs in config/schema_definitions.py
  3. Replace the fetchers in core/fetch_*.py
  4. Implement your business-logic transformations
  5. Add ML models in ml/
  6. Create an Airflow DAG modeled on the weather pipeline's
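Steps 2 and 3 above amount to swapping in a new schema and a new fetcher while the rest of the pipeline stays the same. The sketch below shows one way that separation could look. Everything in it is hypothetical: JOB_SCHEMA, validate, run_ingestion, and fetch_jobs_stub are illustrative names, not the contents of config/schema_definitions.py or core/fetch_*.py.

```python
from typing import Callable, Iterable

# Hypothetical schema for a new project: field name -> expected Python type.
JOB_SCHEMA = {"title": str, "company": str, "salary": float}


def validate(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


def run_ingestion(fetch: Callable[[], Iterable[dict]], schema: dict):
    """Call the project-specific fetcher and split its output into valid and rejected records."""
    valid, rejected = [], []
    for record in fetch():
        if validate(record, schema):
            rejected.append(record)
        else:
            valid.append(record)
    return valid, rejected


def fetch_jobs_stub():
    """Stand-in for a real fetcher; returns canned data instead of calling an API."""
    return [
        {"title": "Data Engineer", "company": "Acme", "salary": 100000.0},
        {"title": "Analyst", "company": "Acme"},  # missing salary
    ]


if __name__ == "__main__":
    valid, rejected = run_ingestion(fetch_jobs_stub, JOB_SCHEMA)
    print(len(valid), "valid,", len(rejected), "rejected")
```

Passing the fetcher in as a function keeps the ingestion step independent of any one data source, which also makes it easy to substitute a stub like this one in unit tests.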

Documentation Hub: See docs/docs-hub.md