
Quickstart

Get a DataEngineX pipeline running in under five minutes.

1. Install

pip install dataenginex
# or from source:
git clone https://github.com/TheDataEngineX/dataenginex && cd dataenginex
uv sync

2. Create a config file

dex.yaml:

project:
  name: my-first-pipeline
  version: "0.1.0"

data:
  sources:
    raw_users:
      type: csv
      path: data/users.csv
  pipelines:
    clean_users:
      source: raw_users
      transforms:
        - type: filter
          condition: "age > 0"
      destination: silver.users
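
The config above reads `data/users.csv`, which the quickstart does not create. As a minimal sketch, the standard-library script below writes a small sample file and previews which rows the `age > 0` condition would keep. The `id` and `name` columns and the helper names `write_sample_users` and `preview_filter` are assumptions made for illustration; only the `age` column is implied by the config. The preview is plain Python, not how DataEngineX evaluates the filter internally.

```python
import csv
from pathlib import Path


def write_sample_users(path: Path) -> None:
    """Write a tiny users.csv. Columns other than `age` are invented."""
    rows = [
        {"id": 1, "name": "Ada", "age": 36},
        {"id": 2, "name": "Bob", "age": 0},
        {"id": 3, "name": "Cy", "age": -5},
        {"id": 4, "name": "Dee", "age": 41},
    ]
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name", "age"])
        writer.writeheader()
        writer.writerows(rows)


def preview_filter(path: Path) -> list[dict[str, str]]:
    """Return the rows with age > 0, mirroring the filter condition."""
    with path.open(newline="", encoding="utf-8") as f:
        return [row for row in csv.DictReader(f) if int(row["age"]) > 0]


if __name__ == "__main__":
    csv_path = Path("data/users.csv")
    write_sample_users(csv_path)
    kept = preview_filter(csv_path)
    print(f"{len(kept)} rows pass the age > 0 filter")
```

Running it once from the project root leaves a `data/users.csv` in place for the later steps to read.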

3. Validate

dex validate dex.yaml
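
This page does not list the checks that `dex validate` performs. As a rough illustration of the kind of cross-reference a config validator might enforce, the sketch below mirrors `dex.yaml` as a plain Python dict and confirms that each pipeline names a defined source and a destination. The function `find_config_problems` is hypothetical and not part of DataEngineX.

```python
# dex.yaml from step 2, written out as a plain dict for illustration.
CONFIG = {
    "project": {"name": "my-first-pipeline", "version": "0.1.0"},
    "data": {
        "sources": {
            "raw_users": {"type": "csv", "path": "data/users.csv"},
        },
        "pipelines": {
            "clean_users": {
                "source": "raw_users",
                "transforms": [{"type": "filter", "condition": "age > 0"}],
                "destination": "silver.users",
            },
        },
    },
}


def find_config_problems(config: dict) -> list[str]:
    """Hypothetical check: return human-readable problems, empty if none."""
    problems = []
    data = config.get("data", {})
    sources = data.get("sources", {})
    for name, pipeline in data.get("pipelines", {}).items():
        source = pipeline.get("source")
        if source not in sources:
            problems.append(f"pipeline {name!r}: unknown source {source!r}")
        if not pipeline.get("destination"):
            problems.append(f"pipeline {name!r}: missing destination")
    return problems


if __name__ == "__main__":
    issues = find_config_problems(CONFIG)
    print("config looks consistent" if not issues else "\n".join(issues))
```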

4. Use DexEngine in your application

from dataenginex.engine import DexEngine

engine = DexEngine("dex.yaml")

# Run a pipeline
result = engine.run_pipeline("clean_users")

# Query sources
schema = engine.source_schema("raw_users")
sample = engine.source_sample("raw_users", limit=5)

# Inspect warehouse
layers = engine.warehouse_layers()
tables = engine.warehouse_tables("silver")

# Check history
runs = engine.store.list_pipeline_runs(limit=10)

5. Run examples

Each example is a standalone script:

uv run python examples/01_hello_pipeline.py   # Minimal pipeline
uv run python examples/03_quality_gate.py     # Quality checks
uv run python examples/04_ml_training.py      # ML training + registry
uv run python examples/05_rag_demo.py         # RAG pipeline
uv run python examples/06_llm_quickstart.py   # LLM providers

# Example 02 builds a FastAPI app on top; it needs fastapi and uvicorn
uv run --with fastapi --with uvicorn python examples/02_api_quickstart.py

6. Run tests

uv run poe test       # Full test suite
uv run poe lint       # Lint with Ruff
uv run poe typecheck  # mypy strict
uv run poe check-all  # All of the above

Next Steps

  • Architecture — medallion layers, backend registry, DexEngine
  • Development Guide — editor config and workflow
  • API Reference — auto-generated module docs
  • examples/ directory — full list of runnable examples

DEX Studio

DEX Studio is an optional web UI that loads a dex.yaml and provides a single control plane for Data / ML / AI / System. It uses dataenginex directly as a library, so no separate server process is needed.

pip install dex-studio
dex-studio                               # serve on http://localhost:7860
dex-studio --config /path/to/dex.yaml   # open a specific project

See dex-studio for details.