Security Scanning¶
DataEngineX uses a multi-layered security scanning pipeline across CI/CD workflows.
Tools¶
| Tool | Purpose | Workflow |
|---|---|---|
| Trivy | Repository misconfiguration & secret scanning | security.yml |
| Trivy | Container image vulnerability scanning | docker build (local) |
| CodeQL | Static analysis (Python + GitHub Actions) | security.yml |
| pip-audit | Python dependency vulnerability audit | poe security |
| CycloneDX | SBOM generation (attached to releases) | release-*.yml |
| Dependabot | Automated dependency updates | .github/dependabot.yml |
How It Works¶
On Every Push / PR (security.yml)¶
- Trivy repo scan — scans the filesystem for misconfigurations and leaked secrets (SARIF uploaded to GitHub Security tab).
- Trivy misconfig gate — fails the build if HIGH or CRITICAL misconfigurations are found.
- CodeQL — static analysis for Python code and GitHub Actions workflows.
Local Container Scan¶
- Trivy container scan — run manually to scan the built Docker image for OS and library vulnerabilities:
On Release (release-dataenginex.yml)¶
- CycloneDX SBOM — generates a Software Bill of Materials in CycloneDX JSON format.
- SBOM upload — attaches the SBOM as a release asset on GitHub Releases.
Local / CI (poe tasks)¶
Reviewing Results¶
- GitHub Security tab — Trivy SARIF results and CodeQL alerts appear under Security → Code scanning alerts.
- GitHub Releases — each release includes an
sbom-*.jsonCycloneDX artifact. - CI logs — Trivy table output is visible in workflow run logs.
- Dependabot — automated PRs for vulnerable or outdated dependencies (pip + GitHub Actions, weekly).
Acting on Findings¶
- CRITICAL / HIGH vulnerabilities — fix immediately. Trivy and pip-audit will block the build.
- MEDIUM / LOW vulnerabilities — triage in the next sprint. Document accepted risks.
- SBOM distribution — share the CycloneDX JSON with downstream consumers for supply-chain transparency.
- Dependabot PRs — review and merge promptly; they appear as standard pull requests.