This guide defines how all Python code for dfo must be structured, formatted, named, and organized.
It is optimized for:
- simplicity
- long-term maintainability
- clear modular boundaries
- predictable abstractions
- minimal future refactoring
- strong typing + Pydantic
- DuckDB-based data flow
- CLI-first workflows
This document incorporates patterns and lessons learned from fo.ai, FNX, and your consulting engineering workflow.
- No hidden magic
- No implicit side effects
- No unpredictable behavior
- Every function’s purpose should be obvious at a glance
- Max file size ≈ 200–250 lines
- Max function ≈ 30–40 lines
- Break functions early and often
- One file → one responsibility
Each layer in the project has exactly one responsibility:
| Layer | Responsibility |
|---|---|
| core | config, auth, domain models |
| providers | cloud SDK calls only |
| discover | collect raw inventory; write to DuckDB |
| analyze | pure FinOps logic; write to DuckDB |
| report | generate human/JSON outputs |
| execute | apply actions; log to DuckDB |
| db | DuckDB read/write utilities |
| cli | orchestrate user commands |
Follow the dependency direction:

```
core → providers → discover → analyze → report → execute → cli
  ↑                                                  ↓
  +---------------------- db ------------------------+
```
- All external or cross-layer data must use Pydantic models
- No passing raw dicts between layers
- Models ensure safety and readability
Files are always `snake_case.py`, with functional names, not clever ones: `compute.py`, `idle_vms.py`, `json_report.py`.
`CamelCase` is used only for:
- domain entities
- provider wrappers
- the future pipeline engine

Functions are `snake_case` and always verbs: `list_vms()`, `get_cpu_metrics()`, `insert_inventory()`, `analyze_idle()`.
Constants and environment variables are all caps, prefixed with `DFO_`:

```
DFO_IDLE_CPU_THRESHOLD
DFO_DRY_RUN_DEFAULT
DFO_DUCKDB_FILE
```
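The variables above might be loaded into a typed settings object along these lines. This is a stdlib-only sketch; the `Settings` fields, `load_settings` helper, and all default values are illustrative assumptions (a `pydantic-settings` `BaseSettings` class would be the fuller Pydantic-based approach):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    # Defaults here are illustrative assumptions, not dfo's real values.
    idle_cpu_threshold: float = 5.0
    dry_run_default: bool = True
    duckdb_file: str = "dfo.duckdb"

def load_settings() -> Settings:
    """Read DFO_-prefixed environment variables with typed fallbacks."""
    return Settings(
        idle_cpu_threshold=float(os.getenv("DFO_IDLE_CPU_THRESHOLD", "5.0")),
        dry_run_default=os.getenv("DFO_DRY_RUN_DEFAULT", "true").lower() == "true",
        duckdb_file=os.getenv("DFO_DUCKDB_FILE", "dfo.duckdb"),
    )
```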
```python
# Standard library
import os
import json

# Third-party
import duckdb
from azure.identity import DefaultAzureCredential

# Internal
from dfo.core.config import settings
from dfo.providers.azure.compute import list_vms
```
❌ `from x import *`
✔ Explicit imports only.
Avoid function-local imports:

```python
def foo():
    from x.y import bar  # avoid; import at module top instead
```

Model all cross-layer data with Pydantic. This includes:
- VM inventory
- Idle analysis findings
- Action records
- Keep models small (< 12–15 fields)
- Break into submodels if needed
Example fields:

```python
cpu_avg: float
tags: dict[str, str]
```

Never JSON-serialize manually; let Pydantic handle it.
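A sketch of a small model along these lines, assuming Pydantic v2. The `VmInventory` name and the fields beyond the two shown above are illustrative, not dfo's real schema:

```python
from pydantic import BaseModel

class VmInventory(BaseModel):
    # Field set is an illustrative assumption; keep it under ~12-15 fields.
    vm_id: str
    name: str
    cpu_avg: float
    tags: dict[str, str]

vm = VmInventory(vm_id="vm-1", name="web-01", cpu_avg=2.5, tags={"env": "dev"})
payload = vm.model_dump_json()  # Pydantic serializes; no manual json.dumps
```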
All DB operations must go through `dfo/db/duck.py`.
Table definitions live in version-controlled `.sql` files (`schema.sql`).
Always name columns explicitly in `INSERT` statements.
Convert rows to Pydantic models before returning them.
Azure provider modules may only:
- call SDK
- normalize results
Not allowed:
- analysis
- cost estimation
- DB writes
- remediation decisions
No global variables.
No cached state except Azure credentials.
This layer collects raw data only.
Discover writes to the `vm_inventory` table.
Analysis functions:
- take inputs
- return results
- do not mutate state
- do not call Azure
- do not write logs
Use Pydantic models for findings.
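A sketch of what "pure" means in practice. The `analyze_idle` name, its signature, and the 5.0 default threshold are illustrative assumptions (dfo would read `DFO_IDLE_CPU_THRESHOLD`, and real findings would be Pydantic models rather than bare strings):

```python
def analyze_idle(
    cpu_by_vm: dict[str, float],
    threshold: float = 5.0,  # illustrative default, not dfo's real value
) -> list[str]:
    """Pure function: inputs in, findings out. No Azure calls, no DB
    writes, no logging, and no mutation of its arguments."""
    return sorted(vm_id for vm_id, cpu in cpu_by_vm.items() if cpu < threshold)

idle = analyze_idle({"vm-1": 1.2, "vm-2": 80.0, "vm-3": 4.9})
# idle == ["vm-1", "vm-3"]
```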
Reports read only from the `vm_idle_analysis` table. Outputs:
- console rendering (Rich)
- JSON output
Future:
- HTML
- CSV
- charts
Execution does not inspect raw Azure data.
Default: `dry_run=True`.
Record each action in `vm_actions`:
- vm_id
- action
- status
- executed_at
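A stdlib sketch of the dry-run default and the record fields listed above; `VmAction`, `stop_vm`, and the status strings are illustrative assumptions, not dfo's real API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class VmAction:  # mirrors the vm_actions columns listed above
    vm_id: str
    action: str
    status: str
    executed_at: str

def stop_vm(vm_id: str, dry_run: bool = True) -> VmAction:
    """Sketch of an execute-layer call: dry_run defaults to True, per the
    rule above. Status strings here are illustrative assumptions."""
    status = "skipped (dry run)" if dry_run else "stopped"
    # A real implementation would call the provider layer when not in
    # dry-run mode, then insert this record via dfo/db/duck.py.
    return VmAction(
        vm_id=vm_id,
        action="stop",
        status=status,
        executed_at=datetime.now(timezone.utc).isoformat(),
    )
```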
Examples:

```
dfo azure discover vms
dfo azure analyze idle-vms
dfo azure report idle-vms
dfo azure execute stop-idle-vms
```
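The nested command shape above could be parsed along these lines. The framework choice is an assumption made for this sketch (stdlib `argparse`; a real CLI might use Click or Typer), and `build_parser` is an illustrative name:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of `dfo <provider> <verb> <target>` with nested subparsers."""
    parser = argparse.ArgumentParser(prog="dfo")
    providers = parser.add_subparsers(dest="provider", required=True)
    azure = providers.add_parser("azure")
    verbs = azure.add_subparsers(dest="verb", required=True)
    for verb, targets in {
        "discover": ["vms"],
        "analyze": ["idle-vms"],
        "report": ["idle-vms"],
        "execute": ["stop-idle-vms"],
    }.items():
        sub = verbs.add_parser(verb)
        sub.add_argument("target", choices=targets)
    return parser

args = build_parser().parse_args(["azure", "analyze", "idle-vms"])
# args.provider == "azure", args.verb == "analyze", args.target == "idle-vms"
```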
CLI orchestrates:
- settings load
- layer calls
- DB interactions
Auth failures stop immediately.
Always surface meaningful errors.
Good: `Authentication failed: invalid client ID.`
Bad: `Something went wrong.`
- Use Python logging
- Default INFO
- DEBUG for troubleshooting
- No prints inside modules (except CLI presentation)
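A minimal sketch of those logging rules; the `get_logger` helper and its `debug` flag are illustrative assumptions:

```python
import logging

def get_logger(name: str, debug: bool = False) -> logging.Logger:
    """Module loggers default to INFO; DEBUG is opt-in for troubleshooting."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    return logger

log = get_logger("dfo.discover.vms")
log.info("discovered %d vms", 3)  # a logging call, never print(), inside modules
```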
Future AWS support mirrors the Azure provider layout: `providers/aws/`.
Follow:
read → analyze → write
Pipeline orchestrator will layer on top, not replace modules.
| Layer | Must Do | Must Not Do |
|---|---|---|
| core | config, auth, models | no provider calls |
| providers | call Azure SDK | no analysis or DB writes |
| discover | gather raw data | no analysis logic |
| analyze | pure logic | no Azure or DB writes |
| report | render outputs | no analysis |
| execute | apply actions | no discovery |
| db | read/write DB | no cloud logic |
This code style ensures dfo remains clean, consistent, scalable, and maintainable as it evolves from MVP → full platform.