Engineering Roadmap
The learning and engineering trajectory — from today's production stack toward AI-native data platforms.
Every layer of this stack exists to make the next layer more trustworthy. Data engineering enables analytics engineering. Analytics engineering enables AI systems. AI systems enable autonomous decision intelligence.
Production: Current Stack
Technologies in active production use — battle-tested, well-understood, forming the foundation of every project.
PostgreSQL: primary data warehouse (structured storage, DWH architecture, query optimisation)
Apache Airflow: pipeline orchestration (DAG design, scheduling, retry logic, monitoring)
dbt: transformation layer (medallion architecture, metric definitions, data contracts)
Metabase: reporting layer (dashboards, scheduled reports, management views)
Python: extraction, transformation, and automation; the glue between all systems
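As a rough illustration of what "DAG design" and "retry logic" mean in the orchestration layer, the sketch below orders a small pipeline by its dependencies and retries a failing task. It uses only the Python standard library, not the Airflow API, and the task names and the `run_with_retries` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "extract_orders": set(),
    "load_raw": {"extract_orders"},
    "dbt_run": {"load_raw"},
    "refresh_dashboards": {"dbt_run"},
}


def execution_order(deps):
    """Return a run order in which every task follows its dependencies.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


def run_with_retries(task, retries=2):
    """Call task(); on failure, try again up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error


order = execution_order(DEPENDENCIES)
print(order)
```

A real orchestrator adds scheduling, backoff between attempts, and monitoring on top of these two ideas; this sketch only shows the ordering and retry core.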
Active: Currently Learning
Technologies under active study and experimentation — the next layer of the platform stack.
Apache Iceberg: open table format for the lakehouse layer, with ACID semantics, time travel, and schema evolution at scale
Lakehouse Architecture: decoupling storage from compute, with S3 + Iceberg as the foundation for both BI and AI workloads
Semantic Layers: dbt Semantic Layer and Cube.dev, governing metric definitions as a programmatic API
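To make "metric definitions as a programmatic API" concrete, here is a minimal sketch of the idea: metrics are declared once in a registry, and every consumer asks the registry for SQL rather than writing its own aggregate. This is not the dbt Semantic Layer or Cube API; the `Metric` class, the `compile_query` function, and the table and column names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str        # governed metric name exposed to consumers
    expression: str  # SQL aggregate that defines the metric
    table: str       # source table (hypothetical)


# Single source of truth for metric definitions (hypothetical example).
METRICS = {
    "revenue": Metric("revenue", "SUM(amount)", "fct_orders"),
}


def compile_query(metric_name, group_by=()):
    """Build SQL for a governed metric; unknown metrics raise KeyError."""
    metric = METRICS[metric_name]
    dims = ", ".join(group_by)
    select = f"{dims}, " if dims else ""
    sql = f"SELECT {select}{metric.expression} AS {metric.name} FROM {metric.table}"
    if dims:
        sql += f" GROUP BY {dims}"
    return sql


print(compile_query("revenue", ["order_month"]))
```

Because dashboards and agents would both go through `compile_query`, a change to the definition of `revenue` lands in one place, and a request for an undefined metric fails loudly instead of producing an ad hoc calculation.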
Research: Future Direction
The destination: AI-native data platforms where autonomous agents reason over governed data and close the feedback loop without human intervention.
Autonomous Analytics: AI systems that proactively surface insights, detect anomalies, and generate reports with no human in the loop
AI Agents over DWH: LLM-powered agents that query governed semantic layers and answer business questions in natural language
AI-Native Data Platforms: the full stack (lakehouse, semantic layer, AI analyst), designed from the ground up for AI reasoning
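The simplest building block behind "detect anomalies" can be sketched as a z-score check over a metric's history. This is only one basic statistical technique, chosen here for illustration; the `find_anomalies` function, the threshold, and the sample values are assumptions, and a production system would likely need to handle seasonality and trend.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` sample standard
    deviations from the mean. Requires at least two values."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily metric values with one obvious spike at the end.
daily_orders = [100, 102, 98, 101, 99, 100, 103, 97, 100, 250]
print(find_anomalies(daily_orders, threshold=2.5))
```

Note that a single large outlier inflates the standard deviation itself, which is why the example uses a threshold below the conventional 3.0; robust alternatives such as median-based scores avoid that effect.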