Engineering Roadmap

The learning and engineering trajectory — from today's production stack toward AI-native data platforms.

Every layer of this stack exists to make the next layer more trustworthy. Data engineering enables analytics engineering. Analytics engineering enables AI systems. AI systems enable autonomous decision intelligence.

Current Stack (Production)

Technologies in active production use — battle-tested, well-understood, forming the foundation of every project.

PostgreSQL: primary data warehouse (structured storage, DWH architecture, query optimisation)

Apache Airflow: pipeline orchestration (DAG design, scheduling, retry logic, monitoring)

dbt: transformation layer (medallion architecture, metric definitions, data contracts)

Metabase: reporting layer (dashboards, scheduled reports, management views)

Python: extraction, transformation, and automation; the glue between all systems

Currently Learning (Active)

Technologies under active study and experimentation — the next layer of the platform stack.

Apache Iceberg: open table format for the lakehouse layer (ACID semantics, time travel, schema evolution at scale)

Lakehouse Architecture: decoupling storage from compute, with S3 + Iceberg as the foundation for both BI and AI workloads

Semantic Layers: dbt Semantic Layer, Cube.dev; governing metric definitions as a programmatic API
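
To make "metric definitions as a programmatic API" concrete, here is a minimal sketch in plain Python. It is not the dbt Semantic Layer or Cube API; the `Metric` class, the `METRICS` registry, the `fct_orders` table, and its columns are all hypothetical names chosen for illustration. The idea it shows is that consumers ask for a metric by name and receive SQL compiled from a single governed definition, rather than writing the aggregate themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One governed metric definition (hypothetical structure)."""
    name: str
    expression: str  # SQL aggregate expression
    table: str       # source table the metric is defined on


# Hypothetical registry: the single place where metrics are defined.
METRICS = {
    "total_revenue": Metric("total_revenue", "SUM(amount)", "fct_orders"),
    "order_count": Metric("order_count", "COUNT(*)", "fct_orders"),
}

# Hypothetical allow-list of dimensions a metric may be grouped by.
ALLOWED_DIMENSIONS = {"fct_orders": {"order_date", "region"}}


def compile_metric(metric_name, group_by=()):
    """Compile a governed metric request into a SQL string."""
    metric = METRICS.get(metric_name)
    if metric is None:
        raise KeyError(f"unknown metric: {metric_name}")

    # Reject dimensions that are not part of the governed model.
    unknown = set(group_by) - ALLOWED_DIMENSIONS[metric.table]
    if unknown:
        raise ValueError(f"ungoverned dimensions: {sorted(unknown)}")

    select_cols = list(group_by) + [f"{metric.expression} AS {metric.name}"]
    sql = f"SELECT {', '.join(select_cols)} FROM {metric.table}"
    if group_by:
        sql += f" GROUP BY {', '.join(group_by)}"
    return sql


if __name__ == "__main__":
    print(compile_metric("total_revenue", ["region"]))
```

Because every caller goes through `compile_metric`, a change to the definition of `total_revenue` propagates to dashboards and agents alike, which is the governance property a real semantic layer is meant to provide.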

Future Direction (Research)

The destination: AI-native data platforms where autonomous agents reason over governed data and close the feedback loop without human intervention.

Autonomous Analytics: AI systems that proactively surface insights, detect anomalies, and generate reports, with no human in the loop

AI Agents over DWH: LLM-powered agents that query governed semantic layers and answer business questions in natural language

AI-Native Data Platforms: the full stack (lakehouse + semantic layer + AI analyst), designed from the ground up for AI reasoning
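
As a rough illustration of the "detect anomalies" step under Autonomous Analytics, the sketch below flags outliers in a metric series with a simple z-score rule. This is a deliberately basic stand-in, not a description of any particular product or of how a production system would do it; the function name, the threshold value, and the sample figures are assumptions made up for the example.

```python
from statistics import mean, stdev


def detect_anomalies(values, threshold=2.0):
    """Return indices of points whose z-score exceeds the threshold.

    Uses the sample mean and sample standard deviation of the whole
    series; the threshold is an illustrative default, not a recommendation.
    """
    if len(values) < 3:
        return []  # too few points to estimate spread meaningfully
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # constant series: nothing stands out
    return [
        i for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    # Made-up daily order counts with one obvious spike at the end.
    daily_orders = [100, 102, 98, 101, 99, 100, 250]
    print(detect_anomalies(daily_orders))
```

A single large spike inflates both the mean and the standard deviation, so a rule like this can miss anomalies in short series; a system running without a human in the loop would likely need a more robust method (for example, median-based statistics or seasonal baselines).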