I architect cloud-native data pipelines that transform raw, scattered data into clean, trusted intelligence — powering decisions that matter.
I specialise in end-to-end data engineering on the modern data stack — from orchestrating robust Airflow ingestion workflows to building tested, documented dbt models and delivering clean Snowflake data marts that power real decisions.
With hands-on experience across Azure, Snowflake, PySpark, and dbt, I bring both engineering rigour and analytical thinking to every pipeline I build — ensuring data is not just available, but trustworthy and actionable.
Five production-grade data engineering solutions — each grounded in a real problem, a deliberate approach, and a measurable outcome.
Production-grade ELT pipeline centralising five source systems into Snowflake, transformed via a layered dbt architecture, orchestrated daily in Airflow, and served to the business through Power BI.
Stock price files are downloaded manually, cleaned up in Excel, and then uploaded somewhere for the team to query. This takes hours, is error-prone, and ultimately delays decision-making.
Three-layer dbt architecture (staging → intermediate → business-ready) on Snowflake. Daily Airflow DAGs. Automated dbt tests and documentation enforce quality end-to-end.
Event-driven streaming pipeline ingesting high-frequency data through Apache Kafka, processing it with Spark Structured Streaming, and delivering live Power BI dashboards for operational monitoring.
Modern ecommerce businesses generate thousands of customer events every second — product views, add-to-cart actions, checkouts, payments, and cancellations. Without a real-time data pipeline, these businesses face critical challenges.
Captures every ecommerce event instantly, using Apache Kafka as the streaming backbone. Processes and stores events in Snowflake using a Bronze → Silver → Gold layered architecture for clean data delivery. Delivers live business insights to stakeholders through a Power BI dashboard.
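As a rough illustration of the Bronze → Silver → Gold flow, here is a minimal plain-Python sketch. It is not the project's actual code: the event fields (`event_id`, `type`, `ts`), the sample events, and the function names `to_silver` and `to_gold` are all hypothetical, and in-memory lists stand in for the Kafka topic and Snowflake tables.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw events, as they might land untouched in a Bronze layer.
bronze = [
    {"event_id": "e1", "type": "product_view", "ts": "2024-01-01T10:00:00"},
    {"event_id": "e2", "type": "add_to_cart", "ts": "2024-01-01T10:00:05"},
    {"event_id": "e2", "type": "add_to_cart", "ts": "2024-01-01T10:00:05"},  # duplicate delivery
    {"event_id": "e3", "type": "checkout", "ts": "not-a-timestamp"},  # malformed row
    {"event_id": "e4", "type": "payment", "ts": "2024-01-01T10:01:00"},
]


def to_silver(raw_events):
    """Silver: drop duplicate event_ids and rows whose timestamp cannot be parsed."""
    seen, clean = set(), []
    for event in raw_events:
        if event["event_id"] in seen:
            continue
        try:
            ts = datetime.fromisoformat(event["ts"])
        except ValueError:
            continue  # skip malformed timestamps
        seen.add(event["event_id"])
        clean.append({**event, "ts": ts})
    return clean


def to_gold(silver_events):
    """Gold: aggregate cleaned events into a count per event type."""
    return dict(Counter(event["type"] for event in silver_events))


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

The point of the layering is that each step has one job: Bronze keeps everything as received, Silver enforces quality rules, and Gold holds only the aggregates a dashboard needs.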
Full migration from a MongoDB Atlas operational database, loading raw data into Snowflake via Airbyte.
Manual, error-prone data processing: slow queries, costly maintenance, no scalability, and zero self-serve capability for business analysts.
Redesigned schemas for Snowflake's columnar engine. Reliable pipelines transfer incremental customer data, with dbt handling data transformation.
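To show what an incremental transfer means in practice, here is a minimal plain-Python sketch of a watermark-based upsert. It is an illustration under assumptions, not the pipeline's real logic: the `customer_id`, `email`, and `updated_at` fields, the sample rows, and the `incremental_upsert` helper are hypothetical, and a dictionary stands in for the target table.

```python
def incremental_upsert(target, source_rows, last_watermark):
    """Merge rows updated after last_watermark into target, keyed by customer_id.

    Returns the new watermark (the latest updated_at value that was loaded).
    """
    new_watermark = last_watermark
    for row in source_rows:
        if row["updated_at"] <= last_watermark:
            continue  # already loaded in a previous run
        target[row["customer_id"]] = row  # insert a new key or overwrite an existing one
        new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical target state after an earlier load (ISO dates compare correctly as strings).
target = {1: {"customer_id": 1, "email": "old@example.com", "updated_at": "2024-01-01"}}

# Hypothetical source rows: one update, one new customer, one stale row.
source = [
    {"customer_id": 1, "email": "new@example.com", "updated_at": "2024-01-03"},
    {"customer_id": 2, "email": "b@example.com", "updated_at": "2024-01-02"},
    {"customer_id": 3, "email": "c@example.com", "updated_at": "2023-12-31"},
]

watermark = incremental_upsert(target, source, "2024-01-01")
print(watermark, sorted(target))
```

Only rows changed since the previous run are moved, which keeps each load small and makes reruns safe because the same row simply overwrites itself.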
An interactive Power BI dashboard designed to support the analysis of brain cancer patient data. It transforms complex clinical and demographic datasets into clear, actionable visual insights.
Identify which age groups or demographics are most affected by brain cancer. Spot delays in the care pathway that may be affecting patient survival.
Track tumour grade distribution across a patient population and filter patient data by age, gender, diagnosis date, and treatment type.
Verified technical credentials. Click any card to view or verify.
I am actively seeking Data Engineer roles where I can design scalable pipelines and work on meaningful data challenges. Recruiter, hiring manager, or fellow data professional — I would love to hear from you.