Resume

Data Platform Engineer

I build data platforms for ingestion, metadata, governance, query execution, orchestration, discovery, and downstream product use. The focus is durable architecture, clear ownership, and dependable operation.

Current content refreshed May 2026.

Projects

Relevant projects

Governed conversational data platform

I architected and built a governed conversational data and visualization agent. It retrieves business knowledge, answers business questions, runs governed queries from that context, reasons over results, and builds charts, all without making the LLM the data boundary.

Python, FastAPI, LangGraph, Langfuse

Moved analytics discovery toward self-service by connecting a multi-LLM graph runtime with knowledge retrieval, governed Trino execution, answer reasoning, chart generation, prompt tracing, and a richer Open WebUI experience. The natural-language analytics path reduced query time by about 40% for supported workflows.

Telecom network data lake

Designed and implemented data lake and warehouse foundations for mobile tower network events at petabyte scale, handling roughly five trillion events per day.

Flink, Kafka, Java, Elastic Stack

Created the analytics foundation for network data products, reporting, and large-scale downstream consumption.

Hive metastore synchronization and metadata governance

Designed and built services that keep Hive metadata consistent across independent environments using real-time listener sync, daily reconciliation, expiry cleanup, one-time interval jobs, observability, and deployment hardening.

Java, Spark, Airflow, Apache Hive

Turned fragile metadata drift into an owned synchronization path with real-time event propagation, reconciliation, recovery behavior, logs, and platform deployment controls.

Browsing-log analytics and safe-browsing pipelines

Built browsing-log ingestion and analytics pipelines for safe-browsing classification, audience management, cohort creation, and pattern-based downstream data products.

NiFi, Spark, Airflow, Python

Turned raw browsing activity into governed analytical signals: threat classification, spam URL marking, audience segments, browsing-pattern cohorts, and reusable data products.

Experience

Relevant experience

Data Platform Engineer / Architect · Airtel Digital

I architect and build governed data platforms, metadata services, workflow orchestration, access-governance integrations, in-house CI/CD onboarding, and network-scale analytics systems.

2021 - present

Designed and implemented Hive metadata synchronization for independent Hive environments, combining real-time listener sync, daily reconciliation, missed-addition/removal repair, sync-duration expiry cleanup, one-time interval jobs, observability, and deployment hardening.
Built governed self-service data platform capabilities around Kyuubi customizations, Trino query access, DataHub metadata, dbt transformations, Metabase BI, RBAC, and secrets management.
Built mobile tower network-event analytics, browsing-log data products, and real-time point-of-interest proximity pipelines, including petabyte-scale aggregation, safe-browsing classification, audience cohorts, and location-triggered downstream actions.

Data Science Engineer · dunnhumby

I worked between data science and big-data platform teams, turning statistical and ML analysis into reusable pipelines, data marts, reporting products, and client-ready analytics workflows.

2018 - 2021

Bridged data science and big-data platform teams by turning analytical work into reusable products and repeatable pipelines.
Built customer segmentation, data marts, reporting platforms, and retail optimization analytics for large datasets, including category/adjacency work associated with up to 20% sales-growth impact.
Created reusable retail analytics workflows across customer priority assortment, cross-shopping behavior, direct-mail promotion planning, category uplift, and seasonality analysis.

Software Engineer · Mphasis

I worked on mainframe and big-data systems for insurance and telecom clients, including Spark migration work and automation that saved 500+ manual hours per year.

2014 - 2018

Worked in both mainframe and big-data ecosystems and contributed to the migration of mainframe workloads to Spark-based processing.
Worked as a developer for insurance and telecom clients across enterprise systems and data workflows.
Built automations that saved 500+ manual hours per year by removing repetitive workflow steps.

Skill stack

Primary skills

Spark, SQL, Flink, Python, Airflow, Trino, Apache Kyuubi, Java

Supporting skills

Spark Operator, Apache Iceberg, Apache Hudi, DataHub, Apache Hive, Apache Ranger

Additional skills

Scala, NiFi, dbt, Alluxio, Power BI, Apache SeaTunnel
