Resume

Backend Engineer for Data-Heavy Systems

I have worked on backend services, data product APIs, streaming systems, and operational tooling, usually where data volume and reliability matter.

Current content refreshed May 2026.

Projects

Relevant projects

Governed conversational data platform

I architected and built a governed conversational data and visualization agent: it retrieves business knowledge, answers business questions, runs governed queries from that context, reasons over results, and builds charts without making the LLM the data boundary.

Python, FastAPI, LangGraph, Langfuse

Moved analytics discovery toward self-service by connecting a multi-LLM graph runtime with knowledge retrieval, governed Trino execution, answer reasoning, chart generation, prompt tracing, and a richer Open WebUI experience. The natural-language analytics path reduced query time by about 40% for supported workflows.

Ranger RBAC and policy-governance extensions

Extended enterprise data access governance around Apache Ranger-based RBAC, an external attribute store, DataHub tag-driven policies, row-level security, masking, Trino integration, audit clarity, and local/containerized development paths.

Java, Apache Ranger, Trino, DataHub

Made access policy behavior more expressive and supportable by marrying tag-based governance with row-level security, masking, extensible attributes, query-engine integration, and audit/error visibility.

Hive metastore synchronization and metadata governance

Designed and built services that keep Hive metadata consistent across independent environments using real-time listener sync, daily reconciliation, expiry cleanup, one-time interval jobs, observability, and deployment hardening.

Java, Spark, Airflow, Apache Hive

Turned fragile metadata drift into an owned synchronization path with real-time event propagation, reconciliation, recovery behavior, logs, and platform deployment controls.

Point-of-interest proximity streaming pipeline

Built a real-time proximity pipeline that joined customer location events with points of interest so users could receive relevant offers when they came within roughly a one-kilometer radius.

Flink, Kafka, Java, Geospatial streaming

Moved location-aware decisioning into the event stream by combining GPS and network-triangulated location signals with point-of-interest context such as retail outlets, offers, and airports.
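
As a rough illustration of the radius check described above, here is a minimal Python sketch. It is not the production Flink job: the haversine formula, the function names `haversine_km` and `nearby_pois`, the dictionary field names, and the exact 1.0 km threshold are all assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
RADIUS_KM = 1.0  # assumed threshold, based on "roughly a one-kilometer radius"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_pois(event, pois, radius_km=RADIUS_KM):
    """Return names of points of interest within radius_km of a location event.

    `event` and each item of `pois` are hypothetical dicts with "lat" and "lon"
    keys; each POI also carries a "name".
    """
    return [
        poi["name"]
        for poi in pois
        if haversine_km(event["lat"], event["lon"], poi["lat"], poi["lon"]) <= radius_km
    ]
```

A linear scan like this only shows the distance test. At streaming scale, the points of interest would typically be bucketed by a spatial key first so each event is compared against a small candidate set.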

Experience

Relevant experience

Software Engineer · Mphasis

I worked on mainframe and big-data systems for insurance and telecom clients, including Spark migration work and automation that saved 500+ manual hours per year.

2014 - 2018

Built automations that saved 500+ manual hours per year by removing repetitive workflow steps.
Worked as a developer for insurance and telecom clients across enterprise systems and data workflows.
Worked in both mainframe and big data ecosystems and contributed to migration from mainframe workloads to Spark-based processing.

Data Platform Engineer / Architect · Airtel Digital

I architect and build governed data platforms, metadata services, workflow orchestration, access-governance integrations, in-house CI/CD onboarding, and network-scale analytics systems.

2021 - present

Architected and built a governed conversational data platform for text-to-data workflows: multi-LLM subgraphs, knowledge retrieval, controlled query execution, answer reasoning, chart generation, prompt tracing, and platform orchestration.
Extended Apache Ranger-based access governance with an external attribute store, DataHub tag-driven policies, row-level security, masking, Trino integration, clearer audit/error paths, and local/containerized runtime support.
Built mobile tower network-event analytics, browsing-log data products, and real-time point-of-interest proximity pipelines, including petabyte-scale aggregation, safe-browsing classification, audience cohorts, and location-triggered downstream actions.

Data Science Engineer · dunnhumby

I worked between data science and big-data platform teams, turning statistical and ML analysis into reusable pipelines, data marts, reporting products, and client-ready analytics workflows.

2018 - 2021

Bridged data science and big-data platform teams by turning analytical work into reusable products and repeatable pipelines.
Built customer segmentation, data marts, reporting platforms, and retail optimization analytics for large datasets, including category/adjacency work associated with up to 20% sales-growth impact.
Created reusable retail analytics workflows across customer priority assortment, cross-shopping behavior, direct-mail promotion planning, category uplift, and seasonality analysis.

Skill stack

Primary skills

FastAPI, Java, Postgres, Python, SQL, Kafka, JavaScript, Scala

Supporting skills

MongoDB, Node.js, Linux, Flink, Kubernetes (OCP), Trino

Additional skills

Prisma, Apache Ranger, Docker, Next.js, In-house CI/CD, YAML, Jenkins
