Vijay

Resume

Data Platform Engineer

I build data platforms for ingestion, metadata, governance, query execution, orchestration, discovery, and downstream product use. The focus is durable architecture, clear ownership, and dependable operation.

Content refreshed May 2026.

Projects

Relevant projects

Telecom network data lake

Designed and implemented data lake and warehouse foundations for mobile tower network events at petabyte scale, handling roughly five trillion events per day.

Flink · Kafka · Java · Elastic Stack

Created the analytics foundation for network data products, reporting, and large-scale downstream consumption.

Point-of-interest proximity streaming pipeline

Built a real-time proximity pipeline that joined customer location events with points of interest so users could receive relevant offers when they came within roughly a one-kilometer radius.

Flink · Kafka · Java · Geospatial streaming

Moved location-aware decisioning into the event stream by combining GPS and network-triangulated location signals with point-of-interest context such as retail outlets, offers, and airports.

Governed conversational data platform

Architected and built a governed conversational data and visualization agent that retrieves business knowledge, answers business questions, runs governed queries from that context, reasons over results, and builds charts, all without making the LLM the data boundary.

Python · FastAPI · LangGraph · Langfuse

Moved analytics discovery toward self-service by connecting a multi-LLM graph runtime with knowledge retrieval, governed Trino execution, answer reasoning, chart generation, prompt tracing, and a richer Open WebUI experience. The natural-language analytics path reduced query time by about 40% for supported workflows.

Hive metastore synchronization and metadata governance

Designed and built services that keep Hive metadata consistent across independent environments using real-time listener sync, daily reconciliation, expiry cleanup, one-time interval jobs, observability, and deployment hardening.

Java · Spark · Airflow · Apache Hive

Turned fragile metadata drift into an owned synchronization path with real-time event propagation, reconciliation, recovery behavior, logs, and platform deployment controls.

Experience

Relevant experience

Data Platform Engineer / Architect · Airtel Digital

I architect and build governed data platforms, metadata services, workflow orchestration, access-governance integrations, in-house CI/CD onboarding, and network-scale analytics systems.

2021 - present

  • Designed and implemented Hive metadata synchronization for independent Hive environments, combining real-time listener sync, daily reconciliation, missed-addition/removal repair, sync-duration expiry cleanup, one-time interval jobs, observability, and deployment hardening.
  • Built governed self-service data platform capabilities around Kyuubi customizations, Trino query access, DataHub metadata, dbt transformations, Metabase BI, RBAC, and secrets management.
  • Built mobile tower network-event analytics, browsing-log data products, and real-time point-of-interest proximity pipelines, including petabyte-scale aggregation, safe-browsing classification, audience cohorts, and location-triggered downstream actions.

Data Science Engineer · dunnhumby

I worked between data science and big-data platform teams, turning statistical and ML analysis into reusable pipelines, data marts, reporting products, and client-ready analytics workflows.

2018 - 2021

  • Bridged data science and big-data platform teams by turning analytical work into reusable products and repeatable pipelines.
  • Built customer segmentation, data marts, reporting platforms, and retail optimization analytics for large datasets, including category/adjacency work associated with up to 20% sales-growth impact.
  • Created reusable retail analytics workflows across customer priority assortment, cross-shopping behavior, direct-mail promotion planning, category uplift, and seasonality analysis.

Software Engineer · Mphasis

I worked on mainframe and big-data systems for insurance and telecom clients, including Spark migration work and automation that saved 500+ manual hours per year.

2014 - 2018

  • Worked in both mainframe and big-data ecosystems and contributed to the migration of mainframe workloads to Spark-based processing.
  • Developed enterprise systems and data workflows for insurance and telecom clients.
  • Built automations that saved 500+ manual hours per year by removing repetitive workflow steps.

Skill stack

Primary skills

Flink · Java · Kafka · Geospatial streaming · Scala · Spark · SQL · Python

Supporting skills

Airflow · Trino · Apache Kyuubi · Spark Operator · Apache Iceberg · Apache Hudi

Additional skills

DataHub · Apache Hive · Apache Ranger · Elastic Stack · NiFi · dbt