Vijay

Project case study

Kubernetes and Spark Operator migration

A migration path for running enterprise Spark workloads on Kubernetes with Spark Operator and shared CI/CD.

Reduced infrastructure cost by about 30% while improving workload isolation, resource management, deployment repeatability, and scalability for data workloads.

Kubernetes (OCP), Spark Operator, Spark, In-house CI/CD YAML, Jenkins, Docker, Helm, Linux

Context

The problem

Data workloads needed a more repeatable runtime with better resource management, cleaner deployment mechanics, and lower infrastructure cost without forcing every team to invent its own Kubernetes pattern.

The work moved shared data-processing workloads from older execution patterns into Kubernetes-managed Spark execution with Spark Operator, CI/CD templates, Helm/Docker packaging, and operational controls.

System trace

How the work moved through the system

A high-level operating path: where the request starts, how the system shapes it, and how other teams consume the result.

  1. Existing Spark and data workloads were moved toward containerized execution patterns.
  2. Spark Operator handled Kubernetes-native Spark application submission and lifecycle management.
  3. CI/CD templates, Docker packaging, and Helm deployment conventions made the migration repeatable.

Migration scope

50+ workloads

Moved more than 50 Spark and data workloads toward Kubernetes-managed execution.

Cost outcome

~30% reduction

Reduced infrastructure cost by about 30% through better workload placement and resource management.

Architecture

System shape

  1. Existing Spark and data workloads were moved toward containerized execution patterns.
  2. Spark Operator handled Kubernetes-native Spark application submission and lifecycle management.
  3. CI/CD templates, Docker packaging, and Helm deployment conventions made the migration repeatable.
  4. Operational visibility and validation stayed part of the migration path instead of being treated as an afterthought.

Ownership

What I handled

  1. Designed migration conventions for Spark workloads running on Red Hat OCP.
  2. Implemented and hardened Spark Operator-based execution paths.
  3. Integrated workload migration with CI/CD, image packaging, Helm, and deployment validation.
  4. Helped reduce runtime cost and improve repeatability across migrated workloads.

Lessons

What carried forward

  1. Platform migrations succeed when teams get a repeatable operating model, not just a new runtime target.
  2. Spark on Kubernetes needs CI/CD, packaging, and operations to be designed together.
Engineering decisions

Use Kubernetes-native Spark lifecycle management

Spark Operator gave the platform a clearer control point for submission, lifecycle, and resource behavior than one-off execution scripts.
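As an illustration of that control point, here is a minimal sketch of a SparkApplication manifest built in Python. It is not the internal template: the function name, job name, namespace, image, class, and resource sizes are all hypothetical placeholders, and the field names assume the operator's `sparkoperator.k8s.io/v1beta2` API.

```python
import json


def build_spark_application(name, namespace, image, main_class, main_file, executors=2):
    """Return a SparkApplication manifest as a plain dict.

    All values passed in below are illustrative placeholders, not real workloads.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": image,
            "mainClass": main_class,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",  # assumed version for the example
            "restartPolicy": {"type": "Never"},
            # Driver and executor sizing live in one declarative place.
            "driver": {"cores": 1, "memory": "2g", "serviceAccount": "spark"},
            "executor": {"cores": 2, "instances": executors, "memory": "4g"},
        },
    }


manifest = build_spark_application(
    name="example-etl",
    namespace="data-jobs",
    image="registry.example.com/data/example-etl:1.0.0",
    main_class="com.example.Etl",
    main_file="local:///opt/spark/jars/example-etl.jar",
)
# kubectl accepts JSON as well as YAML, so the stdlib is enough here.
print(json.dumps(manifest, indent=2))
```

Keeping submission, restart policy, and resource sizing in one declarative object is what makes the operator easier to reason about than a per-job submit script.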

Make migration repeatable

Shared packaging and CI/CD conventions mattered because migrating one workload well is different from migrating dozens consistently.
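One way to picture a shared convention is a small validation gate that every workload passes before deployment. The sketch below is hypothetical: the field names (`name`, `image`, `executor_memory`, `executor_instances`) and the rules are assumptions made for illustration, not the actual in-house checks.

```python
def check_workload(config):
    """Return a list of convention violations for one workload config."""
    problems = []
    # Hypothetical required fields for every migrated workload.
    for key in ("name", "image", "executor_memory", "executor_instances"):
        if key not in config:
            problems.append(f"missing required field: {key}")

    image = config.get("image", "")
    # Only look for a tag in the last path segment, so a registry port
    # such as "registry.example.com:5000/app" is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    tag = last_segment.rsplit(":", 1)[1] if ":" in last_segment else ""
    if image and tag in ("", "latest"):
        problems.append("image must be pinned to an explicit tag")

    if config.get("executor_instances", 1) < 1:
        problems.append("executor_instances must be at least 1")
    return problems


good = {
    "name": "example-etl",
    "image": "registry.example.com/data/example-etl:1.0.0",
    "executor_memory": "4g",
    "executor_instances": 2,
}
bad = {"name": "example-etl", "image": "registry.example.com/data/example-etl:latest"}

print(check_workload(good))
print(check_workload(bad))
```

Running the same check in every pipeline is the difference between migrating one workload carefully and migrating dozens the same way.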

Tie cost reduction to resource behavior

The cost benefit came from better workload placement, scaling, and resource management rather than a cosmetic platform move.
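To make the link between resource behavior and cost concrete, the sketch below estimates the memory an executor pod requests on Kubernetes. It assumes Spark's documented defaults for JVM jobs (an overhead factor of 0.1 with a 384 MiB minimum). The instance counts and sizes are invented for the example and are not figures from this migration.

```python
def executor_pod_memory_mib(executor_memory_mib, overhead_factor=0.1, min_overhead_mib=384):
    """Memory one executor pod requests: heap plus non-heap overhead."""
    overhead = max(int(executor_memory_mib * overhead_factor), min_overhead_mib)
    return executor_memory_mib + overhead


def total_request_mib(instances, executor_memory_mib):
    """Total memory requested by all executors of one application."""
    return instances * executor_pod_memory_mib(executor_memory_mib)


# Illustrative numbers only: the same job sized with 10 and then 6 executors.
before = total_request_mib(instances=10, executor_memory_mib=8192)
after = total_request_mib(instances=6, executor_memory_mib=8192)
saved_pct = round((before - after) / before * 100, 1)
print(f"requested before: {before} MiB, after: {after} MiB, saved: {saved_pct}%")
```

Because the scheduler places pods by their requests, right-sizing requests frees capacity even when the job's actual usage does not change.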

What can be shown

Public evidence without internal names

The internal systems stay private. This section keeps the public parts: my role, system boundaries, technology context, scale, decisions, constraints, and what I learned.

Internal enterprise system, Scale signal, High-level architecture

Migration scope

50+ workloads

The migration covered more than 50 Spark and data workloads across enterprise data-processing use cases.

Cost impact

~30% reduction

The move to Kubernetes-managed Spark execution reduced infrastructure cost by about 30%.

Architecture shape

  • Spark and data workloads are packaged for Kubernetes execution instead of being operated through older, less standardized runtime paths.
  • Spark Operator controls workload submission, lifecycle, and Kubernetes-native resource management.
  • Shared CI/CD templates, Helm/Docker packaging, and validation gates make migration repeatable across projects.
  • Operational controls cover build, deploy, resource behavior, and failure visibility without exposing internal clusters or workload names.
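Failure visibility in this shape can start from the status the operator writes back onto each SparkApplication. The sketch below classifies an application from that status. The status path (`status.applicationState.state`) and the state names are assumptions based on the operator's v1beta2 API, and the sample objects are made up.

```python
TERMINAL_OK = {"COMPLETED"}
TERMINAL_FAILED = {"FAILED", "SUBMISSION_FAILED"}


def classify(app):
    """Map a SparkApplication object (as a dict) to a coarse outcome."""
    app_state = app.get("status", {}).get("applicationState", {})
    state = app_state.get("state", "")
    if state in TERMINAL_OK:
        return "succeeded"
    if state in TERMINAL_FAILED:
        return "failed"
    # Anything else (empty, SUBMITTED, RUNNING, ...) is treated as not finished.
    return "in_progress"


# Made-up sample objects, shaped like what a cluster API read would return.
samples = [
    {"metadata": {"name": "job-a"}, "status": {"applicationState": {"state": "COMPLETED"}}},
    {"metadata": {"name": "job-b"}, "status": {"applicationState": {"state": "FAILED"}}},
    {"metadata": {"name": "job-c"}},  # no status yet
]
for app in samples:
    print(app["metadata"]["name"], classify(app))
```

A coarse mapping like this is enough to drive alerts or a pipeline exit code without publishing any cluster or workload details.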

Responsibilities

  • Planned and implemented the migration path for Spark and data workloads onto Red Hat OCP.
  • Used Spark Operator to standardize workload submission and lifecycle behavior on Kubernetes.
  • Connected the migration with shared CI/CD, container build, Helm, and deployment practices.
  • Improved resource management, deployment repeatability, and operational visibility for migrated workloads.

Constraints

  • Cluster names, workload names, project names, capacity numbers, and internal deployment details are not published.
  • The public story focuses on migration scope, platform shape, responsibilities, and rounded outcome metrics.

Supporting context

High-level architecture

Kubernetes Spark workload migration topology

The public diagram can show CI/CD, container packaging, Helm deployment, Spark Operator submission, Kubernetes resource management, and workload monitoring without exposing internal clusters.

Scale signal

Migration outcome

50+ migrated workloads and roughly 30% infrastructure cost reduction are shareable rounded signals.

Related case studies

Continue through related work or return to the full project index.


In-house CI/CD YAML + Jenkins + Backend engineering

CI/CD onboarding and developer-experience framework

Created an in-house YAML-driven CI/CD framework that let teams onboard projects with very little friction while keeping validation, security scans, deployment behavior, and Jira status updates standardized.

Spark + Kubernetes (OCP) + Backend engineering

Hive metastore synchronization and metadata governance

Designed and built services that keep Hive metadata consistent across independent environments using real-time listener sync, daily reconciliation, expiry cleanup, one-time interval jobs, observability, and deployment hardening.

Kubernetes (OCP) + Helm + Backend engineering

Ranger RBAC and policy-governance extensions

Extended enterprise data access governance around Apache Ranger-based RBAC, an external attribute store, DataHub tag-driven policies, row-level security, masking, Trino integration, audit clarity, and local/containerized development paths.