Vijay

Project case study

Ranger RBAC and policy-governance extensions

Access governance that combines RBAC, metadata tags, row-level security, masking, custom attributes, and query-engine enforcement.

Made access-policy behavior more expressive and easier to support by combining tag-based governance with row-level security, masking, extensible attributes, query-engine integration, and audit and error visibility.

Java, Apache Ranger, Trino, DataHub, Kubernetes (OCP), Helm

Context

The problem

Enterprise data platforms need policy enforcement that is expressive enough for real business rules while still being understandable to users and operators.

The contribution record covers external database-backed attribute lookup, extensible policy attributes, caching, DataHub tag-driven policies, row-level security, masking, Trino/Ranger integration, tag synchronization, audit fixes, clearer denied-access messages, and deployment and runtime hardening.

System trace

How the work moved through the system

A high-level operating path: where the request starts, how the system shapes it, and how other teams consume the result.

  1. An external database provides extensible attributes that enrich the context available to policy evaluation.

  2. Apache Ranger-based authorization governs query-engine access through integration points such as Trino plugins.

  3. DataHub tags can drive policy behavior directly instead of remaining only catalog metadata.

Policy input

External attributes

Policy behavior was extended with database-backed attributes and caching so new policy context could be added without code changes for each attribute.

Governance shape

Tags + RLS + masking

DataHub tags became a policy entry point while row-level security and masking handled fine-grained enforcement.

Execution layer

Query enforcement

The governance work connects to query-engine execution instead of remaining a standalone admin concern.

Architecture

System shape

  1. An external database provides extensible attributes that enrich the context available to policy evaluation.
  2. Apache Ranger-based authorization governs query-engine access through integration points such as Trino plugins.
  3. DataHub tags can drive policy behavior directly instead of remaining only catalog metadata.
  4. Tag-based policies are combined with row-level security and masking so policy decisions can control both access and data shape.
  5. Tag synchronization keeps governance metadata closer to the authorization layer.
  6. Audit logs and denied-access messages are improved so failures can be investigated and explained.
  7. Containerized/local setup paths make the governance stack easier to test and evolve.
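Step 5 in the list above, tag synchronization, can be pictured as a diff between the tags the catalog reports and the tags the authorization side currently holds. The following is only a sketch under assumptions: `TagSyncDiff`, `Change`, and the resource and tag names are hypothetical, plain in-memory maps stand in for both systems, and no DataHub or Ranger API is used.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: computes which tags to add or remove per resource.
final class TagSyncDiff {

    // Tags to attach to and detach from a single resource.
    static final class Change {
        final Set<String> toAdd;
        final Set<String> toRemove;

        Change(Set<String> toAdd, Set<String> toRemove) {
            this.toAdd = toAdd;
            this.toRemove = toRemove;
        }

        @Override
        public String toString() {
            return "add=" + toAdd + " remove=" + toRemove;
        }
    }

    // catalog: tags the metadata catalog reports; policySide: tags the authorization layer holds.
    static Map<String, Change> diff(Map<String, Set<String>> catalog,
                                    Map<String, Set<String>> policySide) {
        Set<String> resources = new TreeSet<>(catalog.keySet());
        resources.addAll(policySide.keySet());

        Map<String, Change> changes = new TreeMap<>();
        for (String resource : resources) {
            Set<String> wanted = catalog.getOrDefault(resource, Set.of());
            Set<String> current = policySide.getOrDefault(resource, Set.of());

            Set<String> toAdd = new TreeSet<>(wanted);
            toAdd.removeAll(current);
            Set<String> toRemove = new TreeSet<>(current);
            toRemove.removeAll(wanted);

            // Resources that already match produce no entry.
            if (!toAdd.isEmpty() || !toRemove.isEmpty()) {
                changes.put(resource, new Change(toAdd, toRemove));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> catalog = Map.of(
            "sales.orders", Set.of("pii", "finance"),
            "sales.new", Set.of("pii"));
        Map<String, Set<String>> policySide = Map.of(
            "sales.orders", Set.of("pii", "legacy"),
            "sales.old", Set.of("pii"));
        diff(catalog, policySide).forEach((resource, change) ->
            System.out.println(resource + " -> " + change));
    }
}
```

Computing a diff instead of rewriting every tag keeps each sync run proportional to what actually changed, which also makes the resulting changes easy to log.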

Ownership

What I handled

  1. Implemented external DB-backed attribute lookup and caching behavior.
  2. Enabled DataHub tag-driven policies that combine tag governance with row-level security and masking.
  3. Worked on Trino/Ranger integration and containerized runtime setup.
  4. Improved audit and denied-access feedback paths.
  5. Built tag-sync support between metadata and policy systems.

Lessons

What carried forward

  1. Access governance is part of developer experience, not just security administration.
  2. Policy systems are easier to operate when audit, error messages, and local testability are treated as first-class work.

Engineering decisions

Make policy context richer than roles

RBAC alone was not enough for the policy shapes required, so external attributes, DataHub tags, row filters, and masking became part of enforcement context.

Keep policy inputs extensible

Using an external attribute store made it possible to add policy attributes as requirements changed without hard-coding every new dimension into the integration.
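To make the idea concrete, here is a minimal sketch of an attribute lookup that treats attributes as open-ended key/value pairs and caches them for a fixed time. Everything in it is an assumption for illustration: `CachedAttributeLookup`, the `loader` function standing in for the database query, the TTL-based expiry, and the `region` attribute are hypothetical and not taken from the internal implementation.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch: per-user attributes loaded from an external store and cached with a TTL.
final class CachedAttributeLookup {

    // One cached result plus the time it was loaded.
    private static final class Entry {
        final Map<String, String> attributes;
        final long loadedAtMillis;

        Entry(Map<String, String> attributes, long loadedAtMillis) {
            this.attributes = attributes;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Function<String, Map<String, String>> loader; // stands in for the database query
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping
    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();

    CachedAttributeLookup(Function<String, Map<String, String>> loader,
                          Duration ttl,
                          LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    // Returns the user's attributes, reloading only when the entry is missing or older than the TTL.
    Map<String, String> attributesFor(String user) {
        long now = clock.getAsLong();
        Entry entry = cache.compute(user, (key, existing) -> {
            if (existing != null && now - existing.loadedAtMillis < ttlMillis) {
                return existing;
            }
            return new Entry(Map.copyOf(loader.apply(key)), now);
        });
        return entry.attributes;
    }

    public static void main(String[] args) {
        AtomicInteger loaderCalls = new AtomicInteger();
        long[] now = {0L};
        CachedAttributeLookup lookup = new CachedAttributeLookup(
            user -> {
                loaderCalls.incrementAndGet();
                return Map.of("region", "EMEA"); // a new attribute is just another key
            },
            Duration.ofSeconds(60),
            () -> now[0]);

        lookup.attributesFor("alice"); // loads from the store
        lookup.attributesFor("alice"); // served from the cache
        now[0] = 61_000L;
        lookup.attributesFor("alice"); // entry expired, loads again
        System.out.println("loader calls: " + loaderCalls.get()); // prints "loader calls: 2"
    }
}
```

Because the attributes are a map rather than fixed fields, a new policy dimension only needs a new key in the store. The TTL bounds how stale a cached attribute can be.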

Use catalog tags as governance intent

DataHub tags were a natural place to express governance meaning, while Ranger enforcement handled the row-level and masking behavior at query time.
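One way to picture tags acting as governance intent is a small function that turns tags and user attributes into a row filter and per-column masks. This is a simplified sketch, not the Ranger policy model: the class `TagPolicySketch`, the tag names `region_scoped` and `pii`, and the attributes `region` and `pii_cleared` are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: derives a row filter and column masks from tags and user attributes.
final class TagPolicySketch {

    enum Mask { NONE, HASH }

    static final class Decision {
        final String rowFilter; // SQL predicate, or null when no row filter applies
        final Map<String, Mask> columnMasks;

        Decision(String rowFilter, Map<String, Mask> columnMasks) {
            this.rowFilter = rowFilter;
            this.columnMasks = columnMasks;
        }
    }

    static Decision evaluate(Set<String> tableTags,
                             Map<String, Set<String>> columnTags,
                             Map<String, String> userAttributes) {
        String rowFilter = null;
        if (tableTags.contains("region_scoped")) {
            String region = userAttributes.get("region");
            // A user without a region attribute sees no rows rather than all rows.
            rowFilter = (region == null)
                ? "1 = 0"
                : "region = '" + region.replace("'", "''") + "'";
        }

        boolean piiCleared = "true".equals(userAttributes.get("pii_cleared"));
        Map<String, Mask> masks = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> column : columnTags.entrySet()) {
            boolean pii = column.getValue().contains("pii");
            masks.put(column.getKey(), (pii && !piiCleared) ? Mask.HASH : Mask.NONE);
        }
        return new Decision(rowFilter, masks);
    }

    public static void main(String[] args) {
        Decision decision = evaluate(
            Set.of("region_scoped"),
            Map.of("email", Set.of("pii"), "amount", Set.<String>of()),
            Map.of("region", "EMEA"));
        System.out.println(decision.rowFilter); // prints "region = 'EMEA'"
        System.out.println(decision.columnMasks.get("email") + " "
            + decision.columnMasks.get("amount")); // prints "HASH NONE"
    }
}
```

The sketch fails closed: a missing attribute narrows access instead of widening it, and a column stays masked unless the user is explicitly cleared.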

Move clarity into the failure path

Denied access should tell users and operators what kind of access failed instead of behaving like a generic platform error.
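As an illustration of what a clearer failure path might look like, the sketch below builds a denial message that names the user, the access type, the resource, and the kind of failure. The class `DeniedAccessMessage`, the `Reason` values, and the message wording are hypothetical, chosen only to show the shape of the idea.

```java
// Hypothetical sketch: formats a denial so it states what failed and why.
final class DeniedAccessMessage {

    enum Reason { NO_MATCHING_ALLOW_POLICY, EXPLICIT_DENY, AUTHORIZER_UNAVAILABLE }

    static String format(String user, String accessType, String resource, Reason reason) {
        String detail;
        switch (reason) {
            case EXPLICIT_DENY:
                detail = "a policy explicitly denies this access";
                break;
            case AUTHORIZER_UNAVAILABLE:
                detail = "the authorization service could not be reached, so access was refused";
                break;
            default:
                detail = "no policy grants this access";
        }
        return "Access denied: user '" + user + "' cannot " + accessType + " "
            + resource + " (" + detail + ")";
    }

    public static void main(String[] args) {
        System.out.println(format("alice", "select", "sales.orders",
            Reason.NO_MATCHING_ALLOW_POLICY));
        // prints "Access denied: user 'alice' cannot select sales.orders (no policy grants this access)"
    }
}
```

Separating "no policy grants this" from "the authorizer was unreachable" tells the user whether to request access or report an outage.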

Keep governance testable

Local/containerized setup made it easier to verify policy behavior without waiting on full production environments.

What can be shown

Public evidence without internal names

The internal systems stay private. This section keeps the public parts: my role, system boundaries, technology context, scale, decisions, constraints, and what I learned.

Internal enterprise system, high-level architecture, open-source reference

Policy model

Tags + RLS + masking

The work combined tag-based policies with row-level security, masking, and custom attribute lookup for enterprise query access.

Integration

Ranger + Trino

The access-governance work connected Apache Ranger-based authorization with query-engine execution paths.

Architecture shape

  • Query engines call into policy enforcement rather than embedding access decisions in application code.
  • An external database provides extensible attributes so policy context can evolve without code changes for every new attribute.
  • DataHub tags can drive policy behavior, combining tag-based governance with row-level security and masking.
  • Audit and denied-access paths are treated as product behavior because users and operators need clear explanations.

Responsibilities

  • Implemented external database-backed attribute lookup so policies could use as many attributes as required.
  • Enabled DataHub tag-driven policy implementation, including tag-based controls combined with row-level security and masking.
  • Worked on Trino and Ranger integration, local/containerized runtime setup, and missing-plugin/runtime fixes.
  • Improved Ranger audit behavior and denied-access error clarity.
  • Built tag-synchronization support between metadata governance and access-control layers.

Constraints

  • Internal policy names, attributes, table names, user groups, and access rules are not published.
  • This case study can discuss the architecture and implementation responsibilities without exposing policy details.

Supporting context

High-level architecture

Access-governance integration map

The map can show DataHub tags, external attributes, policy evaluation, row-level filters, masking, Trino/Ranger integration, audit logs, and denied-access feedback.

Open-source reference

Open-source context

Apache Ranger and Trino can be named as technology context; internal policies and deployment details stay private.

Related case studies

Java + Kubernetes (OCP) + Backend engineering

Hive metastore synchronization and metadata governance

Designed and built services that keep Hive metadata consistent across independent environments using real-time listener sync, daily reconciliation, expiry cleanup, one-time interval jobs, observability, and deployment hardening.

Trino + Apache Ranger + Platform engineering

Self-service data platform and governance architecture

Built data-mesh platform capabilities around Kyuubi, custom engine routing, RBAC, secrets management, Trino query access, dbt transformations, DataHub metadata, and Metabase BI.

Trino + DataHub + Backend engineering

Governed conversational data platform

I architected and built a governed conversational data and visualization agent: it retrieves business knowledge, answers business questions, runs governed queries from that context, reasons over results, and builds charts without making the LLM the data boundary.