
Lead Data Engineer

Umanist Staffing
Full-time
Hybrid
Bengaluru, Karnataka, India

Hiring: Lead Data Engineer

Location: Bengaluru (Hybrid – 3 days office / 2 days WFH)
Experience: 7–8 Years
Employment Type: Full-Time
Shift: Day shift with partial overlap with US stakeholders
Notice Period: Immediate to 30 days preferred

Role Overview

We are looking for a Senior / Lead Data Engineer to drive the design and modernization of a large-scale enterprise analytics platform. This role combines hands-on engineering with data architecture leadership, focusing on building a modern Databricks + Spark Lakehouse ecosystem on AWS.

You will play a key role in shaping data platform strategy, building scalable pipelines, and ensuring strong governance, reliability, and performance across analytics systems.

Key Responsibilities

Data Architecture & Engineering

  • Design and implement scalable data pipelines for batch and streaming workloads

  • Build and optimize ETL/ELT pipelines using Python, Spark, and SQL

  • Develop data solutions using Databricks Lakehouse architecture

  • Define standards for data modeling, storage formats, and performance optimization

Cloud & Platform Development

  • Work extensively with AWS services such as S3, Lambda, and EMR

  • Build reliable and high-performance data processing frameworks

  • Enable near real-time processing using streaming technologies

Orchestration & DevOps

  • Implement workflow orchestration using Apache Airflow

  • Build CI/CD pipelines and automate deployments

  • Use Docker, Kubernetes, and Infrastructure as Code (Terraform/CloudFormation)

Data Governance & Quality

  • Implement data lineage, cataloging, and access control frameworks

  • Maintain enterprise metric definitions and ensure consistency across reporting

  • Partner with analytics and business teams to deliver trusted, high-quality data

Operational Excellence

  • Implement monitoring, alerting, and observability for data pipelines

  • Define SLAs/SLOs and ensure platform reliability

  • Mentor and guide junior and mid-level engineers

Must-Have Skill Set

  • Minimum 6 years of experience in data engineering and distributed data systems

  • Strong hands-on experience with:

    • Databricks

    • Apache Spark

    • Python for large-scale data processing

    • Advanced SQL (complex queries, performance tuning, data modeling)

  • Solid experience with AWS (S3, Lambda, EMR) in production environments

  • Experience building and managing ETL/ELT pipelines at scale

  • Hands-on experience with Apache Airflow for orchestration

  • Familiarity with version control (Git) and CI/CD pipelines (Jenkins or similar)

  • Experience with Docker and infrastructure automation (Terraform or CloudFormation)

  • Knowledge of data governance, lineage, and cataloging practices

  • Experience working in modern Lakehouse architectures

Mandatory Certification

  • Databricks Certified Data Engineer – Professional

Nice-to-Have Skills

  • Experience with Kafka, Kinesis, or Spark Streaming

  • Exposure to Kubernetes for container orchestration

  • Experience in large-scale data platform migrations or modernization projects

  • Knowledge of enterprise KPI frameworks and semantic data layers

  • AWS certification (Solutions Architect – Associate or Professional)