This is a remote position.
Job Summary
As a Senior Data Engineer, you will be responsible for designing, building, and optimizing data pipelines and lakehouse architectures on AWS. You will ensure data availability, quality, lineage, and governance across analytical and operational platforms. Your expertise will enable scalable, secure, and cost-effective data solutions that power advanced analytics and business intelligence.
Responsibilities
- Implement and manage S3 (raw, staging, curated zones), Glue Catalog, Lake Formation, and Iceberg/Hudi/Delta Lake for schema evolution and versioning.
- Develop PySpark jobs on Glue/EMR that enforce schema validation and deliver partitioned, scalable transformations (a sketch of this pattern follows this list).
- Build workflows using Step Functions, EventBridge, or Airflow (MWAA), with CI/CD deployments via CodePipeline & CodeBuild.
- Apply schema contracts, validations (Glue Schema Registry, Deequ, Great Expectations), and maintain lineage/metadata using Glue Catalog or third-party tools (Atlan, OpenMetadata, Collibra).
- Enable Athena and Redshift Spectrum queries, manage operational stores (DynamoDB/Aurora), and integrate with OpenSearch for observability.
- Design efficient partitioning/bucketing strategies, adopt columnar formats (Parquet/ORC), and use Spot Instances and Glue job bookmarks to control cost and avoid reprocessing.
- Enforce IAM-based access policies and apply KMS encryption, private (VPC) endpoints, and GDPR-compliant PII data masking.
- Prepare Gold-layer KPIs for dashboards, forecasting, and customer insights with QuickSight, Superset, or Metabase.
- Partner with analysts, data scientists, and DevOps to enable seamless data consumption and delivery.
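For illustration, the kind of job described above often follows the shape of this minimal PySpark sketch. It is not a LeewayHertz implementation: the bucket paths, schema, and column names are hypothetical placeholders, and the same pattern runs on Glue or EMR with minor adjustments.

```python
# Minimal PySpark sketch: read raw data, enforce a schema, write curated Parquet.
# All bucket names, paths, and the schema below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("orders-curation").getOrCreate()

# An explicit schema makes malformed records fail fast instead of drifting silently.
schema = StructType([
    StructField("order_id", StringType(), nullable=False),
    StructField("customer_id", StringType(), nullable=False),
    StructField("amount", DoubleType(), nullable=True),
    StructField("event_ts", TimestampType(), nullable=True),
])

raw = (
    spark.read
    .schema(schema)
    .option("mode", "FAILFAST")   # reject records that do not match the schema
    .json("s3://example-raw-zone/orders/")
)

curated = (
    raw
    .filter(F.col("amount").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))  # partition key
)

# Columnar format + date partitioning keeps Athena/Spectrum scans cheap.
(
    curated.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-curated-zone/orders/")
)
```

On Glue specifically, the same logic would typically be wrapped in a GlueContext job with job bookmarks enabled for incremental processing.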
Requirements
Essential Skills
Job
- Hands-on expertise with AWS data stack (S3, Glue, Lake Formation, Athena, Redshift, EMR, Lambda).
- Strong programming skills in PySpark & Python for ETL, scripting, and automation.
- Proficiency in SQL (CTEs, window functions, complex aggregations); a short example follows this list.
- Experience with data governance and quality frameworks (Deequ, Great Expectations).
- Knowledge of data modeling, partitioning strategies, and schema enforcement.
- Familiarity with BI integration (QuickSight, Superset, Metabase).
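To make the SQL expectations concrete, here is an illustrative CTE-plus-window-function query, shown through PySpark's SQL interface to match the stack above. The tiny in-memory dataset and all table and column names are invented for the example.

```python
# Illustrative only: a CTE and a window function of the kind listed above.
# The in-memory dataset and the names used are invented for this example.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

spark.createDataFrame(
    [("c1", "2024-01-05", 120.0),
     ("c1", "2024-01-20", 80.0),
     ("c2", "2024-01-11", 200.0)],
    ["customer_id", "order_date", "amount"],
).createOrReplaceTempView("orders")

spark.sql("""
    WITH customer_orders AS (                    -- CTE
        SELECT customer_id,
               order_date,
               amount,
               SUM(amount) OVER (                -- window function
                   PARTITION BY customer_id
                   ORDER BY order_date
               ) AS running_total
        FROM orders
    )
    SELECT * FROM customer_orders
    ORDER BY customer_id, order_date
""").show()
```

The same query would run essentially unchanged in Athena or Redshift against curated-zone tables.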
Personal
- Strong problem-solving ability in complex data environments.
- Ability to communicate technical insights to non-technical stakeholders.
- Commitment to best practices in data governance, compliance, and security.
- Collaborative mindset with cross-functional teams.
Preferred Skills
Job
- Real-time ingestion experience (Kinesis, MSK, Kafka on AWS).
- Exposure to ML feature store integration with SageMaker.
- Infrastructure as Code (Terraform, CloudFormation, or CDK); a minimal sketch follows this list.
- Experience with Data Mesh or domain-driven data architecture.
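As a minimal sketch of the IaC expectation, the following AWS CDK (v2, Python) stack defines one encrypted, versioned data-lake bucket. The stack and construct names are placeholders invented for this example, not an actual LeewayHertz stack.

```python
# Illustrative AWS CDK (v2, Python) sketch: one encrypted S3 data-lake bucket.
# Stack, construct, and app names are hypothetical placeholders.
from aws_cdk import App, Stack, RemovalPolicy, aws_s3 as s3
from constructs import Construct

class DataLakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # KMS-managed encryption, versioning, and blocked public access,
        # matching the security responsibilities listed earlier in the posting.
        s3.Bucket(
            self, "CuratedZone",
            encryption=s3.BucketEncryption.KMS_MANAGED,
            versioned=True,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            removal_policy=RemovalPolicy.RETAIN,
        )

app = App()
DataLakeStack(app, "data-lake-example")
app.synth()
```

Equivalent definitions in Terraform or CloudFormation express the same controls; the choice usually follows the team's existing tooling.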
Personal
- Experience mentoring junior data engineers.
- Ability to lead data projects from design to production.
- Proactive in learning new AWS and data ecosystem technologies.
Other Relevant Information
- Bachelor’s/Master’s degree in Computer Science, Information Technology, or a related field.
- Minimum 4 years of proven experience in data engineering with AWS.
Benefits
- This role offers the flexibility of working remotely in India.
LeewayHertz is an equal opportunity employer and does not discriminate based on race, color, religion, sex, age, disability, national origin, sexual orientation, gender identity, or any other protected status. We encourage a diverse range of applicants.