Job Title: Principal/Senior Data Engineer – Product Development
About the Company
We are a new-age, AI-first Digital & Cloud Engineering Services company. Our mission is to combine deep engineering expertise with automation-first thinking and AI-native execution to help clients achieve more with speed, precision, and measurable impact. We operate at the intersection of automation, AI, and agile engineering to deliver scalable, high-performance solutions that enable businesses to move faster and operate smarter.
Role Overview
We are looking for a Principal/Senior Data Engineer to join our product development team. In this role, you will build and optimize large-scale distributed data systems, ensuring high performance, availability, and reliability. You will work on ingestion pipelines, parallel processing frameworks, and distributed storage, while contributing to core product development and upholding engineering best practices.
Roles & Responsibilities
Design, develop, and maintain scalable distributed data systems.
Implement data ingestion pipelines from sources such as Amazon S3, Azure Blob Storage, Google Cloud Storage, Snowflake, BigQuery, PostgreSQL, Kafka, and lakehouse table formats (e.g., Apache Iceberg).
Enable high availability, fault tolerance, and cross-region replication for data systems.
Build and optimize Spark connectors for efficient data processing.
Manage and integrate third-party systems such as Kafka and Kafka Connect.
Implement monitoring, error reporting, and performance optimization for ingestion and processing workflows.
Contribute to core product development, following engineering best practices and Agile methodologies.
Collaborate with cross-functional teams to ensure seamless CI/CD integration and reliable deployments.
Requirements
Strong knowledge of distributed systems for large-scale, data-intensive applications.
Hands-on experience with parallel/distributed processing frameworks.
Proficiency in event-driven architectures and stream processing.
Good understanding of Agile development and CI/CD pipelines.
Linux fundamentals and shell scripting skills.
Must-Have Skills
Proficiency in Java or Golang.
Deep expertise in Kafka (Kafka Connect, Kafka Streams, and Kafka security).
Strong experience with Apache Spark and Spark connector development.
Strong foundation in Linux and scripting for automation.