About the Role
We are seeking a highly skilled and experienced Senior / Lead Data Engineer to design, develop, and maintain scalable, reliable, and efficient data pipelines and ETL solutions. The role requires strong expertise across multi-cloud environments, modern data warehousing platforms, programming languages, and data orchestration tools. You will play a pivotal role in transforming raw data into actionable insights, ensuring data quality, and enabling analytics and reporting initiatives across the organization.
Responsibilities
- Design, build, and optimize complex ETL/ELT data pipelines using Python, PySpark, Scala, and advanced SQL.
- Implement and manage ETL processes using Informatica PowerCenter, Databricks, AWS Glue, and Snowflake.
- Develop and deploy scalable data solutions across AWS, Azure, GCP, and Microsoft Fabric using cloud-native services.
- Manage and optimize databases including Redshift, SQL Server, and AWS RDS.
- Orchestrate and monitor data workflows with Apache Airflow to ensure reliable and timely delivery.
- Implement streaming solutions with Apache Kafka and containerized services with Kubernetes.
- Automate data workflows and system monitoring using Unix shell scripting.
- Apply CI/CD practices to data pipelines and enforce data clean room principles for privacy-compliant collaboration.
- Collaborate with BI/reporting teams to deliver optimized datasets for Tableau, Looker, and Power BI.
- Troubleshoot and resolve performance issues in pipelines and database queries.
- Maintain detailed technical documentation and collaborate closely with cross-functional teams.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Technology, or a related field.
- 10+ years of experience as a Data Engineer.
- Languages: Proficiency in SQL, Python (including PySpark), Scala, and Unix Shell Scripting.
- ETL Tools: Hands-on experience with Informatica PowerCenter, Databricks, and AWS Glue.
- Data Warehousing: Expertise in Snowflake and Redshift.
- Cloud Platforms: Strong exposure to at least two of AWS, Azure, and GCP; familiarity with Microsoft Fabric.
- Databases: Solid knowledge of Redshift, SQL Server, and AWS RDS.
- Orchestration: Proven experience with Apache Airflow.
- Streaming & Containerization: Practical experience with Apache Kafka and Kubernetes.
- Concepts: Working knowledge of CI/CD pipelines and data clean room practices.
- Reporting Tools: Understanding of data provisioning for Tableau, Looker, or Power BI.
- Strong problem-solving skills, communication ability, and a proactive approach to emerging technologies.