Experience: 3–6 years
Employment Type: Full-Time
We are seeking a technically strong, highly analytical Data Engineer to join our onsite team. The role demands hands-on expertise in workflow automation, data pipeline development, and dashboarding, along with strong database investigation and optimization skills. Experience with AWS services such as EMR, ECS, and S3 is a plus.
Key Responsibilities:
Design, build, and maintain scalable ETL/ELT workflows using Apache Airflow or Prefect.
Write clean and efficient Python and PySpark scripts for large-scale data processing.
Investigate and debug data anomalies through advanced SQL and database forensics.
Develop interactive dashboards and reports using Power BI and Metabase.
Collaborate with cross-functional teams to ensure data accuracy, consistency, and availability.
Optimize workflows and data processing jobs on AWS infrastructure, including EMR and ECS.
Set up alerts, monitoring, and logging to maintain high reliability in data operations.
Required Skills:
Hands-on experience with Apache Airflow and/or Prefect for workflow orchestration.
Proficiency in Python and PySpark for data transformation and processing.
Solid knowledge of SQL and data investigation techniques in complex database systems.
Experience building visualizations and reports with Power BI and Metabase.
Familiarity with Git, CI/CD practices, and working in fast-paced, production environments.
Good to Have:
Experience with AWS services: EMR, ECS, S3, Lambda, etc.
Understanding of data warehousing (e.g., Redshift, Snowflake, BigQuery).
Familiarity with Docker, Kubernetes, or other orchestration tools.
Exposure to CRM data environments (e.g., Salesforce, Zoho) is a bonus.
Soft Skills: