Requirements
We are seeking an experienced SQL & Python Data Engineer with a strong background in SQL development, performance tuning, and Python-based data engineering. The ideal candidate will have expertise in advanced SQL queries, stored procedures, and query optimization, along with hands-on experience using Apache Airflow for workflow automation. The role involves working on large-scale data platforms and ensuring efficient data processing and ETL workflows.
Key Responsibilities:
- Develop, optimize, and maintain complex SQL queries, functions, and stored procedures to support data processing and analytics workflows.
- Analyze and optimize SQL performance through indexing, partitioning, and analysis of query execution plans.
- Develop scalable and efficient data pipelines using Python for data engineering and transformation.
- Implement and manage workflow automation using Apache Airflow, ensuring smooth execution of data pipelines (a representative sketch follows this list).
- Monitor and troubleshoot SQL queries and Python scripts to ensure high performance and reliability.
- Collaborate with cross-functional teams to gather requirements and design efficient database solutions.
- Work on ETL processing, data ingestion, and transformation pipelines for structured and semi-structured data.
- Ensure data integrity and consistency by implementing best practices in SQL and Python development.
- Stay updated with industry trends and recommend improvements to optimize data infrastructure and performance.
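To give a flavor of the day-to-day Airflow work referenced above, the sketch below shows a minimal DAG for a daily three-step ETL pipeline. It is illustrative only: the DAG name, schedule, and task bodies are hypothetical placeholders, not code from any specific platform.

```python
"""Minimal Airflow DAG sketch: a daily three-step ETL pipeline.

All names below (DAG id, task ids, callables) are illustrative
placeholders, not references to a production system.
"""
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull raw rows from the source system (placeholder logic)."""
    print("extracting...")


def transform():
    """Clean and reshape the extracted data (placeholder logic)."""
    print("transforming...")


def load():
    """Write the transformed data to the warehouse (placeholder logic)."""
    print("loading...")


with DAG(
    dag_id="daily_etl_example",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load
    t_extract >> t_transform >> t_load
```

In practice the placeholder callables would hold real extraction, SQL transformation, and load logic, and dependencies can fan out beyond a simple linear chain.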
Must-Have Skills:
✔ 8+ years of hands-on experience with SQL and Python programming.
✔ Expertise in writing complex SQL queries, stored procedures, and functions, with strong performance-tuning skills.
✔ Strong experience in Python for data engineering use cases, including data transformation, ETL, and automation.
✔ Proficiency in query optimization, indexing, and partitioning strategies to improve SQL performance.
✔ Experience with Apache Airflow for workflow automation and scheduling.
✔ Strong analytical and problem-solving skills in handling large-scale data.
✔ Excellent communication and interpersonal skills, with the ability to work in high-performance environments.
Good-to-Have Skills (Preferred but Not Mandatory):
➕ Working knowledge of the Greenplum platform for large-scale data processing.
➕ Experience with distributed databases and cloud-based data warehousing.
➕ Familiarity with data pipeline orchestration tools such as Prefect or Luigi.
➕ Knowledge of big data technologies (Hadoop, Spark, Kafka, etc.).
➕ Exposure to DevOps practices for data engineering.