- Design and implement robust ETL/ELT processes to extract, transform, and load data from diverse sources into data warehouses and data lakes (a minimal sketch follows this list).
- Manage and optimize data pipelines for performance and reliability.
- Conduct data preprocessing and cleaning to ensure data quality and integrity.
- Collaborate with cross-functional teams to understand data requirements and deliver timely solutions.
- Handle both structured and unstructured data across relational and NoSQL stores, including Oracle, MySQL, DB2, and MongoDB.
- Use tools such as Jupyter Notebook and Power BI for data analysis and visualization.
- Develop and maintain documentation for data engineering processes and workflows.
- Monitor and troubleshoot data pipeline issues and implement solutions for continuous improvement.
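
For a flavor of the day-to-day pipeline work described above, here is a minimal PySpark ETL sketch. It is illustrative only: the JDBC URL, credentials, table name, and lake path are hypothetical placeholders, not values tied to this role.

```python
# Minimal ETL sketch in PySpark. All connection details, table names,
# and paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: pull a table from a relational source over JDBC.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://db-host:3306/sales")  # hypothetical host/db
    .option("dbtable", "orders")                       # hypothetical table
    .option("user", "etl_user")
    .option("password", "etl_password")
    .load()
)

# Transform: basic cleaning -- drop duplicates, filter out null keys,
# and derive a date column from a timestamp.
cleaned = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet into the data lake.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://data-lake/curated/orders/"                  # hypothetical path
)
```

In practice the JDBC driver would need to be on the Spark classpath and credentials would come from a secrets manager rather than inline options; the sketch only shows the extract-transform-load shape of the job.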
Requirements:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 5 years of hands-on experience as a Data Engineer or in a similar role.
- Strong expertise in ETL/ELT processes and data preprocessing techniques.
- Proficiency with Oracle, MySQL, MongoDB, and DB2 databases.
- Solid experience with data pipeline management and orchestration tools.
- Strong programming skills in Python and PySpark; familiarity with Scala is a plus.
- Experience using VS Code for development.
- Ability to handle structured and unstructured data effectively (see the sketch after this list).
- Familiarity with data visualization tools, especially Power BI.
- Excellent problem-solving skills and attention to detail.
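
To illustrate the mixed-data requirement above, the sketch below combines a structured Parquet extract with semi-structured JSON events in PySpark. Paths, the `customer_id` key, and the nested field names are invented for the example.

```python
# Sketch of combining structured and semi-structured data in PySpark.
# All file paths and field names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("mixed_data").getOrCreate()

# Structured: a curated Parquet extract with a fixed schema.
customers = spark.read.parquet("s3a://data-lake/curated/customers/")

# Semi-structured: raw JSON events whose schema is inferred at read time.
events = spark.read.json("s3a://data-lake/raw/clickstream/*.json")

# Flatten a nested field and join the two shapes on a shared key.
enriched = (
    events.withColumn("page", F.col("context.page.url"))
    .join(customers, on="customer_id", how="left")
)
enriched.groupBy("page").count().show()
```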
Preferred Qualifications:
- Experience with cloud-based data platforms (e.g., AWS, Azure).
- Understanding of data warehousing concepts and best practices.
- Knowledge of big data technologies and frameworks.