
Lead/Senior Data Engineer

Dataeconomy
Full-time
On-site
Hyderabad, Telangana, India
We are seeking a highly experienced, hands-on Lead/Senior Data Engineer to architect, develop, and optimize data solutions in a cloud-native environment. The ideal candidate will have 7–12 years of experience and strong technical expertise in AWS Glue, PySpark, and Python, along with a track record of designing robust data pipelines and frameworks for large-scale enterprise systems. Prior exposure to the financial domain or regulated environments is a strong advantage.

Key Responsibilities:

  • Solution Architecture: Design scalable and secure data pipelines using AWS Glue, PySpark, and related AWS services (EMR, S3, Lambda, etc.).

  • Leadership & Mentorship: Guide junior engineers, conduct code reviews, and enforce best practices in development and deployment.

  • ETL Development: Lead the design and implementation of end-to-end ETL processes for structured and semi-structured data.

  • Framework Building: Develop and evolve data frameworks, reusable components, and automation tools to improve engineering productivity.

  • Performance Optimization: Optimize large-scale data workflows for performance, cost, and reliability.

  • Data Governance: Implement data quality, lineage, and governance strategies in compliance with enterprise standards.

  • Collaboration: Work closely with product, analytics, compliance, and DevOps teams to deliver high-quality solutions aligned with business goals.

  • CI/CD Automation: Set up and manage continuous integration and deployment pipelines using AWS CodePipeline, Jenkins, or GitLab.

  • Documentation & Presentations: Prepare technical documentation and present architectural solutions to stakeholders across levels.



Requirements

Required Qualifications:

  • 7–12 years of experience in data engineering or related fields.

  • Strong expertise in Python programming with a focus on data processing.

  • Extensive experience with AWS Glue (both Glue Jobs and Glue Studio/Notebooks).

  • Deep hands-on experience with PySpark for distributed data processing.

  • Solid AWS knowledge: EMR, S3, Lambda, IAM, Athena, CloudWatch, Redshift, etc.

  • Proven experience in architecting and managing complex ETL workflows.

  • Proficiency with Apache Airflow or similar orchestration tools.

  • Hands-on experience with CI/CD pipelines and DevOps best practices.

  • Familiarity with data quality, data lineage, and metadata management.

  • Strong experience working in agile/scrum teams.

  • Excellent communication and stakeholder engagement skills.

Preferred/Good to Have:

  • Experience in financial services, capital markets, or compliance systems.

  • Knowledge of data modeling, data lakes, and data warehouse architecture.

  • Familiarity with SQL (Athena/Presto/Redshift Spectrum).

  • Exposure to ML pipeline integration or event-driven architecture is a plus.



Benefits


  • Flexible work culture and remote options.

  • Opportunity to lead cutting-edge cloud data engineering projects.

  • Skill-building in large-scale, regulated environments.