We are seeking a highly skilled and experienced Lead Data Engineer to join our data team in the New York City office. The ideal candidate will have a strong background in building scalable data pipelines, managing cloud-based data solutions, and working extensively with Databricks and AWS services. You will play a key role in designing, developing, and maintaining our data infrastructure to support analytics and business intelligence initiatives.
Lead Data Engineer
We will count on you to:
Lead the design and implementation of data ingestion processes from diverse sources into the Databricks Lakehouse platform, ensuring adherence to architectural standards.
Oversee the development, maintenance, and optimization of data models and ETL pipelines supporting the Medallion Architecture (Bronze, Silver, Gold layers) to improve data processing and transformation workflows.
Drive data integration, cleansing, and consolidation efforts using Databricks, with Delta Lake providing data versioning and reliability.
Manage data governance policies via Unity Catalog, ensuring secure, compliant, and centralized data access across the organization.
Lead the migration of existing ETL processes from Informatica IICS to cloud-based pipelines within Databricks, minimizing disruption and maximizing efficiency.
Collaborate with clients and stakeholders to support architectural design, address technical queries, and provide strategic guidance on Databricks Lakehouse utilization.
Mentor and lead a team of data engineers, fostering a culture of continuous learning, innovation, and best practices.
Stay current with industry trends and emerging technologies in data engineering, especially related to Databricks, cloud solutions, and ETL migration strategies.
Promote data and analytics capabilities to business stakeholders, educating them on leveraging the Medallion Architecture for their analytical needs.
What you need to have:
Bachelor’s or master’s degree in computer science or equivalent.
Proficiency in SQL and experience with enterprise data warehousing technologies such as Databricks and Redshift.
Proven experience designing, developing, and automating data pipelines, and familiarity with version control (Git), CI/CD, and DevOps practices.
Strong analytical, problem-solving, and communication skills, with the ability to translate complex concepts for diverse audiences.
Extensive experience working in enterprise environments with strict data governance, security, and compliance requirements.
More than 10 years of practical experience with cloud-based data engineering platforms such as Databricks, AWS S3, Athena, Glue, or similar.
Demonstrated thought leadership through best practices, standards, and comprehensive documentation of data processes and architectures.
What makes you stand out:
Proficiency in data visualization tools such as Power BI, Tableau, or equivalent.
Experience with agile project delivery and collaborative team environments.
Knowledge of containerization (Docker, Kubernetes) and orchestration tools like Apache Airflow.
Certifications in Databricks and AWS are highly desirable.
Why join our team:
We help you be your best through professional development opportunities, interesting work and supportive leaders.
We foster a vibrant and inclusive culture where you can work with talented colleagues to create new solutions and make an impact for colleagues, clients and communities.