We are looking for Data Engineers for the data projects of our Fortune 50 client. The main objective of this project is to modernize the client's data infrastructure by migrating to Databricks Unity Catalog, with a full transition to DBX-as-a-Service (DBXaaS). This crypto project entails a comprehensive migration of all data assets, including tables, pipelines, permissions, and associated components, across all environments. The goal is to establish a unified governance model that ensures consistency, improves security, and enhances the manageability and scalability of the data platform.
This is a remote-first position for engineers based in Europe, Turkey, and the Middle East, with a required overlap with US working hours (2-6 PM CET).
Migration of data infrastructure to Databricks
Transition to DBXaaS
Design data models.
Develop SQL code to define data structures and transform data from staging to marts.
Create Source to Target Mappings (STMs) for ETL specifications.
Evaluate data sources, assess quality, and determine the best integration approach.
Develop strategies for integrating data from various sources and data warehouses.
Optimize and maintain the data pipeline for SDCM, ECM, and DCM, flattening JSON into Databricks tables.
Work with data vault modeling (a plus).
Implement changes to the JSON flattening process based on business needs (see the flattening sketch after this list).
Write and execute unit tests to ensure code accuracy.
Optimize the performance of the data pipeline and fix data quality issues.
Implement active monitoring for both data pipelines and data quality.
Gather requirements, set targets, define interface specifications, and conduct design sessions.
Work closely with data consumers to ensure proper integration.
Adapt and learn in a fast-paced project environment.
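As a rough illustration of the JSON flattening work described above, here is a minimal PySpark sketch that explodes a nested payload into a flat Delta table on Databricks. The source path, column names, and target table are hypothetical placeholders, not the client's actual schema or pipeline.

```python
# Minimal PySpark sketch: flatten a nested JSON feed into a Delta table.
# All names (paths, columns, target table) are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Read the raw JSON feed from a staging location (placeholder path).
raw = spark.read.json("/mnt/staging/events/*.json")

# Flatten a nested struct and an array of line items into top-level columns.
flat = (
    raw
    .withColumn("item", F.explode_outer("payload.items"))  # one row per array element
    .select(
        F.col("event_id"),
        F.col("event_ts").cast("timestamp").alias("event_ts"),
        F.col("payload.customer.id").alias("customer_id"),
        F.col("item.sku").alias("sku"),
        F.col("item.quantity").cast("int").alias("quantity"),
    )
)

# Append the flattened result to a table in the marts layer (placeholder name).
flat.write.format("delta").mode("append").saveAsTable("marts.fct_events")
```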
Start Date: ASAP
Location: Remote
Working hours: overlap with the US time zone required (2-6 PM CET)
Long-term contract-based role: 6+ months
Strong Spark and SQL skills for ETL, data modeling, and performance tuning.
Experience with Databricks.
Proficiency in Python, especially for handling and flattening complex JSON structures.
Hands-on experience with cloud architecture (Azure preferred; AWS, GCP).
Experience with the data orchestration tool Airflow.
Proficiency in RDBMS/NoSQL data stores and appropriate use cases.
Understanding of software engineering and testing practices within an Agile environment.
Experience with Data as Code: version control, small and regular commits, unit tests, CI/CD, and packaging; familiarity with containerization tools such as Docker (must have) and Kubernetes (a plus); see the unit-test sketch after this list.
Excellent teamwork and communication skills.
Proficiency in English, with strong written and verbal communication skills.
Experience building efficient, high-performance data pipelines for real-time and batch data processing.
Knowledge of cryptography and its application in blockchain is a plus.
Experience with blockchain indexing is a plus.
Familiarity with Delta Lake and data warehousing concepts is a plus.
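To illustrate the "Data as Code" expectation above (unit tests, small commits, CI/CD), here is a minimal pytest-style sketch that checks a pure-Python JSON flattening helper. The function, field names, and test are hypothetical examples, not part of the client's codebase.

```python
# Minimal pytest sketch: unit-testing a JSON flattening helper.
# flatten_record and its field names are hypothetical, for illustration only.

def flatten_record(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Recursively flatten nested dictionaries into a single-level dict."""
    items = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten_record(value, new_key, sep))
        else:
            items[new_key] = value
    return items


def test_flatten_record_handles_nested_payload():
    record = {"event_id": 1, "payload": {"customer": {"id": "c-42"}, "total": 9.5}}
    assert flatten_record(record) == {
        "event_id": 1,
        "payload_customer_id": "c-42",
        "payload_total": 9.5,
    }
```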