This is a remote position.
We are seeking a Senior Data Engineer to support the ingestion, processing, and synchronization of data across our analytics platform. This role focuses on using Python Notebooks to ingest data via APIs into Microsoft Fabric's Data Lake and Data Warehouse, with some data being synced to a Synapse Analytics database for broader reporting needs.
The ideal candidate will have hands-on experience with API-based data ingestion and modern data architectures, including implementing a Medallion layer architecture (Bronze, Silver, Gold) for data organization and quality management. Exposure to marketing APIs such as Google Ads, Google Business Profile, and Google Analytics 4 is a plus.
We welcome applicants globally, but this role has a preference for LATAM candidates to ensure smoother collaboration with our existing team.
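To give a flavor of the ingestion work described above, here is a minimal sketch of pulling records from a REST API inside a Fabric Python Notebook and landing them as a Bronze Delta table. The endpoint, token handling, field names, and table name are hypothetical placeholders, not a description of our actual pipelines.

```python
# Minimal sketch of API-to-Bronze ingestion in a Fabric notebook.
# The endpoint, token, and table name below are hypothetical placeholders.
import requests

API_URL = "https://api.example.com/v1/campaigns"  # placeholder endpoint
TOKEN = "..."  # secret retrieved securely, e.g. from a key vault

def fetch_page(url: str, page: int) -> list[dict]:
    """Fetch one page of records from the (hypothetical) REST API."""
    resp = requests.get(
        url,
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"page": page, "page_size": 500},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

# Collect all pages, then land the raw records in the Bronze layer as Delta.
records, page = [], 1
while True:
    batch = fetch_page(API_URL, page)
    if not batch:
        break
    records.extend(batch)
    page += 1

if records:
    df = spark.createDataFrame(records)  # `spark` is provided by the notebook runtime
    df.write.format("delta").mode("append").saveAsTable("bronze_campaigns")
```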
Key Responsibilities
- Build and maintain Python Notebooks to ingest data from third-party APIs
- Design and implement Medallion layer architecture (Bronze, Silver, Gold) for structured data organization and progressive data refinement
- Store and manage data within Microsoft Fabric's Data Lake and Warehouse using Delta Parquet file formats
- Set up data pipelines and sync key datasets to Azure Synapse Analytics
- Develop PySpark-based data transformation processes across the Bronze, Silver, and Gold layers (see the sketch after this list)
- Collaborate with developers, analysts, and stakeholders to ensure data availability and accuracy
- Monitor, test, and optimize data flows for reliability and performance
- Document processes and contribute to best practices for data ingestion and transformation
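As a rough illustration of the Medallion refinement work above, the sketch below promotes a raw Bronze table to a cleaned Silver table and an aggregated Gold table; table and column names are hypothetical. In practice, the resulting Gold tables are what get synced on to Azure Synapse Analytics for reporting.

```python
# Minimal Bronze -> Silver -> Gold sketch in PySpark (hypothetical names).
from pyspark.sql import functions as F

# Silver: deduplicate, enforce types, and drop malformed rows from Bronze.
bronze = spark.read.table("bronze_campaigns")
silver = (
    bronze
    .dropDuplicates(["campaign_id", "event_date"])
    .withColumn("event_date", F.to_date("event_date"))
    .withColumn("spend", F.col("spend").cast("double"))
    .filter(F.col("campaign_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver_campaigns")

# Gold: business-level aggregates ready for reporting.
gold = (
    silver
    .groupBy("event_date", "campaign_id")
    .agg(
        F.sum("spend").alias("total_spend"),
        F.sum("clicks").alias("total_clicks"),
    )
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold_campaign_daily")
```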
Tech Stack You'll Use
Ingestion & Processing:
- Python (Notebooks)
- PySpark
Storage & Warehousing:
- Microsoft Fabric Data Lake & Data Warehouse
- Delta Parquet files
Sync & Reporting:
- Azure Synapse Analytics
Cloud & Tooling:
- Azure Data Factory, Azure DevOps
Requirements
- Strong experience with Python for data ingestion and transformation
- Proficiency with PySpark for large-scale data processing
- Proficiency in working with RESTful APIs and handling large datasets
- Experience with Microsoft Fabric or similar modern data platforms
- Understanding of Medallion architecture (Bronze, Silver, Gold layers) and data lakehouse concepts
- Experience working with Delta Lake and Parquet file formats (see the short sketch after this list)
- Understanding of data warehousing concepts and performance tuning
- Familiarity with cloud-based workflows, especially within the Azure ecosystem
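To make the Delta Lake and performance-tuning requirements concrete, here is a short, hypothetical sketch of a partitioned Delta write and a filtered read that benefits from partition pruning; all table and column names are illustrative only.

```python
# Hypothetical example: partitioned Delta write plus a pruned read (PySpark).
from pyspark.sql import functions as F

events = spark.read.table("silver_campaigns")

# Partitioning by date keeps each day's files together, so date-filtered
# queries scan only the relevant partitions.
(
    events.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("silver_campaigns_by_date")
)

# Reads that filter on the partition column skip unrelated files entirely.
recent = spark.read.table("silver_campaigns_by_date").filter(
    F.col("event_date") >= "2024-01-01"
)
```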
Nice to Have
- Experience with marketing APIs such as Google Ads or Google Analytics 4
- Familiarity with Azure Synapse and Data Factory pipeline design
- Understanding of data modeling for analytics and reporting use cases
- Experience with AI coding tools
- Experience with Fivetran, Airbyte, and Rivery