We are looking for a Data Engineer who can work across modern data platforms and streaming frameworks to build scalable and reliable pipelines. If you enjoy working with Spark on Databricks, Kafka, Snowflake, and MongoDB — and want to solve real-world data integration challenges — this role is for you.
You will develop ETL/ELT pipelines in Databricks (PySpark notebooks) or Snowflake (SQL/Snowpark), ingesting from sources such as Confluent Kafka (see the sketch after this list).
Optimize data storage using Delta Lake/Iceberg table formats, ensuring reliability (e.g., time travel for auditing in fintech pipelines).
Integrate with the Azure ecosystem (e.g., Microsoft Fabric for warehousing, Event Hubs for streaming), supporting BI/ML teams, for example by preparing features for demand forecasting models.
Contribute to real-world use cases, such as building dashboards for healthcare outcomes or optimizing logistics routes with aggregated IoT data.
Write clean, maintainable code in Python or Scala.
Collaborate with analysts, engineers, and product teams to translate data needs into scalable solutions.
Ensure data quality, reliability, and observability across the pipelines.
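To give a flavour of the first responsibility, here is a minimal sketch of a streaming ingest on Databricks: a Spark Structured Streaming job reads JSON order events from a Confluent Kafka topic and appends them to a Delta table. The broker address, topic, payload schema, checkpoint path, and table name are hypothetical placeholders, not a description of this team's actual stack.

```python
# Minimal sketch: Kafka -> Delta on Databricks. All names and paths are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Assumed schema of the JSON payload on the (hypothetical) "orders" topic.
order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the stream from Kafka; the value column arrives as bytes.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")                      # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Parse the JSON payload into typed columns.
orders = (
    raw.select(F.from_json(F.col("value").cast("string"), order_schema).alias("o"))
    .select("o.*")
)

# Append to a Delta table; the checkpoint makes the stream restartable.
query = (
    orders.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")  # placeholder path
    .outputMode("append")
    .toTable("bronze.orders")                                 # placeholder table
)
```

Because the sink is a Delta table, earlier versions of the data stay queryable (e.g. `SELECT * FROM bronze.orders VERSION AS OF 3`), which is the kind of time-travel auditing mentioned above.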
3–6 years of hands-on experience in data engineering
Experience with Databricks / Apache Spark for large-scale data processing
Familiarity with Kafka, Kafka Connect, and streaming data use cases
Proficiency in Snowflake, including ELT design, performance tuning, and query optimization (a Snowpark sketch follows this list)
Exposure to MongoDB and experience working with flexible, document-based schemas
Strong programming skills in Python or Scala
Comfort with CI/CD pipelines, data testing, and monitoring tools
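On the Snowflake side, an ELT-style transform might be expressed in Snowpark for Python roughly as below: raw events already loaded into Snowflake are aggregated into a reporting table, with the heavy lifting pushed down to the warehouse. This is a sketch only; the connection parameters, schemas, and table names are hypothetical placeholders.

```python
# Minimal Snowpark ELT sketch; all connection details and table names are placeholders.
from snowflake.snowpark import Session, functions as F

connection_parameters = {
    "account": "<account>",       # placeholder
    "user": "<user>",             # placeholder
    "password": "<password>",     # placeholder
    "warehouse": "TRANSFORM_WH",  # placeholder
    "database": "ANALYTICS",      # placeholder
    "schema": "STAGING",          # placeholder
}
session = Session.builder.configs(connection_parameters).create()

# Transform data that has already been loaded: aggregate raw order events
# into a daily revenue table, letting Snowflake execute the query.
orders = session.table("RAW.ORDERS")  # placeholder source table
daily_revenue = (
    orders.group_by(F.to_date(F.col("EVENT_TIME")).alias("ORDER_DATE"))
    .agg(F.sum(F.col("AMOUNT")).alias("REVENUE"))
)

# Materialise the result; overwrite keeps the job idempotent across reruns.
daily_revenue.write.save_as_table("REPORTING.DAILY_REVENUE", mode="overwrite")
```

The same pattern can equally be expressed in plain Snowflake SQL or dbt models; Snowpark is shown here only because the listing calls out Python alongside SQL.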
Good to have:
Experience with Airflow, dbt, or similar orchestration tools (a minimal DAG sketch follows this list)
Experience with cloud-native stacks (AWS, GCP, or Azure)
Experience contributing to data governance and access control practices
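As an illustration of the orchestration tooling in the first item above, a minimal Airflow DAG (assuming Airflow 2.4+ for the `schedule` argument) could chain an ingest step with a downstream quality check. The DAG id, schedule, and task callables are hypothetical placeholders rather than this team's actual jobs.

```python
# Minimal Airflow DAG sketch; dag_id, schedule, and callables are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_ingest(**context):
    # Placeholder: in practice this could trigger a Databricks job or a dbt run.
    print("ingesting from Kafka into Delta")


def run_quality_checks(**context):
    # Placeholder: e.g. row-count and null checks before publishing to Snowflake.
    print("running data quality checks")


with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=run_ingest)
    checks = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)

    ingest >> checks  # quality checks run only after ingestion succeeds
```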