Cloud Data Engineer
Req number:
R5934
Employment type:
Full time
Worksite flexibility:
Remote
Who we are
CAI is a global technology services firm with over 8,500 associates worldwide and a yearly revenue of $1 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right—whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
We are seeking a motivated Cloud Data Engineer who has experience building data products using Databricks and related technologies. This is a full-time, remote position.
Job Description
What You’ll Do
- Analyze and understand existing data warehouse implementations to support migration and consolidation efforts.
- Reverse-engineer legacy stored procedures (PL/SQL, SQL) and translate business logic into scalable Spark SQL code within Databricks notebooks.
- Design and develop data lake solutions on AWS using S3 and Delta Lake architecture, leveraging Databricks for processing and transformation.
- Build and maintain robust data pipelines using ETL tools with ingestion into S3 and processing in Databricks.
- Collaborate with data architects to implement ingestion and transformation frameworks aligned with enterprise standards.
- Evaluate and optimize data models (Star, Snowflake, Flattened) for performance and scalability in the new platform.
- Document ETL processes, data flows, and transformation logic to ensure transparency and maintainability.
- Perform foundational data administration tasks including job scheduling, error troubleshooting, performance tuning, and backup coordination.
- Work closely with cross-functional teams to ensure smooth transition and integration of data sources into the unified platform.
- Participate in Agile ceremonies and contribute to sprint planning, retrospectives, and backlog grooming.
- Triage, debug, and fix technical issues related to data lakes.
- Maintain and manage code repositories using version control tools such as Git.
What You'll Need
- 5+ years of experience working with Databricks, including Spark SQL and Delta Lake implementations.
- 3+ years of experience designing and implementing data lake architectures on Databricks.
- Strong SQL and PL/SQL skills with the ability to interpret and refactor legacy stored procedures.
- Hands-on experience with data modeling and warehouse design principles.
- Proficiency in at least one programming language (Python, Scala, Java).
- Bachelor’s degree in Computer Science, Information Technology, Data Engineering, or related field.
- Experience working in Agile environments and contributing to iterative development cycles.
- Databricks certification is a strong plus.
- Exposure to enterprise data governance and metadata management practices.
Physical Demands
- This role involves mostly sedentary work, with occasional movement around the office to attend meetings, etc.
- Ability to perform repetitive tasks on a computer, using a mouse, keyboard, and monitor.
Reasonable accommodation statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to application.accommodations@cai.io or (888) 824-8111.