City: Los Angeles, CA
Onsite/Hybrid/Remote: Remote
Duration: 6 months
Rate Range: Up to $92.50/hr on W2, depending on experience (no C2C, 1099, or subcontract)
Work Authorization: GC, USC, and all valid EADs except OPT, CPT, and H1B
Core Skills:
Expertise in Big Data engineering pipelines, Spark, Python, MPP databases/SQL (Snowflake), and cloud environments (AWS)
Must Have:
- Expertise in Big Data engineering pipelines
- Strong SQL and MPP Databases (Snowflake, Redshift, or BigQuery)
- Apache Spark (PySpark, Scala, Hadoop ecosystem)
- Python/Scala/Java programming
- Cloud Environments (AWS – S3, EMR, EC2)
- Data Warehousing and Data Modeling
- Data orchestration/ETL tools (Airflow or similar)
Responsibilities:
- Design, build, and optimize large-scale data pipelines and warehousing solutions.
- Develop ETL workflows in Big Data environments across cloud, on-prem, or hybrid setups.
- Collaborate with Data Product Managers, Architects, and Engineers to deliver scalable and reliable data solutions.
- Define data models and frameworks for data warehouses and marts supporting analytics and audience engagement.
- Maintain strong documentation practices for data governance and quality standards.
- Ensure solutions meet SLAs, operate efficiently, and support analytics and data science teams.
- Contribute to Agile/Scrum processes and continuously drive team improvements.
Qualifications:
- 6+ years of experience in data engineering with large, distributed data systems.
- Strong SQL expertise with the ability to create performant datasets.
- Hands-on experience with Spark and the Hadoop ecosystem (HDFS, Hive, Presto, PySpark).
- Proficiency in Python, Scala, or Java.
- Experience with at least one major MPP or cloud database (Snowflake preferred, Redshift or BigQuery acceptable).
- Experience with orchestration tools such as Airflow.
- Strong knowledge of data modeling techniques and data warehousing best practices.
- Familiarity with Agile methodologies.
- Excellent problem-solving, analytical, and communication skills.
- Bachelor’s degree in a STEM field required.