Introduction to the Role:
Are you passionate about unlocking the power of data to drive innovation and transform business outcomes? Join our cutting-edge Data Engineering team and be a key player in delivering scalable, secure, and high-performing data solutions across the enterprise. As a Data Engineer, you will play a central role in designing and developing modern data pipelines and platforms that support data-driven decision-making and AI-powered products. With a focus on Python, SQL, AWS, PySpark, and Databricks, you'll enable the transformation of raw data into valuable insights by applying engineering best practices in a cloud-first environment.
We are looking for a highly motivated professional who can work across teams to build and manage robust, efficient, and secure data ecosystems that support both analytical and operational workloads.
Accountabilities:
• Design, build, and optimize scalable data pipelines using PySpark, Databricks, and SQL on AWS cloud platforms.
• Collaborate with data analysts, data scientists, and business users to understand data requirements and ensure reliable, high-quality data delivery.
• Implement batch and streaming data ingestion frameworks from a variety of sources (structured, semi-structured, and unstructured data).
• Develop reusable, parameterized ETL/ELT components and data ingestion frameworks.
• Perform data transformation, cleansing, validation, and enrichment using Python and PySpark.
• Build and maintain data models, data marts, and logical/physical data structures that support BI, analytics, and AI initiatives.
• Apply best practices in software engineering, version control (Git), code reviews, and agile development processes.
• Ensure data pipelines are well-tested, monitored, and robust with proper logging and alerting mechanisms.
• Optimize performance of distributed data processing workflows and large datasets.
• Leverage AWS services (such as S3, Glue, Lambda, EMR, Redshift, Athena) for data orchestration and lakehouse architecture design.
• Participate in data governance practices and ensure compliance with data privacy, security, and quality standards.
• Contribute to documentation of processes, workflows, metadata, and lineage using data catalog tools such as Collibra, where applicable.
• Drive continuous improvement in engineering practices, tools, and automation to increase productivity and delivery quality.
Essential Skills / Experience:
• 4 to 6 years of professional experience in Data Engineering or a related field.
• Strong programming experience with Python and experience using Python for data wrangling, pipeline automation, and scripting.
• Deep expertise in writing complex and optimized SQL queries on large-scale datasets.
• Solid hands-on experience with PySpark and distributed data processing frameworks.
• Expertise working with Databricks for developing and orchestrating data pipelines.
• Experience with AWS cloud services such as S3, Glue, EMR, Athena, Redshift, and Lambda.
• Practical understanding of ETL/ELT development patterns and data modeling principles (Star/Snowflake schemas).
• Experience with job orchestration tools like Airflow, Databricks Jobs, or AWS Step Functions.
• Understanding of data lake, lakehouse, and data warehouse architectures.
• Familiarity with DevOps and CI/CD tools for code deployment (e.g., Git, Jenkins, GitHub Actions).
• Strong troubleshooting and performance optimization skills in large-scale data processing environments.
• Excellent communication and collaboration skills, with the ability to work in cross-functional agile teams.
Desirable Skills / Experience:
• AWS or Databricks certifications (e.g., AWS Certified Data Analytics, Databricks Data Engineer Associate/Professional).
• Exposure to data observability, monitoring, and alerting frameworks (e.g., Monte Carlo, Datadog, CloudWatch).
• Experience working in healthcare, life sciences, finance, or another regulated industry.
• Familiarity with data governance and compliance standards (GDPR, HIPAA, etc.).
• Knowledge of modern data architectures (Data Mesh, Data Fabric).
• Exposure to streaming data tools like Kafka, Kinesis, or Spark Structured Streaming.
• Experience with data visualization tools such as Power BI, Tableau, or QuickSight.
Work Environment & Collaboration:
We value a hybrid, collaborative environment that encourages shared learning and innovation. You will work closely with product owners, architects, analysts, and data scientists across geographies to solve real-world business problems using cutting-edge technologies and methodologies. We encourage flexibility while maintaining a strong in-office presence for better team synergy and innovation.
About Agilisium:
• Agilisium is an AWS Advanced Consulting Partner that enables companies to accelerate their "Data-to-Insights" leap.
• With $50+ million in annual revenue and over 30% year-over-year growth, Agilisium is one of the fastest-growing IT solution providers in Southern California.
• Our most important asset? People.
• Talent management plays a vital role in our business strategy.
• We’re looking for “drivers”: big thinkers with a growth and strategic mindset who are committed to customer obsession and aren’t afraid to experiment with new ideas.
• And we are all about finding and nurturing individuals who are ready to do great work.
• At Agilisium, you’ll collaborate with great minds while being challenged to meet and exceed your potential.