GenAI Data Engineer - Senior Associate

PricewaterhouseCoopers
Full-time
On-site
Bengaluru, India

Industry/Sector

Not Applicable

Specialism

Data, Analytics & AI

Management Level

Senior Associate

Job Description & Summary

A career in our Advisory Acceleration Centre is the natural extension of PwC’s leading-class global delivery capabilities. We provide premium, cost-effective, high-quality services that support process quality and delivery capability in support of client engagements.

Job Description: GenAI Data Engineer - Senior Associate (PwC US AC)

PwC US - Acceleration Center is seeking a highly skilled and experienced GenAI Data Engineer to join our team at the Senior Associate level. As a GenAI Data Engineer, you will be responsible for developing and maintaining data pipelines, implementing machine learning models, and optimizing data infrastructure for our GenAI projects. The ideal candidate has a strong background in data engineering with a focus on GenAI technologies, and a solid understanding of data processing, event-driven architectures, containerization, and cloud computing.

Responsibilities:

- Design, develop, and maintain data pipelines and ETL processes for GenAI projects.

- Collaborate with data scientists and software engineers to implement machine learning models and algorithms.

- Optimize data infrastructure and storage solutions to ensure efficient and scalable data processing.

- Implement event-driven architectures to enable real-time data processing and analysis.

- Utilize containerization technologies like Kubernetes and Docker for efficient deployment and scalability.

- Develop and maintain data lakes for storing and managing large volumes of structured and unstructured data.

- Implement and integrate LLM frameworks (LangChain, Semantic Kernel) for advanced language processing and analysis.

- Collaborate with cross-functional teams to design and implement solution architectures for GenAI projects.

- Utilize cloud computing platforms such as Azure or AWS for data processing, storage, and deployment.

- Monitor and troubleshoot data pipelines and systems to ensure smooth and uninterrupted data flow.

- Stay up to date with the latest advancements in GenAI technologies and recommend innovative solutions to enhance data engineering processes.

- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.

- Document data engineering processes, methodologies, and best practices.

- Maintain solution architecture certifications and stay current with industry best practices.

Requirements:

  • Python Proficiency: Minimum 3 years of hands-on experience building applications with Python.

  • Scalable System Design: Solid understanding of designing and architecting scalable Python applications, particularly for GenAI use cases, including the components and architecture patterns needed to build cohesive, decoupled, and scalable applications.

  • Web Frameworks: Familiarity with Python web frameworks (Flask, FastAPI) for building web applications around AI models.

  • Modular Design & Security: Demonstrated ability to design applications with modularity, reusability, and security best practices in mind (session management, vulnerability prevention, etc.).

  • Cloud-Native Development: Familiarity with cloud-native development patterns and tools (e.g., REST APIs, microservices, serverless functions).

  • Cloud Deployments: Experience deploying and managing containerized applications on Azure/AWS (Azure Kubernetes Service, Azure Container Instances, or similar).

  • Version Control (Git): Strong proficiency in Git for effective code collaboration and management.

  • CI/CD: Knowledge of continuous integration and deployment (CI/CD) practices on cloud platforms.

  • 3-5 years of relevant technical/technology experience, with a focus on GenAI projects.

  • Strong programming skills in Python.

  • Experience with data processing frameworks like Apache Spark or similar.

  • Proficiency in SQL and database management systems.

Preferred Skills:

  • GenAI Frameworks: Experience with LLM frameworks or tools for interacting with LLMs, such as LangChain, Semantic Kernel, or LlamaIndex.

  • Data Pipelines: Experience in setting up data pipelines for model training and real-time inference.

  • Good analytical and problem-solving skills

  • Good communication and presentation skills

Travel Requirements

Not Specified

Job Posting End Date