Senior Data Engineer (AI, ML)

Oracle
Full-time
On-site
Mexico
Senior Jobs
Description

Essential Skills

  • Proficiency in Python (Java a plus) with hands-on experience in modern ML frameworks such as PyTorch and TensorFlow, plus a solid foundation in statistics and data modeling.
  • Experience building end-to-end ML and GenAI pipelines, including data preprocessing, feature engineering, model training, validation, and production deployment.
  • Practical expertise in Generative AI and RAG systems, including embeddings, chunking strategies, hybrid retrieval, reranking, and evaluation techniques.
  • Hands-on experience with agentic AI workflows, including prompt engineering, intent routing, tool orchestration, function calling, and safe tool use with guardrails.
  • Experience with enterprise software development and cloud-native architectures, including REST APIs, microservices, containerization, CI/CD, and platforms such as AWS, Azure, GCP, or Oracle Cloud.
  • Strong problem-solving skills, with the ability to translate business requirements into scalable, reliable, and cost-effective AI solutions.
  • Excellent written and verbal communication skills, with the ability to work effectively in a collaborative, cross-functional, and global team environment.

Responsibilities

AI/ML

  • Design, train, and optimize machine learning models for real-world applications.
  • Build end-to-end ML pipelines, including data preprocessing, feature engineering, model training, validation, and deployment.
  • Collaborate with data engineers and software developers to integrate ML models into production systems.
  • Monitor model performance, detect data drift, and retrain models for continuous improvement.

GenAI

  • Agentic Solution Design & Orchestration
    • Architect LLM-powered applications, including intent routing across tools and skills.
    • Implement agentic workflows using frameworks such as LangGraph or equivalents; decompose tasks, manage tool invocation, and enforce guardrails while keeping behavior deterministic.
    • Integrate MCP-compatible tools and services to extend system capabilities.
  • Retrieval & Embeddings
    • Build effective RAG systems: chunking strategies, embedding model selection, vector indexing, reranking, and grounding to authoritative data.
    • Optimize vector stores and search using approximate nearest neighbor (ANN) indexing, hybrid retrieval, filters, and metadata schemas.
  • Prompting & Model Strategy
    • Develop robust prompting patterns and templates; structure prompts for tool use and function calling.
    • Compare generic vs. fine-tuned LLMs for intent routing; make data-driven choices on cost, latency, accuracy, and maintainability.
  • Data & Integrations
    • Implement NL2SQL patterns with guarded SQL execution; connect to microservices and enterprise systems via secure APIs.
    • Define and enforce data schemas, metadata, and lineage for reliable retrieval.
  • Production Readiness
    • Establish evaluation datasets and automated regression tests for RAG and agent systems.
    • Monitor quality (precision/recall, hallucination rate), latency, cost, and safety.
    • Apply guardrails, PII handling, access controls, and policy enforcement end-to-end.

MLOps / LangOps

  • Version prompts, models, embeddings, and pipelines; manage A/B tests and rollout strategies.
  • Instrument tracing and telemetry for agent steps and tool calls; implement fallback, timeout, and retry policies.

Core Qualifications

  • Programming
    • Strong proficiency in Python (NumPy, Pandas, Scikit-learn); experience with ML frameworks such as TensorFlow and PyTorch.
  • Machine Learning & Deep Learning
    • Hands-on experience with supervised, unsupervised, and reinforcement learning techniques.
  • Mathematics & Statistics
    • Solid foundation in linear algebra, probability, optimization, and statistical modeling.
  • Data Handling
    • Experience with SQL and NoSQL databases, data preprocessing, and feature engineering.
  • GenAI Expertise
    • Strong understanding of vector embeddings and similarity search (cosine, inner product, L2), chunking strategies, and reranking.
    • Hands-on experience building RAG pipelines (indexing, metadata, hybrid search, evaluators).
    • Practical prompt engineering for tool use, function calling, and agent planning.
    • Experience with agentic frameworks (e.g., LangGraph or similar) and orchestration of tools and services; familiarity with MCP and tool-integration patterns.
    • Knowledge of NL2SQL techniques, SQL safety (schema constraints, query sandboxes), and microservice integration.
    • Ability to evaluate tradeoffs between generic/base LLMs and fine-tuned/task-specific models (accuracy, drift, data/ops burden, latency, and cost).
    • Proficiency with Python and common LLM/RAG libraries; experience with containerization and CI/CD.
    • Understanding of enterprise security, privacy, and compliance; RBAC/ABAC for data access, logging, and auditability.

  • MLOps & Deployment
    • Familiarity with model deployment and MLOps platforms (MLflow, Kubeflow, SageMaker, Vertex AI), CI/CD pipelines, and containerization using Docker and Kubernetes.

Preferred Experience

  • Hands-on experience with at least one major cloud provider (AWS, Azure, GCP, OCI).
  • Experience with large-scale distributed systems and big data frameworks (Spark, Hadoop).
  • Retrieval optimization using hybrid lexical + vector search, metadata filtering, and learned rerankers.
  • Model fine-tuning (SFT, DPO), adapter methods (LoRA), and evaluation.
  • Observability stacks for LLM applications (tracing, evaluation dashboards, cost/latency SLOs).
  • Document AI (OCR, layout parsing) and schema construction for unstructured data.
  • Caching, batching, and KV-cache optimization for throughput and cost efficiency.
  • Safe tool-use patterns, including constrained decoding, JSON schemas, and policy checks.

How We’ll Assess

  • Portfolio or walkthrough of a production RAG or agent system: objectives, architecture, evaluations, and outcomes.
  • Hands-on exercise: design an intent router, justify model choice (generic vs. fine-tuned), propose chunking and metadata strategy, and define evaluation metrics.
  • Discussion of failure modes (hallucinations, tool errors, SQL risk) and mitigation strategies.
  • Approach to governance: access controls, PII handling, audit logging, and red-teaming.


Qualifications

Career Level - IC4


