Data Engineer (AI)

Koger
Full-time
Remote

This is a remote position.

Position Summary

We're seeking a Data Engineer to architect and scale our data infrastructure as we handle increasingly complex and sensitive datasets. You'll lead the development of automated data pipelines that process everything from clinical notes to diagnostic imaging, ensuring compliance, performance, and reliability at scale.

This is a high-impact role where you'll shape our technical direction while working with cutting-edge data types in healthcare and media.

Key Responsibilities

Infrastructure Leadership:

  • Design and implement scalable, automated data pipelines with robust validation, transformation, and compliance controls
  • Optimize storage and retrieval systems to reduce costs while improving performance across the platform
  • Lead architectural decisions for handling petabyte-scale, multi-modal datasets

Complex Data Processing:

  • Extract, process, and transform unique data types including clinical notes, video metadata, medical imaging (DICOM), and unstructured documents
  • Build systems that handle real-time and batch processing of sensitive healthcare and media data
  • Implement data quality frameworks and monitoring systems

Cross-Functional Impact:

  • Collaborate with AI/ML teams to optimize data pipelines for model training workflows
  • Partner with product and engineering teams to deliver scalable solutions
  • Mentor junior engineers and contribute to technical strategy discussions

Compliance & Security:

  • Ensure HIPAA compliance and implement privacy-preserving data processing techniques
  • Build audit trails and governance frameworks for sensitive data handling
  • Design systems that meet enterprise security and compliance requirements

Qualifications

Required Experience:

  • 5+ years in data engineering with proven experience scaling data systems in production
  • Comfortable with ambiguity and rapid iteration; startup experience preferred
  • Technical expertise: Python and/or Java in production environments, modern data infrastructure (Spark, Kafka, Airflow), cloud platforms (AWS/GCP/Azure)
  • Independent execution: Ability to take vague requirements and deliver structured, scalable solutions

Preferred Qualifications:

  • Experience with data compliance in healthcare (HIPAA, GDPR) or with media data pipelines
  • Background with unstructured data processing (PDFs, images, video)
  • Data privacy and security experience in regulated industries
  • Previous work with ML/AI data pipelines and model training workflows

Compensation & Benefits

Compensation & Equity:

  • Competitive salary commensurate with experience
  • Meaningful equity stake in a fast-growing company
  • Comprehensive benefits package

Growth & Impact:

  • Front-row seat to solving one of AI's most strategic problems
  • Autonomy to lead high-impact projects with significant technical scope
  • Direct influence on product direction and company growth

Work Environment:

  • Remote-first culture with flexible working arrangements
  • Small, highly effective team with diverse backgrounds
  • Access to cutting-edge technology and substantial infrastructure budget
  • Collaborative environment with minimal bureaucracy

Ready to Shape the Future of AI Data Infrastructure?

Join us in building the platform that will enable the next generation of AI breakthroughs. If you're passionate about solving complex data challenges at scale and want to make a meaningful impact in healthcare and media AI, we'd love to hear from you.

Koger Recruiting is conducting this search on behalf of our client. We are an equal opportunity recruiting firm committed to diversity and inclusion.