MUST BE BASED IN NYC - No Relocation
We are unable to sponsor work visas.
At Hatch, we’re building AI that doesn’t just assist behind the scenes; it converses with customers out in the wild. Backed by Y Combinator and top-tier investors like Bessemer and NextView, we’re scaling fast, doubling revenue year over year, and looking for A-players to help us cement our place as the category leader in AI for customer engagement.
What you’ll do:
Design and build scalable batch and real-time data pipelines using Kinesis, Pub/Sub, Flink, Spark, Airflow, and dbt.
Architect and implement multi-tier data lake architectures with raw/staging/curated layers, defining promotion criteria, data quality gates, and consumption patterns.
Develop and maintain production-quality APIs, SDKs, and backend services that integrate with data infrastructure.
Apply software engineering best practices—modular design, design patterns, testing, CI/CD, observability, and code reviews—to all data platform work.
Model and optimize datasets in BigQuery and Aurora PostgreSQL with attention to performance, cost, and governance.
Collaborate with backend teams to define data contracts, streaming interfaces, and service boundaries.
Implement infrastructure-as-code (Terraform, Docker, Kubernetes/EKS) for deployment automation.
Establish and monitor SLOs for data quality, latency, and availability; troubleshoot production issues across distributed systems.
Must-have software engineering experience:
3+ years building production APIs, SDKs, or backend services in Python, Go, or similar languages.
Demonstrated expertise with software design patterns (repository, factory, dependency injection, etc.) applied in real production systems—not theoretical knowledge.
Proven ability to write clean, tested, maintainable code with proper abstractions and error handling.
Experience with code reviews, CI/CD pipelines, and production deployments.
Strong computer science fundamentals: data structures, algorithms, concurrency, distributed systems.
Must-have data engineering experience:
5+ years total engineering experience, with 2+ years focused on data engineering.
Hands-on expertise with distributed data technologies: Kafka/Kinesis/Pub/Sub, Spark/Flink, Airflow, dbt, and BigQuery.
Experience with modern data lake table formats like Apache Iceberg, Delta Lake, or Apache Hudi for advanced schema management and data lake optimization.
Experience designing and implementing layered data architectures (raw/landing → refined/standardized → curated/consumption) with appropriate transformations and quality checks at each stage.
Strong SQL skills and experience with data modeling (dimensional, event-driven, domain patterns) and query optimization.
Production experience building both batch and streaming data pipelines.
Working knowledge of AWS and GCP, including monitoring/troubleshooting (CloudWatch, Prometheus/Grafana).
Familiarity with containerization, Kubernetes/EKS, and infrastructure-as-code (Terraform).
Exposure to event-driven microservices and schema governance (Parquet/Protobuf/Avro).
Excellent communication skills—can explain complex systems clearly and collaborate effectively with engineering teams.
Nice to have:
Experience with ML/LLM pipelines in production (vector databases, feature stores, prompt orchestration).
Open-source contributions or work in fast-moving startup environments.
What we offer:
Competitive salary and equity
Remote (Eastern or Central Time Zone required) OR Hybrid work environment (3 days/week in our NYC office)
Medical, dental, and vision benefits
401(k) plan
Flexible PTO
Opportunity to get in on the ground floor of a high-growth, mission-driven company
Why join:
Shape the future of AI-driven customer service
Build alongside founders and leaders who value speed, ownership, and ambition
Solve hard problems that impact real businesses and customers
Join a team of builders who care about great engineering, fast execution, and each other