At GKM IT, we don’t just store data—we shape it into meaningful insight. We’re looking for a Data Engineer - Senior I to help us build and scale data infrastructure that powers real-time decision-making and intelligent product experiences.
If transforming messy data into clean, reliable datasets excites you, and you love designing systems that perform at scale, you’ll thrive in this role. You’ll work at the intersection of cloud technologies, big data, and automation, shaping how we handle, manage, and use data across the organization.
3–5 years of experience in Data Engineering or similar roles
Strong foundation in cloud-native data infrastructure and scalable architecture design
Build and maintain reliable, scalable ETL/ELT pipelines using modern cloud-based tools (see the sketch after this list)
Design and optimize data lakes and data warehouses for real-time and batch processing
Ingest, transform, and organize large volumes of structured and unstructured data
Collaborate with analysts, data scientists, and backend engineers to define data needs
Monitor, troubleshoot, and improve pipeline performance, cost-efficiency, and reliability
Implement data validation, consistency checks, and quality frameworks
Apply data governance best practices and ensure compliance with privacy and security standards
Use CI/CD tools to automate the deployment of data workflows and pipelines
Automate repetitive tasks using scripting, workflow tools, and scheduling systems
Translate business requirements into data models and pipeline logic while working cross-functionally
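To give a concrete flavour of the day-to-day work, here is a minimal, illustrative sketch of the kind of pipeline described above, written as an Airflow TaskFlow DAG (one of the orchestration tools listed further down). The file paths, the order_id column, and the DAG name are hypothetical, and a recent Airflow 2.x release plus pandas are assumed; a production pipeline would read from and write to cloud storage and a warehouse rather than local files.

```python
# Illustrative only: a minimal daily extract -> transform -> load DAG.
# Paths, column names, and the DAG name are hypothetical.
from datetime import datetime

import pandas as pd
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["etl"])
def daily_orders_etl():
    """Toy ETL flow; each task passes a file path to the next one."""

    @task
    def extract() -> str:
        # A real pipeline would pull from an API, a database, or object storage
        # (e.g. an S3 bucket); here we simply point at a local file.
        return "/tmp/orders_raw.csv"

    @task
    def transform(raw_path: str) -> str:
        df = pd.read_csv(raw_path)
        # Basic cleaning: drop exact duplicates and rows missing the key column.
        df = df.drop_duplicates().dropna(subset=["order_id"])
        clean_path = "/tmp/orders_clean.csv"
        df.to_csv(clean_path, index=False)
        return clean_path

    @task
    def load(clean_path: str) -> None:
        # A real pipeline would COPY/LOAD into Redshift, BigQuery, or Snowflake;
        # printing keeps the sketch self-contained.
        print(f"Would load {clean_path} into the warehouse")

    load(transform(extract()))


daily_orders_etl()
```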
Strong in Python and familiar with libraries like pandas and PySpark
Hands-on experience with at least one major cloud provider (AWS, Azure, GCP)
Experience with ETL tools like AWS Glue, Azure Data Factory, GCP Dataflow, or Apache NiFi
Proficient with storage systems like S3, Azure Blob Storage, GCP Cloud Storage, or HDFS
Familiar with data warehouses like Redshift, BigQuery, Snowflake, or Synapse
Experience with serverless compute services like AWS Lambda, Azure Functions, or GCP Cloud Functions
Familiar with data streaming tools like Kafka, Kinesis, Pub/Sub, or Event Hubs
Proficient in SQL, with knowledge of relational (PostgreSQL, MySQL) and NoSQL (MongoDB, DynamoDB) databases
Familiar with big data frameworks like Hadoop or Apache Spark
Experience with orchestration tools like Apache Airflow, Prefect, GCP Workflows, or ADF Pipelines
Familiar with CI/CD tools like GitLab CI, Jenkins, or Azure DevOps
Proficient with Git, GitHub, or GitLab workflows
Strong communication and collaboration skills, with a problem-solving mindset
Experience with data observability or monitoring tools (bonus points)
Contributions to internal data platform development (bonus points)
Comfort working in data mesh or distributed data ownership environments (bonus points)
Experience building data validation pipelines with Great Expectations or similar tools (bonus points)
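As a concrete illustration of the validation and quality-check work mentioned above, here is a minimal sketch of lightweight data-quality checks written in plain pandas. The column names, rules, and sample data are hypothetical; in practice a dedicated framework such as Great Expectations would typically manage these checks and alert on failures.

```python
# Illustrative only: lightweight data-quality checks on a pandas DataFrame.
# Column names and rules are hypothetical.
import pandas as pd


def validate_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable validation failures (empty = all checks passed)."""
    failures = []

    # Completeness: key columns must not contain nulls.
    for col in ("order_id", "customer_id", "amount"):
        if df[col].isna().any():
            failures.append(f"nulls found in required column '{col}'")

    # Uniqueness: order_id acts as the primary key.
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values found")

    # Range check: amounts should be non-negative.
    if (df["amount"] < 0).any():
        failures.append("negative values found in 'amount'")

    return failures


if __name__ == "__main__":
    sample = pd.DataFrame(
        {"order_id": [1, 2, 2], "customer_id": [10, 11, 12], "amount": [99.5, -3.0, 40.0]}
    )
    for failure in validate_orders(sample):
        print("FAILED:", failure)
```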