Big Data Engineer (Python, Java, Analytics, Storage)
Role Overview:
We’re looking for a highly skilled Big Data Engineer to help architect and implement large-scale data storage and analytics platforms as part of a global client implementation. You’ll be responsible for managing data flow from ingestion through to storage and insight generation, using both open-source and enterprise technologies.
Key Responsibilities:
- Design and build end-to-end data pipelines from ingestion to analytics-ready format.
- Leverage Python and Java to process large datasets, build transformation logic, and optimize performance.
- Integrate real-time and batch data flows using Kafka, RabbitMQ, and NiFi.
- Model and manage scalable data storage solutions for analytics using HDFS, S3, or data lake tools.
- Support business and client teams with accessible, structured datasets and metadata tagging.
- Enable search and observability using Elasticsearch and visualization tools.
Required Skills:
- 4+ years’ experience in big data engineering, with a proven record of delivery in client implementation settings.
- Proficient in Python and Java for large-scale ETL and transformation.
- Experience with Kafka, RabbitMQ, NiFi, and streaming architecture design.
- Deep understanding of data lake/warehouse design, data partitioning, and indexing.
- Experience with Elasticsearch and ZooKeeper in a production context.
- Familiarity with security and compliance practices in multi-tenant data environments.
Nice to Have:
- Experience integrating with BI tools or data visualization platforms.
- Knowledge of ML model deployment pipelines.
Other Requirements:
- Eligibility for Top-Level Security Clearance: Candidates must be eligible to obtain and maintain security clearance at the highest level, in accordance with applicable national security regulations.
- On-Site Work Requirement: This role requires full-time, on-site presence at the client’s premises in Pretoria. Remote and hybrid work arrangements are not available.