We are looking for a passionate and experienced Big Data Engineer to drive the design and
development of core platform frameworks using emerging big data technologies. You will
bring deep knowledge of Spark, Python/Scala, and Kafka Streams to enable the delivery and
construction processes for the Data Management, Data Discovery and Analytics group.
Title: Lead Data Engineer - Big Data
Location: Gurugram
Experience: 4-7 years
Education: Bachelor’s / Master’s in Software Engineering
Responsibilities
● Gather and process raw data at scale.
● Design and develop data applications using selected tools and frameworks.
● Read, extract, transform, stage, and load data to selected tools and frameworks.
● Build and execute data warehousing, mining, and modeling activities using agile
development techniques.
● Perform tasks such as writing scripts, calling APIs, writing SQL queries, etc.
● Deploy and monitor products on the AWS Cloud platform.
● Monitor data performance and modify infrastructure as needed.
● Develop ETL flows on the Hadoop/big data stack: Apache NiFi, Hive, Spark, Kafka, AWS
Glue, AWS Lake Formation, and AWS S3 (a minimal sketch appears after this list).
● Troubleshoot and optimize performance of data processing flows and data models.
● Develop and maintain Python-based REST APIs.
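
To give a flavor of the ETL work described above, here is a minimal PySpark sketch that reads raw JSON from S3, cleans it, and writes partitioned Parquet back to S3. The bucket paths, column names, and partitioning scheme are illustrative assumptions, not details of the actual platform.

# Minimal, hypothetical PySpark ETL sketch: raw JSON from S3 -> curated Parquet on S3.
# Bucket names, columns, and partitioning below are illustrative assumptions only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("example-etl")
    .getOrCreate()
)

# Extract: read raw events (assumed to be JSON lines) from an S3 landing zone.
raw = spark.read.json("s3a://example-landing-zone/events/")

# Transform: drop malformed rows, normalize the timestamp, derive a date partition.
cleaned = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write partitioned Parquet to a curated zone for downstream Hive/Glue tables.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-curated-zone/events/")
)

spark.stop()

In practice a job like this would typically be scheduled through AWS Glue or a workflow orchestrator, with the output tables registered in the Glue Data Catalog and governed via Lake Formation for discovery.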
Sounds Like You?
● 4+ years of big data development experience, with working proficiency in Java/Scala/Python
for writing data pipelines and data processing layers.
● Strong grounding in data structures and algorithms, plus sharp analytical and
problem-solving skills.
● Proficiency in Python, with exposure to at least one web framework.
● Proficiency with Spark, Hadoop v2, MapReduce, HDFS.
● Experience building stream-processing systems using solutions such as Storm or Spark
Streaming (see the streaming sketch after this list).
● Experience with the integration of data from multiple data sources.
● Experience with SQL and NoSQL databases.
● Experience with messaging systems such as Kafka or RabbitMQ.
● Experience with Confluent components such as Kafka Connect and Schema Registry.
● Experience with Cloudera/MapR/Hortonworks.
● Experience with AWS big data services such as Glue, Redshift, and Lake Formation.
● Experience with data lake table formats such as Hudi, Iceberg, or Delta Lake.
● Good knowledge of big data querying tools such as Pig, Hive, or Impala.
● Knowledge of various ETL techniques and frameworks, such as Flume.
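
As a flavor of the stream-processing experience called for above, here is a minimal Spark Structured Streaming sketch that consumes a Kafka topic and maintains a running per-key count. The broker address and topic name are illustrative assumptions, and the console sink is for demonstration only.

# Minimal, hypothetical Spark Structured Streaming sketch: consume a Kafka topic
# and maintain a per-key running count. Broker and topic names are assumptions.
# Requires the spark-sql-kafka connector package on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-stream").getOrCreate()

# Source: subscribe to a Kafka topic as an unbounded streaming DataFrame.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "example-events")
    .load()
)

# Kafka keys and values arrive as bytes; cast the key to string and count per key.
counts = (
    events.select(F.col("key").cast("string").alias("key"))
          .groupBy("key")
          .count()
)

# Sink: emit the full running aggregate to the console on each trigger.
query = (
    counts.writeStream
    .outputMode("complete")
    .format("console")
    .start()
)

query.awaitTermination()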