This is a remote position.
Responsibilities:
- As a developer, possess excellent knowledge of distributed computing architecture, core Hadoop components (HDFS, Spark, YARN, MapReduce, HBase, Hive, Impala) and related technologies.
- Technical design and development of ETL/Hadoop and analytics services/components
- Contribute to end-to-end architecture and process flow
- Understand business requirements and publish reusable designs
- Results-oriented approach with the ability to provide apt solutions.
- Proficient in performance improvement and fine-tuning of ETL and Hadoop implementations
- Conduct code reviews across projects; take responsibility for ensuring that builds and code adhere to architectural and quality standards and policies.
- Work independently with minimal supervision.
- Strong analytical and problem-solving skills
- Experience with SQL, including advanced SQL skills (an illustrative sketch follows this list)
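
The following is a minimal, illustrative sketch only, not a deliverable of the role: it shows the kind of Spark-based ETL step with an advanced (window-function) SQL query implied by the responsibilities above. The database, table, column names, and output path are hypothetical.

    # Illustrative sketch only: a small PySpark ETL step using a window-function
    # query of the kind described above. All names and paths are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("etl-sketch").enableHiveSupport().getOrCreate()
    )

    # Rank each customer's orders by value using an analytic (window) function.
    ranked = spark.sql("""
        SELECT customer_id,
               order_id,
               order_total,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY order_total DESC) AS rnk
        FROM sales_db.orders
    """)

    # Keep each customer's three largest orders and persist them as Parquet.
    ranked.filter("rnk <= 3").write.mode("overwrite").parquet("/data/top_orders")
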
Requirements
Skill Set:
- Strong understanding of distributed computing architecture, core Hadoop components (HDFS, Spark, YARN, MapReduce, HBase, Hive, Impala) and related technologies.
- Hands-on experience with batch data ingestion (Sqoop); see the ingestion sketch after this list
- Expert-level understanding of relational data structures and RDBMSs, as well as NoSQL databases (Cassandra, MongoDB, Elasticsearch)
- Experience with automation and scheduling of workflows/jobs (via shell scripting, Tivoli)
- Solid grasp of data storage formats and stores (Parquet, Avro, HBase, Cassandra)
- Understanding of Agile methodologies as well as SDLC processes.
- Strong understanding of data warehousing and data lakes
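
As a minimal sketch only, assuming a relational source reachable over JDBC: the batch-ingestion pattern referenced above (a Sqoop-style import landed as Parquet in the data lake) can be expressed as follows. The connection URL, credentials, table name, and target path are hypothetical placeholders.

    # Illustrative sketch only: a batch ingestion step analogous to a Sqoop import,
    # written with Spark's JDBC reader. Connection details, table name, and the
    # target path are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ingest-sketch").getOrCreate()

    customers = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://db-host:3306/crm")  # hypothetical source
        .option("dbtable", "customers")
        .option("user", "etl_user")
        .option("password", "***")
        .option("partitionColumn", "customer_id")  # parallel extract,
        .option("numPartitions", 4)                # similar to Sqoop mappers
        .option("lowerBound", 1)
        .option("upperBound", 1000000)
        .load()
    )

    # Land the extract in the data lake as Parquet for downstream Hive/Impala use.
    customers.write.mode("overwrite").parquet("/data/raw/customers")
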