Lead Data Engineer

COLLAB. Recruitment
Full-time
On-site
London, England, United Kingdom

Company Description

The client is a leader in modern money. Every time you use your debit or credit card to pay for something, whether online or face-to-face, there’s a good chance it happened because of them. Each year, their innovations, systems and technology enable billions of monetary transactions globally. Working with customers large and small, they help businesses take payments quickly, safely and reliably, allowing those businesses to grow and making everyday life more convenient in the process. As a leader in global FinTech and the largest London IPO since 2011, the client offers a great opportunity to join them in building for the next phase of their journey.

Job Description

The client puts data at the very heart of their business and is excited to be building a new enterprise data platform that provides powerful analytic capabilities, first internally and subsequently for their customers. This is therefore a key role within the data engineering team delivering that platform.

The role will involve loading data from multiple operational systems into the client’s vast data lake. Given the volume of data involved, much of the processing is done with HiveQL and Hadoop’s built-in data loading tools; a sketch of what such a load can look like follows below.
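To give a flavour of this kind of work, here is a minimal, purely illustrative HiveQL sketch of a daily load; the table, column and path names are hypothetical, not the client’s actual schema:

    -- Register a day's extract (already landed in HDFS by an upstream
    -- system) using Hive's built-in loader; this moves the files under
    -- the staging table's warehouse location
    LOAD DATA INPATH '/landing/transactions/2016-01-01'
    INTO TABLE stg_transactions;

    -- Promote the staged rows into the partitioned lake table
    INSERT INTO TABLE transactions PARTITION (txn_date = '2016-01-01')
    SELECT txn_id, merchant_id, amount, txn_ts
    FROM stg_transactions;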

While working within the client’s dynamic team, you will be responsible for building pipelines and loading data using the Hortonworks toolset. This also includes writing data migration scripts that take every data store in the cluster from one release to the next; a brief sketch follows below. This is ultimately a great opportunity to gain skills in the Hortonworks Data Platform and related tools in an innovative and passionate environment.
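In this context, a release-to-release migration script is typically a HiveQL file applied to each affected store. The following is a minimal sketch under assumed, hypothetical table and column names:

    -- Hypothetical migration from release N to release N+1:
    -- add a column the new release expects
    ALTER TABLE transactions
      ADD COLUMNS (channel STRING COMMENT 'online or face-to-face');

    -- Backfill an existing partition with a default value so that
    -- release N+1 code can rely on the column being populated
    -- (Hive stages the result before overwriting, so reading from
    -- the same table here is safe)
    INSERT OVERWRITE TABLE transactions PARTITION (txn_date = '2016-01-01')
    SELECT txn_id, merchant_id, amount, txn_ts, 'unknown'
    FROM transactions
    WHERE txn_date = '2016-01-01';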

Qualifications

The ideal candidate will have experience with the following:

  • Knowledge of standard SQL
  • Large-scale database systems such as Netezza, DB2, Sybase or Oracle
  • Data loading, either scripted or using ETL technologies
  • Big Data technologies such as Apache Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, HBase and other NoSQL databases
  • Data streaming technologies such as Kafka, Flume, Storm, Spark or Flink
  • Search technologies such as Elasticsearch or Solr
  • Developing efficient code in SQL, Linux shell, Java or Python

Please note that other relevant skills and experience will also be considered.