Data Engineer (Python)

Job description

We are looking for a pragmatic problem solver. You are driven by efficiency and robustness, and want to use your skills to transform the way data scientists achieve their results. Machine learning has caught your eye, but rather than crunching numbers yourself, you'd rather work on the larger, overarching tools and frameworks that Jungle uses. You'll smooth out the kinks, automate repetitive tasks, improve iteration speed, and help Jungle make its impact on a much larger scale!


Why do we need you?

  • We enjoy working on big problems with big impact. In our focus areas they often go hand in hand with large amounts of historical and streaming data. We need you to help us further develop our in-house pipeline that allows us to quickly iterate on these challenges.
  • We are building technology to remove as much manual work from our activities as possible. By contributing to our tools you will enable Jungle engineers to work more efficiently.
  • We need to be able to trust our model output without having humans inspect it. You'll help us improve the robustness of our output and its automatic verification.
  • On a day-to-day basis you will interact with our data scientists, understanding their needs and challenges and drafting awesome ideas for the next functionalities of our tools.
  • We are preparing to open source some of our tools, but they need rigorous testing and polishing before we can do that.


Why work with us?

  • Use your skills to have a meaningful impact in the world!
  • Work with state-of-the-art (TRUE state-of-the-art) technology
  • Flexible work schedule, holiday policy, and work location
  • Daily fuel to keep you going: breakfast, lunch, fruit, and snacks
  • Jungle takes care of you with our monthly massage programme and bi-weekly yoga classes
  • Modern work environment, tools and peripherals
  • Become part of a warm and skilled group of people, committed to each other's success
  • Regular team activities (incl. TGIF on the roof terrace)

Requirements

Key Skills

  • At least 3 years of prior working experience as a Python developer, or equivalent experience.
  • You have experience in developing data pipelines and ETL frameworks.
  • You have experience with the Python ML stack (e.g. pandas, Dask, scikit-learn, TensorFlow/PyTorch).
  • Understanding of Python's threading limitations and of multi-process architectures.
  • Strong unit testing and integration testing skills.
  • Proficient understanding of code versioning tools such as Git.
  • [bonus] Understanding of the pre-processing steps required to train models, such as scaling, standardization, dimensionality reduction, and imputation.


About you

  • You work meticulously and your output is robust and trustworthy.
  • You are passionate about the applications of ML and want to improve the frameworks for achieving results with ML.
  • You're pragmatic; you know when to dive deep and when a quick fix will do.
  • You're excited about smoothing out processes and automating repetitive tasks.
  • [bonus] You have contributed to open source tools in the past.