Track Overview

Data Engineering: Where the Rubber Meets the Road in Data Science

Data Science is a discipline of brilliant minds and cutting-edge research. However, science does not imply engineering. The Data Engineering: Where the Rubber Meets the Road in Data Science track is all about advancing the engineering side of the profession. The track discusses patterns and practices with core tooling like Jupyter Notebooks, big data cloud migrations, and lessons from Data Scientists who have been there.


From this track

SESSION + Live Q&A Data Science

Effective Data Pipelines: Data Mngmt from Chaos

Creating automated, efficient and accurate data pipelines out of the (often) noisy, disparate and busy data flows used by today's enterprises is a difficult task. Data science teams and engineering teams may be asked to work together to create a management platform (or install one) that helps...

Katharine Jarmul

Python engineer, Founder @kjamistan

SESSION + Live Q&A Data Science

Data Cleansing and Understanding Best Practices

Any data scientist who works with real data will tell you that the hardest part of any data science task is the data preparation. Everything from cleaning dirty data to understanding where your data is missing and how your data is shaped, the care and feeding of your data is a prime task for the...

Casey Stella

Committer and PMC member on the Apache Metron project

SESSION + Live Q&A Data Science

Reliable & Scalable Data Infra Eco-System At Uber

Uber's vision is to make transportation as reliable as running water everywhere, for everyone. Data is key for Uber's 24x7 global business operations and making data available for different use cases across the company in a reliable, scalable and performant way is often challenging. In this...

Sudhir Mallem

Staff Engineer @Uber

SESSION + Live Q&A Data Science

Building a Data Science Capability From Scratch

This talk will cover the challenges, both technical and cultural, of building a data science team and capability in a large, global company. It will discuss best practices, lessons learned, and rewards of leveraging data effectively in the next frontier of data science: commercial insurance.

Victor Hu

Head of Data Science @QBE

SESSION + Live Q&A Data Science

Data Engineering Open Space

SESSION + Live Q&A Data Science

Building Data Pipelines in Python

This talk discusses the process of building data pipelines, e.g. extraction, cleaning, integration, pre-processing of data, in general all the steps that are necessary to prepare your data for your data-driven product. In particular, the focus is on data plumbing and on the practice of going from...

Marco Bonzanini

Data Scientist & Co-Organiser of PyData London Meetup


Speakers from this track

Katharine Jarmul

Python engineer, Founder @kjamistan

Katharine Jarmul is a Python engineer and educator based in Berlin, Germany. She runs a data science consulting company, Kjamistan, and offers several private and public courses on data automation, cleaning and acquisition. She has worked on data extraction and analysis since 2008. She offers...


Casey Stella

Committer and PMC member on the Apache Metron project

I am a committer and PMC member on the Apache Metron project in the engineering team at Hortonworks. In the past, I've worked as an architect and senior engineer at a healthcare informatics startup spun out of the Cleveland Clinic, as a developer at Oracle and as a Research Geophysicist in the...


Sudhir Mallem

Staff Engineer @Uber

Sudhir Mallem is a Staff Engineer at Uber working in the data infrastructure team. He was previously a Staff Engineer and an early member of the data infra team at LinkedIn, where he built and maintained massively scalable enterprise and analytical warehouse that powered business operations,...


Victor Hu

Head of Data Science @QBE

Victor Hu is the Head of Data Science at QBE and comes from a background of leveraging data intelligently to benefit and transform industries. Previously he was the Chief Data Officer at Tictrac and built the data science team at Next Big Sound, a music analytics startup acquired by Pandora in...


Marco Bonzanini

Data Scientist & Co-Organiser of PyData London Meetup

I'm a Data Science consultant based in London, UK. Author of "Mastering Social Media Mining with Python", published by Packt Publishing. Co-organiser of the PyData London meetup. Backed by a PhD in Information Retrieval, I specialise in search applications and text analytics applications, and...


Track Host

Andreas Gertsch Grover

Head of Data Science @JinnApp

Andreas is the Head of Data Science at Jinn (an on-demand delivery service based in the London area). Prior to joining Jinn in July 2016, he was a senior Data Scientist at Skipjaq and a consultant at the Advisory House.

