EuroPython 2016

Analyzing Data with Python & Docker

Speaker: Andreas Dewes
Sub Community: PyData

Docker is a powerful tool for packaging software and services in containers and running them on virtual infrastructure. Python is a powerful language for data analysis. What happens if we combine the two? We get a versatile and robust system for analyzing data at both small and large scale!

I will show how we can use Python and Docker to build repeatable, robust data analysis workflows that can be used in many different contexts. I will explain the core ideas behind Docker and show how they can be useful in data analysis. I will then discuss an open-source Python library (Rouster), which uses the Python Docker API to analyze data in containers, and show several interesting use cases (possibly even a live demo).
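To make the idea of analyzing data in containers concrete, here is a minimal sketch of a single workflow step. It is not Rouster's API, which this abstract does not describe; it only assembles a standard `docker run` command line. The function name, the mount points (/data, /output, /code) and the image tag in the usage note are assumptions chosen for illustration.

```python
from pathlib import Path


def build_analysis_command(image, script, data_dir, output_dir):
    """Return the argv list that runs one analysis script in a container.

    Hypothetical helper: the parameter names and the container paths
    /data, /output and /code are illustrative assumptions.
    """
    script = Path(script).resolve()
    data_dir = Path(data_dir).resolve()
    output_dir = Path(output_dir).resolve()
    return [
        "docker", "run",
        "--rm",                             # remove the container once the step ends
        "--network", "none",                # no network: the step sees only its mounts
        "-v", f"{data_dir}:/data:ro",       # input data, mounted read-only
        "-v", f"{output_dir}:/output",      # results are written here
        "-v", f"{script.parent}:/code:ro",  # analysis code, mounted read-only
        "-w", "/code",                      # run from the code directory
        image,
        "python", script.name,
    ]
```

On a machine with Docker installed, the returned list could be passed to `subprocess.run(cmd, check=True)`, for example with `cmd = build_analysis_command("python:3.5", "analysis/step.py", "data", "results")`. Because the image tag fixes the software environment and the input data is mounted read-only, running the same step again on the same data starts from the same conditions.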

Outline:

  1. Why data analysis can be frustrating: Managing software, dependencies, data versions, workflows
  2. How Docker can help us to make data analysis easier & more reproducible
  3. Introducing Rouster: Building data analysis workflows with Python and Docker
  4. Examples of data analysis workflows: Business Intelligence, Scientific Data Analysis, Interactive Exploration of Data
  5. Future Directions & Outlook

Thursday 21 July at 10:30

Comments

  1. I'm really interested in attending this talk. I have already bought a EuroPython ticket. I'm planning the trip and hotel reservation, so I would like to know the talk date and time (if possible).
    Thank you. Marco.
    Marco Basilico
