The first whole-day meetup of the Munich Datageeks

08 October 2016

Great Data Science talks, food + drinks and a lot of time for networking!

Book your seat

Speakers

Ralf Klüber

Managing Director at P3 Group

P3 Insight GmbH

Abstract: Legally fine but creepy - Where responsibility of the data scientist comes into play

In the recent past, community norms helped guide a clear sense of ethical boundaries with respect to privacy. One does not peek into the window of a house even if it is left open. Data collection via smartphone apps allows businesses to open a virtual window into the private sphere of users. The general principles do not change when data scientists look into the private space of others through a virtual window instead of a physical one. Even when they are fully covered legally, certain analytics can already be considered creepy. Perceptions of creepiness are highly subjective and difficult to generalize. Ralf will present some examples and let the audience judge how creepy they are. The discussion will lead to the challenge of how to draw a personal boundary when actively working with personally identifiable, pseudonymous and anonymous data.

Bio:

Ralf Klüber is Managing Director and Co-Founder at P3 insight GmbH. He focuses on benchmarking mobile networks based on data collected via crowdsourced smartphone applications. Ralf holds a Dipl.-Ing. in Electrical Engineering from Technische Hochschule Darmstadt. He has been active as a professional in the mobile network space for seventeen years and has held major roles in radio network planning, strategy and consulting at tier-one mobile operators and consulting firms. He is continuously swimming upstream in the river of technology, currently dealing with data analytics, big data, visualization, managing a start-up and a family of four.

Felix Friedmann

Deep Learning Expert

Audi Electronics Venture GmbH

Abstract: Semantic Segmentation: How to teach machines to understand images

With the rise of Deep Learning, paradigms have changed in several well-established fields of computer science. Algorithms that had been developed over decades were suddenly outperformed by deep neural networks, which often reached levels of performance that had seemed unimaginable. Computer Vision was one of the first areas to be conquered by Deep Learning. AlexNet, a deep Convolutional Neural Network for image classification, achieved outstanding results in the 2012 ImageNet competition, ignited the Deep Learning hype and became a baseline for the deep architectures that followed.

While image classification finds a plethora of applications in, e.g., web services, many practical applications depend not only on knowing whether a given pattern is present in an image, but also on knowing the pattern’s exact location. This challenge is addressed by object detection, the detection of objects’ bounding boxes, and by semantic segmentation, the classification of every pixel of a given image. One major trait of semantic segmentation is that pixel-accurate class differentiation allows for precise analysis of an image’s composition.
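The three output types named above can be contrasted on a toy example. The sketch below is illustrative only: the hand-made per-pixel scores stand in for the output of a segmentation network, and the class names and helper functions (`segment`, `bounding_box`, `contains`) are hypothetical and not taken from the talk. The bounding box is derived from the mask for simplicity, whereas a real object detector predicts boxes directly and separates individual instances.

```python
# Toy illustration: hand-made per-pixel class scores stand in for the
# output of a segmentation network. All names here are hypothetical.
CLASSES = ["road", "car"]

# scores[row][col] holds one score per class for that pixel.
scores = [
    [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.9, 0.1)],
    [(0.6, 0.4), (0.2, 0.8), (0.1, 0.9), (0.8, 0.2)],
    [(0.7, 0.3), (0.3, 0.7), (0.2, 0.8), (0.9, 0.1)],
]


def segment(scores):
    """Semantic segmentation: pick the best-scoring class for every pixel."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in scores
    ]


def bounding_box(mask, class_id):
    """Detection-style output: tightest (top, left, bottom, right) box around a class."""
    hits = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == class_id
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


def contains(mask, class_id):
    """Classification-style output: is the class present anywhere in the image?"""
    return any(class_id in row for row in mask)


mask = segment(scores)
car = CLASSES.index("car")
print(mask)                     # per-pixel class ids
print(bounding_box(mask, car))  # coarse location
print(contains(mask, car))      # presence only
```

The mask carries the most information: both the box and the presence flag can be computed from it, but not the other way round.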

This talk aims to highlight characteristics and promising applications of semantic segmentation. For this purpose, the relation between semantic segmentation and other current image recognition methods is investigated, state-of-the-art methods for semantic segmentation are discussed and suitable hardware and frameworks for training and deployment of semantic segmentation networks are addressed.

Bio:

Felix Friedmann holds a diploma in Electrical Engineering from Technical University of Munich (TUM). Machine Learning was a strong focus of his studies at TUM. Alongside his studies, he worked for eight years as a part-time software engineer and contributed to software projects in various industries. After graduating, he joined Audi Electronics Venture (AEV) GmbH in 2014, where he currently works on the application of Deep Learning to autonomous driving. Felix is passionate about exploring new neural network architectures, deployment of AI on embedded platforms, and the manifold opportunities Deep Learning offers for automotive applications.

Daniel Weimer

Data Scientist

VW DataLab

Abstract: Deep Convolutional Neural Networks in industrial applications

Deep Learning is a new (and at the same time old) paradigm in machine learning which makes it possible to extract features directly from huge amounts of raw data with a minimum of human interaction. This talk gives an introduction to deep learning in general and focuses on an important application of deep convolutional neural networks (CNN) in industrial settings: optical quality control.

Bio:
Daniel studied electronic engineering with a focus on automation and robotics. After his master's thesis in the field of autonomous driving and parallel computing, he joined the Intelligent Production Systems group at the University of Bremen, where he worked in the field of smart production and logistics systems and headed the Computer Vision Lab. His PhD research in mathematics and computer science was awarded a scholarship at Technion – Israel Institute of Technology in Haifa, Israel, and compared deep learning and traditional machine learning in different applications. In his current position as a Project Manager and Data Scientist at Volkswagen Data:Lab, he is responsible for Big Data, AI and digitalization projects along the whole automotive value chain.

Philipp Schapotschnikow

Freelancer

Brains for hire

Abstract: Automatic plant disease recognition using image data and machine learning

Modern image analysis together with state-of-the-art machine learning techniques can be used to extract relevant information from images. In a pilot project, we have developed a system which successfully diagnoses early symptoms of plant diseases. This information is used to prevent outbreaks of epidemics in a greenhouse, thus increasing horticultural production efficiency.

Bio:

Philipp studied Mathematics at TUM before doing a PhD in Computational Materials Science. Afterwards, he worked in the field of combustion modelling. In 2012 he started working on Computer Vision, Machine Learning and Artificial Intelligence applications, and has since completed a number of successful projects in manufacturing, shipbuilding, medical imaging and other areas.

Andreas Groll

Associate Professor

LMU

Abstract: Modeling Football Results Using Match-specific Covariates

A model for results of football matches is proposed that is able to take into account match-specific covariates as, for example, the total distance a team runs in the specific match. The model extends the Bradley-Terry model in many different ways. In addition to the inclusion of covariates, it considers ordered response values and (possibly team-specific) home effects. Penalty terms are used to reduce the complexity of the model and to find clusters of teams with equal covariate effects.
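To make the ingredients named in the abstract concrete, the following is a minimal sketch of a Bradley-Terry-style model with an ordered response (home win, draw, away win), a home effect and one match-specific covariate. It is not the speaker's formulation: the cumulative-logit form, the single symmetric `threshold`, and all parameter names are assumptions chosen for illustration, and the penalty terms mentioned in the abstract are left out.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def match_probabilities(ability_home, ability_away, home_effect=0.0,
                        covariate_diff=0.0, covariate_weight=0.0,
                        threshold=0.5):
    """Return (P(home win), P(draw), P(away win)).

    Assumed cumulative-logit form: the linear predictor is the ability
    difference plus a home effect plus a weighted covariate difference
    (e.g. distance run by the home team minus distance run by the away
    team). A positive threshold reserves probability mass for draws.
    """
    eta = (home_effect + ability_home - ability_away
           + covariate_weight * covariate_diff)
    p_home_win = logistic(eta - threshold)
    p_home_not_lose = logistic(eta + threshold)
    return (p_home_win,
            p_home_not_lose - p_home_win,
            1.0 - p_home_not_lose)


# Two equally strong teams, with and without a home effect.
print(match_probabilities(1.0, 1.0))
print(match_probabilities(1.0, 1.0, home_effect=0.4))
```

With `threshold=0` and no home effect or covariates, the draw probability vanishes and the expression reduces to the classical Bradley-Terry win probability, the logistic function of the ability difference.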

Bio:

From 2001 to 2007 I studied business mathematics at the LMU Munich. Next, from April 2008 until November 2011, I worked as a Ph.D. student under the supervision of Prof. Tutz at the Department of Statistics of the LMU. From March 2012 until February 2016 I worked as a postdoc at Prof. Biagini's chair at the Department of Mathematics of the LMU, and since 1 March 2016 I have been working as a postdoc at the chair of Prof. Kneib.

In the meantime, from October to December 2014 I worked as a visiting professor for "Applied Statistics" (W2) at the TU Clausthal, and from April to September 2016 as a visiting professor for "Seminar for Applied Stochastic" (W3) at the LMU. From January to March 2015 I visited Stanford University for a research stay with the project partners Prof. Hastie and Prof. Tibshirani.

 

Josef Adersberger

CTO & co-founder QAware

QAware GmbH

Abstract: Time Series Processing with Spark
A lot of data is best represented as time series: operational data, financial data, and even in general-purpose DWHs the dominant dimension is time. The area of time series databases is growing rapidly, but the support in Spark for processing and analyzing time series data is still in its early stages. We present Chronix Spark, which provides a mature TimeSeriesRDD implementation for fast retrieval and complex analysis of time series data. Chronix Spark is open source software and battle-proven at a big German car manufacturer and a German telco. We show how we've used Chronix Spark in a real-life project and provide some benchmarks showing how it has outperformed common time series databases like OpenTSDB, KairosDB and InfluxDB. We lift the curtain and deep-dive into the internals to show how we've achieved this.

Bio:
Josef Adersberger has been a software engineering fanatic for over 10 years. He studied computer science in Rosenheim and Munich and holds a doctoral degree in software engineering. He is co-founder and CTO of QAware, a German software development company, and is a lecturer at several German universities. His main area of interest is cloud computing.

Felix Klein

Consultant Data Science

Alexander Thamm GmbH

Abstract: Connected Flipper

To make the Internet of Things and Big Data more tangible, we took a popular old analog product, a Pinball (Flipper) machine from 1987, equipped it with sensors and a Raspberry Pi and connected it to the internet. By processing the sensor data, it records individual player profiles, compares them with those of other players and, using machine learning algorithms, separates different player types. To ease the recording of the high score, a camera reads the score at the end of the match, which is then saved to a database after confirmation.

Connected to a display, it can also show a live visualization of the sensor data and, of course, a high score list. Since it is configured as a webserver, the database is accessible from any other computer, so the data can be analyzed independently of the Flipper's current location and of the limited compute power of the Pi. And as you have to keep the players up to date, the Flipper can automatically send e-mails with information about current high scores or your own profile stats.

Of course this project wasn’t just for fun. The procedure pretty much covers all parts of a Data Science project and was used to train the skills of our trainees.

Bio:
For my diploma thesis in physics I joined the Medical Physics research group of Oliver Jäkel at the German Cancer Research Center (DKFZ, Heidelberg) to characterize a fiber-optic-based system for dosimetry. To analyze the large amounts of data produced during the measurements, I started with scripting and analysis automation. For my PhD I moved to the European Molecular Biology Laboratory (EMBL, Heidelberg) and joined the bioinformatics group of Wolfgang Huber. There I analyzed many biological data sets and developed bioinformatics tools which are published on Bioconductor (an R framework for bioinformatics). With my interest in data analysis and big data, I started as a consultant for Data Science at the Alexander Thamm GmbH in Munich. In the past one and a half years with the company I have had the chance to work on many different data science projects in various industries. One of the projects was the connected flipper, which I will present.

 

Karim Jedda

Data Scientist

ProSiebenSat.1 Media SE

Will be announced soon...

Daniel Peterson

Data Scientist

TrustYou GmbH

Title: Bayesian Semantics: Verb Sense Induction

Abstract: Bayesian models, such as topic modeling (LDA), have had an enormous impact on natural language processing. Although deep neural architectures have improved performance on many tasks, there are still many problems that lend themselves best to a Bayesian treatment. In this talk, we will motivate and develop a Bayesian model for verb sense induction, based on the syntactic structures of those verbs. The proposed model shares many elements with topic modeling, but within a simpler overall system. This should provide a friendly introduction to core concepts of Bayesian analytics, and give experienced scientists insight into adapting these models to new domains.

Bio: Daniel Peterson is a Senior Data Scientist at TrustYou, and a PhD candidate at the University of Colorado. In his day job, he tries to present accurate, comprehensive representations of millions of hotel reviews to data partners like Google. In his PhD work, he tries to extend VerbNet, a semantic resource built on sound theoretical linguistics.

Giuseppe Casalicchio

PhD student

LMU

Abstract: Introducing an R package to interface the OpenML platform 
OpenML is an online machine learning platform where researchers can automatically log and share data, code, and experiments, and organize them online to work and collaborate more effectively. We present an R package to interface the OpenML platform and illustrate its usage both as a stand-alone package and in combination with the mlr machine learning package. We show how the OpenML package allows R users to easily search, download and upload machine learning datasets. Users can easily log their auto ML experiment results online, have them evaluated on the server, share them with others and download results from other researchers to build on them. Beyond ensuring reproducibility of results, it automates much of the drudge work, speeds up research, facilitates collaboration and increases users' visibility online. Currently, OpenML has 1,000+ registered users, 2,000+ unique monthly visitors, 2,000+ datasets, and 500,000+ experiments. The OpenML server currently supports client interfaces for Java, Python, .NET and R as well as specific interfaces for the WEKA, MOA, RapidMiner, scikit-learn and mlr toolboxes for machine learning.

Bio:
Giuseppe Casalicchio is a PhD student in computational statistics at the Ludwig Maximilian University of Munich. He earned Bachelor's and Master's degrees in statistics in 2011 and 2013, respectively, and worked as a statistical consultant at the statistical consulting unit 'StaBLab' from 2012 to 2015. Since 2014 he has been giving R training courses for scientists and business clients at the 'Department of Statistics - Munich R Courses'. His research interests focus on optimizing machine learning algorithms, visualizing their predictive performance and gaining insights from predictive models.


Venue

QAware GmbH, Aschauer Straße 32 81549 München

www.qaware.de/

Sponsors

We are still looking for sponsors; the fee is only 500€!
To become a sponsor, get in touch with us


Agenda

  1. 08:30 AM - 09:00 AM : Welcome and come together

    Arrive and come together. Enjoy some Brezen and start some networking!

  2. 09:00 AM - 09:15 AM : Welcome Talk

    The board of the Munich Datageeks e.V. welcomes you!

  3. 09:15 AM - 10:15 AM : Keynote

    Karim Jedda

  4. 10:30 AM - 12:30 PM : 3 Talks
    • Josef Adersberger
      Time Series Processing with Spark
    • Andreas Groll
      Modeling Football Results Using Match-specific Covariates
    • Ralf Klüber
      Legally fine but creepy – Where responsibility of the data scientist comes into play
  5. 12:30 PM - 01:30 PM : Lunch

    Great Lunch provided by catering

  6. 01:30 PM - 03:30 PM : Deep Learning Session
    • Daniel Weimer
      Deep Convolutional Neural Networks in industrial applications
    • Felix Friedmann
      Semantic Segmentation: How to teach machines to understand images
  7. 03:30 PM - 04:00 PM : Coffee Break with Pub Quiz

    Enjoy some coffee and prove your skills at our Pub Quiz

  8. 04:00 PM - 06:00 PM : 3 Talks
    • Philipp Schapotschnikow
      Automatic plant disease recognition using image data and machine learning
    • Felix Klein
      Connected Flipper
    • Daniel Peterson
      Bayesian Semantics: Verb Sense Induction
  9. 06:00 PM - 06:30 PM : Live Coding

    Giuseppe Casalicchio
    Introducing an R package to interface the OpenML platform

  10. 06:30 PM - 07:30 PM : Dinner and Awards

    Some more food and beer. Awards for the Pub Quiz.

  11. 07:30 PM - 12:00 AM : Networking Party

    Party hard with some music, meet interesting people and have some more beer!

Register

As this is a meetup, you can get your tickets via the official meetup website.

15€

limited tickets available

SOLD OUT!