A lot of data is best represented as time series: operational data, financial data, and even in general-purpose data warehouses (DWHs) the dominant dimension is time. The field of time series databases is growing rapidly, but Spark's support for processing and analyzing time series data is still in its early stages. We present Chronix Spark, which provides a mature TimeSeriesRDD implementation for fast retrieval and complex analysis of time series data. Chronix Spark is open source software and has been battle-tested at a big German car manufacturer and a German telco. We show how we've used Chronix Spark in a real-life project and provide benchmarks showing how it has outperformed common time series databases such as OpenTSDB, KairosDB and InfluxDB. We lift the curtain and take a deep dive into the internals to show how we achieved this.
Josef Adersberger has been a software engineering fanatic for over 10 years. He studied computer science in Rosenheim and Munich and holds a doctoral degree in software engineering. He is co-founder and CTO of QAware, a German software development company, and is a lecturer at several German universities. His main area of interest is cloud computing.