Timelapse data exploration of NYC taxi rides

Back in 2014 the city of New York published a dataset of yellow cab rides covering a full year. I remember struggling quite a bit with the sheer volume of the data, trying out various alternatives for reading in the full dataset. A few years later SAP introduced an “Express edition” of its HANA in-memory database, which allowed you to run a 32 GB database on your own hardware. That was enough to load a full year’s worth of data and analyze it using a standard SQL approach.

Explainable Forecasting Using HANA and APL

This is part 2 of a two-part blog series on large-scale and explainable forecasting using APL. In part 1 I outlined a way to use the APL library for in-database training of a regression model in HANA, to be used together with an external Node.js inference script. In this part I will dive deeper into the built-in functionality for retrieving insights into a trained model, which is called the ‘model debrief’.

Large-scale Forecasting Using HANA, APL and Node.js

Last year I became involved in a project for a retailer based in the Netherlands that had finished construction of a new distribution center. This innovative DC is fully mechanized and operated with a minimal number of personnel, unlike conventional distribution centers, where sudden large order spikes are fulfilled by having more staff pick these orders in parallel. This blog post describes my work developing a large-scale machine learning model to forecast the goods movements between various parts of the supply chain.