Title: Cardinality Estimation in the Learned Systems Era
Speaker: Andreas Kipf (MIT)
Time: 3:00 pm, November 07 (Monday), 2022
Location: N/A
Online link: provided upon request or see the seminar email.


Abstract: In this talk, we introduce a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization. We propose a new neural network model (MSCN) that captures correlations between columns. Trained on past queries, the model predicts the cardinalities of future queries and significantly improves the quality of cardinality estimates. We also briefly discuss follow-up work on a new loss function (Flow-Loss) that improves MSCN’s impact on the resulting query plans by focusing the model's capacity on the estimates that matter.

Bio: Andreas Kipf is an applied scientist at AWS Redshift. Previously, he was a postdoctoral researcher in the MIT Data Systems Group, where he worked with Prof. Tim Kraska. His interests lie in using machine learning to improve systems, with a focus on index structures, storage layouts, and query optimization. Andreas earned his PhD at TUM, where he worked with Prof. Alfons Kemper and Prof. Thomas Neumann. During his PhD, he interned with Google in Mountain View and Zurich to work on query-driven materialization and lightweight secondary indexing. Andreas won the 2016 SIGMOD Best Demonstration Award and the 2017 SIGMOD Programming Contest. Outside of work, Andreas is an avid triathlete and loves bikepacking.