Thu Jan 23 2025

Data & Drinks: Building Next-Gen Data Systems with Apache DataFusion

Raamstraat 7, 1016 XL Amsterdam
Thu Jan 23 2025, 16:30

The first Data & Drinks event of 2025 will offer an exciting exploration of Apache DataFusion and its potential to shape the future of data systems! DataFusion features a full query planner, a columnar, streaming, multi-threaded, vectorized execution engine, and partitioned data sources.

This edition takes place on Thursday the 23rd of January at Xomnia's HQ in the heart of Amsterdam. It features three talks from Apache DataFusion contributors that showcase the inner workings of the project and its real-world applications, and explore the possibilities Apache DataFusion unlocks for data-centric systems.

This free event includes dinner, drinks and a lot of networking opportunities with data professionals from Amsterdam and beyond.

Abstracts

Talk 1: Intro to DataFusion: Technology, Community, and Not Quite Enough Time by Andrew Lamb
Andrew delves into the architecture, modularity, and tradeoffs of Apache DataFusion, a high-performance Rust-based query engine, and how it's employed in building advanced data systems.

Talk 2: Building A Unified Compute Engine with Apache DataFusion by Mehmet Ozan Kabak
This talk explores how DataFusion’s modular architecture enables the vision of “unified” compute engines. Ozan will discuss how its extensibility addresses core engine-level limitations and empowers streamlined solutions for data and AI workloads, while also considering the challenges that remain.

Talk 3: Distributed Joins with DataFusion at Coralogix by Daniël Heres
A technical deep dive into optimizing performance for large-scale group aggregations with Apache DataFusion.

Biographies of the speakers:

  • Andrew Lamb, InfluxData, Staff Engineer, Apache DataFusion PMC Chair: After spending many years as a C/C++ systems programmer (databases and compilers), and a stint working at machine learning startups (as one does), Andrew now works at InfluxData with a talented team of engineers on InfluxDB IOx, a new engine for time series data.
  • Mehmet Ozan Kabak, CEO & Co-founder of Synnada, Apache DataFusion PMC: After diving deep into distributed systems and big data over a career spanning various startups and Meta, Ozan now leads Synnada as CEO, bringing his Stanford Ph.D. and extensive machine learning expertise to building next-generation data infrastructure. His work has consistently revolved around tackling large-scale distributed systems challenges and advancing the field of applied machine learning.
  • Daniël Heres, Apache DataFusion & Arrow PMC, Senior Software Engineer at Coralogix: Daniël is a member of the Apache Arrow and DataFusion PMCs and a Software Engineer working on the query engine at Coralogix. He was previously a Data / ML Engineer at Godatadriven and at bol.com.

Agenda:

17:30-18:30: Walk-ins & dinner
18:30-18:35: Introduction to Xomnia
18:35-18:55: Intro to DataFusion: Technology, Community, and Not Quite Enough Time by Andrew Lamb
18:55-19:00: Break
19:00-19:25: Building A Unified Compute Engine with Apache DataFusion by Mehmet Ozan Kabak
19:25-19:35: Break
19:35-20:00: Distributed Joins with DataFusion at Coralogix by Daniël Heres
20:00-21:00: Borrel (drinks)