This year’s Data + AI Summit is packed with more than 250 community sessions and keynotes presented by influential leaders in data and AI such as Peter Norvig, Daphne Koller, Andrew Ng and Hilary Mason, as well as some of the founders of Databricks, including Ali Ghodsi, Reynold Xin and Matei Zaharia. Sessions will span all key and emerging topics in data engineering and architecture, machine learning and AI. Here are just a few of the sessions I’m looking forward to:
- Data Boards — A Collaborative and Interactive Space for Data Science: I’ve been following this MIT project since its inception. It combines a sophisticated backend with a highly engaging user interface.
- The Modern Metadata Platform — What, Why and How: This is a must-attend session for data engineers and architects. Imagine how much more value data teams can deliver if they collect and unlock an even richer set of metadata.
- Cleanlab — An Open-Source Tool to Find and Fix Errors in ML Data Sets: One of my favorite new open source projects, Cleanlab automatically finds and fixes errors in your machine learning data sets. By reducing the manual work needed to resolve data issues, it makes it easier to train ML models on partially mislabeled data sets.
- Adversarial AI — The Nature of the Threat, Impacts, and Mitigation Strategies: This session reviews current mitigation methods and approaches, and offers suggestions for dealing with adversarial AI exploits.
- Security Best Practices for Lakehouse: Organizations adopting the lakehouse architecture will need to take advantage of new security features. This important presentation outlines how they can improve their overall security posture.
Ben Lorica
Technical Advisor, Databricks