This edition has 428 words, which will take you about 2 minutes to read.
“Preferences are optional and subject to constraints, whereas constraints are neither optional nor subject to preferences.” – Marko Papic
Data Exchange podcast
- Making Boats Fly with Reinforcement Learning and Ray Nic Hohn (Chief Data Scientist, McKinsey/QuantumBlack Australia) describes how they used Ray RLlib to help design hydrofoils for the winning boat at the most recent America’s Cup sailing competition. We also discussed potential enterprise applications of reinforcement learning.
- How Companies Are Investing in AI Risk and Liability Minimization I get an update from Andrew Burt, Managing Partner of BNH.ai, the first law firm focused on AI compliance, risk mitigation, and related topics.
Upcoming Free Virtual Event
As the external co-chair for the Ray Summit, I’m excited about the outstanding program we’ve put together for developers, machine learning practitioners, data scientists, DevOps professionals, and architects. See you online in a few weeks!
Data & Machine Learning tools and infrastructure
- Model Monitoring Enables Robust Machine Learning Applications Paco Nathan and I detail key challenges in monitoring ML models and outline key components of a model monitoring platform. This is a very active area with many startups rolling out new offerings. We believe that companies will gravitate towards holistic MLOps platforms that include model monitoring, as opposed to stitching together disparate components.
- Introducing Delta Live Tables Through a combination of declarative pipeline development, improved data reliability and cloud-scale production operations, DLT makes the ETL lifecycle easier. Data engineers will be able to leverage existing data pipelines by building production ETL pipelines while writing only SQL queries.
- Greykite: LinkedIn’s new open source library for time series forecasting I’ve been experimenting with Greykite (paper, code) and I love its speed and flexibility. This is a relatively new release and the documentation can be somewhat overwhelming, but if you invest time in learning it, I believe you’ll end up using this library in production. At the very least, you should add it to your toolbox alongside more mature options like Prophet.
- A gentle introduction to knowledge graphs, with sample use cases from search, data integration, and AI.
- immudb → Blockchain Concepts ∪ SQL A new TimeTravel feature allows you to run queries across your data’s change history.
- Ray Clusters provide users with a serverless experience Ray Clusters can automatically scale up and down based on an application’s resource demands while maximizing utilization and minimizing costs.
Funding Updates
- London (UK) startup, Faculty, raises £30M Series A
- Open source data integration startup Airbyte raises $26M Series A

Recommendations
- Portability: The Forgotten Right of GDPR
- A taxonomy of biases that occur in AI pipelines and applications
- Speech recognition systems that require no transcribed data
- Security and Open Source Software Distinguished engineers at Google describe a framework to strengthen and streamline the security of open source software.
- 2020 ACM Software System Award for Berkeley DB Congratulations to Margo Seltzer, Mike Olson, and Keith Bostic!
- “There is no adequate rational market explanation for this performance” Is RenTec’s Medallion fund too good to be true? Cornell Capital Group combs through and unpacks performance metrics of the famed Medallion fund.
Closing Short:
The academic circle of life 😂 pic.twitter.com/9hm6EtNS03
— César A. Hidalgo (@cesifoti) May 20, 2021
If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.