This edition has 510 words, which will take you about 3 minutes to read.
“The thing about machine learning scientists is that they never admit defeat because all of their problems can be solved with more data.” – William Tunstall-Pedoe
Data Exchange podcast
- Why You Should Optimize Your Deep Learning Inference Platform. As companies deploy deep learning to critical products and services, the number of predictions that models have to render can easily reach millions per day (even hundreds of trillions, in the case of Facebook). I speak with Yonatan Geifman, CEO and co-founder of Deci, as well as with Ran El-Yaniv, Chief Scientist and co-founder of Deci and Professor of Computer Science at Technion. We take a deep dive into tools for systematically optimizing inference platforms.
- The Future of Machine Learning Lies in Better Abstractions. Travis Addair previously led the team at Uber that was responsible for building Uber’s deep learning infrastructure. Travis is deeply involved with two popular open source projects related to deep learning: he is a maintainer of Horovod, a distributed deep learning training framework, and a co-maintainer of Ludwig, a toolbox that allows users to train and test deep learning models without writing code.

Data & Machine Learning tools and infrastructure
- Applications of Reinforcement Learning: recent examples from Fortune 1000 companies. I look at RL usage at some of the largest companies in the US, and as an added bonus we put together an accompanying summary poster/infographic.
- Knowledge Graphs at JPMorgan Chase. A first look into a novel end-to-end neural entity linking model.
- Introducing the Ray Provider for Apache Airflow. Turn a Python script into a reproducible pipeline with the distributed computing platform Ray and Airflow orchestration.
- Prefect. Speaking of workflow tools, I’ve been coming across fans of this impressive open source workflow management system.
- The evolution of the lakehouse. “We believe that the data lakehouse architecture presents an opportunity comparable to the one we saw during early years of the data warehouse market.” A new post co-written by Bill Inmon, widely regarded as the father of the data warehouse.

Recommendations
- My picks for the upcoming Data+AI Summit. This FREE Databricks event features 200+ online presentations and is next week!
- The Missing Semester of your CS Education. An MIT course that helps you master tools including the command-line, text editors, version control systems, and much more.
- Why companies should consider establishing an Internal Review Board for AI
- R for applied epidemiology and public health: a free e-book
- Short essay on Neural Algorithmic Reasoning. A new paper from DeepMind explains how neural networks can serve as the basis for learning both novel and existing algorithms. The induction of algorithms from data will have profound implications in computer science, and this essay is a good introduction to research into combining deep learning and algorithms.
- How to visually explore any CSV file as a knowledge graph
Closing Short → The return of in-person events:
❛ Convenience and time savings were key factors for using video for events in our survey, but respondents in every country were adamant on their preference that events like concerts and religious services be in-person going forward. Virtual options were welcomed for those who needed a distraction or when in-person was not available.
Recent @Zoom survey: "We asked 7,689 people across 10 countries how they used video during the pandemic and what they wanted the post-pandemic world to look like" #virtualevent #virtualevents #eventplanner #livestream #conference pic.twitter.com/rO8XPDUwqo
— Ben Lorica 罗瑞卡 (@bigdata) May 14, 2021
If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.