“Time matters most when decisions are irreversible.” – Peter Bernstein
Data Exchange podcast
- Changes to the data science role and to data science tools: Sean Taylor is a Data Science Manager at Lyft, and was previously a research scientist and manager at Facebook, where he was instrumental in the creation and release of Prophet, a very popular open source library for time-series forecasting.
- What’s new in data engineering: Jenn Webb hosts a mid-year panel with Jesse Anderson and me.

Data & Machine Learning tools and infrastructure
- An Enterprise Software Roadmap for Sky Computing: Assaf Araki and I explain how the enterprise software market will evolve alongside a more commoditized version of cloud computing.
- Revisiting the Data Mesh
- Facebook’s Blender Bot 2.0: A new, open source chatbot that builds long-term memory and adds to its knowledge by searching the internet. Facebook has a related open source project, ParlAI, a platform for dialog research implemented in Python.
- gorse, an open source recommendation system that incorporates AutoML and horizontal scaling.
- Using AntiPatterns to avoid MLOps Mistakes: A taxonomy of recurring anti-patterns (defective practices and methodologies) that surfaced while deploying machine learning at BNY Mellon.

Recommendations
- 2021 Stack Overflow Developer Survey Results
- The Robotics Startup That Got Away (From Amazon)
- The analytical application stack: In the very early days of “data science” here in the SF Bay Area, the first “data scientists” in places like LinkedIn, Twitter, etc. were very much focused on building data products. In recent years the emphasis has been on internal data tooling and internal analytics (rightfully so). This is a welcome overview of opportunities and gaps in the suite of tools for building data applications.
- Growing open-source: from Torch to PyTorch. Soumith Chintala explains how PyTorch’s focus on usability allowed the project to grow its user base very quickly.
- How to retract an open dataset: Open datasets are important for the advancement of machine learning research. In this important survey and study, researchers from Princeton provide stewardship guidelines that go beyond data creation and span the full lifecycle of a dataset.

Closing Short: #Mesmerizing
greatest volley in the history of the world pic.twitter.com/TedE0vs03M
— YS (@NYinLA2121) August 2, 2021