Machine Learning Trends You Need to Know
In a new post with Assaf Araki of Intel Capital, we collect resources and insights to help you navigate the AI landscape. We believe machine learning is a platform play and companies will use at most two platforms to manage the entire pipeline: one platform to manage the exploration phase, and another platform to manage deployment and operations.

Data Exchange podcast
- Dataflow Automation: Jeremiah Lowin is co-creator of Prefect, one of the most popular frameworks for data and workflow orchestration. We discuss the major changes in Prefect 2.0, the team's shift towards treating “code as workflows”, and the data engineering challenges machine learning teams currently face.
- Machine Learning Model Observability: Superwise co-founder and CEO Oren Razon explains what you need to deploy, maintain and improve machine learning models in production.

Free Report: State of Workflow Orchestration
In the past decade, several notable open source and SaaS orchestration systems have been released and a related group of startups has raised over $450 million in funding. Our survey attracted nearly 600 respondents, and this report examines the tools respondents use, the features they value, and the challenges they face.
Free Report: Identity Management Survey
The proliferation of software services and platforms comes at a time when security threats and data breaches continue to grow. To understand current trends and issues in identity security and management, we conducted an industry survey focused on challenges, key features, use cases, and emerging technologies.

Introducing fastdup
A decade after deep learning systems first topped key computer vision (CV) benchmarks, computer vision applications can be found across all sectors. Many novel use cases are emerging, including autonomous vehicles, 3D reconstruction of homes from images, and robots that perform many different tasks.
But while computer vision models have become easier to build and tune, progress in data infrastructure for CV applications has lagged behind. As a consequence, computer vision teams struggle to incorporate basic data management features for data quality (deduplication, anomaly detection), search, and analytics.
My friends Danny Bickson and Amir Alush have taken a first step in tackling these challenges by writing and sharing an amazing free software tool called fastdup. In line with its name, it’s fast: it is written in C++ and can easily handle tens of millions of images. It is simple to run with a single Python command. It runs in your cloud account or on premises (no need to upload your data to a vendor’s cloud account!). Finally, it is designed to be accurate and to output meaningful insights.
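To make the deduplication problem concrete, here is a minimal sketch that uses only the Python standard library. It is not fastdup and does not reflect how fastdup works. It only finds byte-for-byte identical files by hashing their contents, so resized or re-encoded copies will slip through. The function name `find_exact_duplicates` and the default list of extensions are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(image_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Group files under image_dir whose bytes are identical.

    Illustrative only: exact content hashing, not fastdup's method.
    """
    groups = defaultdict(list)
    # Walk the directory tree in sorted order so results are deterministic.
    for path in sorted(Path(image_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only hashes shared by two or more files.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("path/to/images")` returns a list of groups, each holding the paths of files with identical contents. Reading every file fully into memory is fine for a small folder, but a collection of tens of millions of images is where a purpose-built tool becomes necessary.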
If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.