Where Do Machine Learning Engineers Work?
About five years ago we published a post that highlighted the emergence of a role focused on making data science work in production. At the time we noticed job postings (mainly in the SF Bay Area) that used the title “machine learning engineer” to describe individuals skilled at making data products and machine learning models work in production.
To understand the current state of the “machine learning engineer” role, I turned to Diffbot, the largest publicly available knowledge graph of the world wide web. I examine which companies and industries employ the most machine learning engineers, which geographic regions they are located in, and which skills individuals and employers are highlighting. This is a data-rich post in which I also compare machine learning engineers to more established roles (data scientist, data engineer, cloud engineer).
Data Exchange podcast
- Machine Learning at Discord: Gaurav Chakravorty is a Senior Manager at Discord, where he leads the team responsible for machine learning models for search and notification. They use many interesting ML techniques at scale, including graph neural networks, privacy-preserving machine learning, and large language models.
- Applications of Knowledge Graphs: Mike Tung is CEO of Diffbot, a startup that crawls the entire web to produce one of my favorite new tools: the most comprehensive knowledge graph of the world wide web.
- Democratizing NLP: My recent conversation with Moshe Wasserblat, Senior Principal Engineer at Intel, where he serves as a Research Manager focused on NLP and Deep Learning. Moshe and his team have recently been building tools to help analysts and other non-expert NLP users fine-tune, test, and customize NLP models for their specific domains and use cases.

Data & Machine Learning Tools and Infrastructure
- The Databricks Lakehouse Platform: A must-watch CIDR keynote by Databricks co-founder and Stanford University faculty member, Matei Zaharia.
- Distributed deep learning with Ray Train is now in Beta: a much-needed library that simplifies distributed training for several popular frameworks.
- Continued Efficiency Improvements for ML: This list of trends from Google Research includes the observation that advances in computer hardware design, coupled with advances in models and algorithms, mean model training efficiency will improve “by a significant multiplicative factor”. Also see my conversation with MIT’s Neil Thompson on the Computational Limits of Deep Learning.