Machine learning on encrypted data

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Alon Kaufman on the interplay between machine learning, encryption, and security. In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics…

Simplifying machine learning lifecycle management

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Harish Doddi on accelerating the path from prototype to production. In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As…

We need to build machine learning tools to augment machine learning engineers

We need to build machine learning tools to augment our machine learning engineers. In this post, I share slides and notes from a talk I gave in December 2017 at the Strata Data Conference in Singapore offering suggestions to companies that are actively deploying products infused with machine learning capabilities. Over the past few years, …

How machine learning will accelerate data management systems

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Tim Kraska on why ML will change how we build core algorithms and data structures. In this episode of the Data Show, I spoke with Tim Kraska, associate professor of computer science at MIT. To take advantage of big data, …

Machine learning at Spotify: You are what you stream

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Christine Hung on using data to drive digital transformation and recommenders that increase user engagement. In this episode of the Data Show, I spoke with Christine Hung, head of data solutions at Spotify. Prior to joining Spotify, she led data…

Building a natural language processing library for Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: David Talby on a new NLP library for Spark, and why model development starts after a model gets deployed to production. When I first discovered and started using Apache Spark, a majority of the use cases I used it for…

How companies can navigate the age of machine learning

To become a “machine learning company,” you need tools and processes to overcome challenges in data, engineering, and models. Over the last few years, the data community has focused on gathering and collecting data, building infrastructure for that purpose, and using data to improve decision-making. We are now seeing a surge in interest in advanced…

The state of machine learning in Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Ion Stoica and Matei Zaharia explore the rich ecosystem of analytic tools around Apache Spark. In this episode of the Data Show, we look back to a recent conversation I had at the Spark Summit in San Francisco with Ion…

The current state of applied data science

[A version of this post appears on the O’Reilly Radar.] Recent trends in practical use and a discussion of key bottlenecks in supervised machine learning. As we enter the latter part of 2017, it’s time to take a look at the common challenges faced by companies interested in using data science and machine learning (ML).

A framework for building and evaluating data products

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Pinterest data scientist Grace Huang on lessons learned in the course of machine learning product launches. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, …