Companies in China are moving quickly to embrace AI technologies

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Jason Dai on the first year of BigDL and AI in China. In this episode of the Data Show, I spoke with Jason Dai, CTO of Big Data Technologies at Intel, and one of my co-chairs for the AI Conference…

Building a natural language processing library for Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: David Talby on a new NLP library for Spark, and why model development starts after a model gets deployed to production. When I first discovered and started using Apache Spark, a majority of the use cases I used it for…

The state of machine learning in Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Ion Stoica and Matei Zaharia explore the rich ecosystem of analytic tools around Apache Spark. In this episode of the Data Show, we look back to a recent conversation I had at the Spark Summit in San Francisco with Ion…

Scaling machine learning

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Reza Zadeh on deep learning, hardware/software interfaces, and why computer vision is so exciting. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, …

Deep learning for Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Jason Dai on BigDL, a library for deep learning on existing data frameworks.

Building the next-generation big data analytics stack

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Michael Franklin on the lasting legacy of AMPLab. In this episode I…

Why businesses should pay attention to deep learning

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Christopher Nguyen on the early days of Apache Spark, deep learning for time-series and transactional data, innovation in China, and AI.

Data architectures for streaming applications

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Dean Wampler on streaming data applications, Scala and Spark, and cloud computing.

Structured streaming comes to Apache Spark 2.0

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Michael Armbrust on enabling users to perform streaming analytics, without having to reason about streaming.

Using Apache Spark to predict attack vectors among billions of users and trillions of events

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Fang Yu on data science in security, unsupervised learning, and Apache Spark. In this episode of…