Building a next-generation platform for deep learning

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Naveen Rao on emerging hardware and software infrastructure for AI.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I speak with Naveen Rao, VP and GM of the Artificial Intelligence Products Group at Intel. In an earlier episode, we learned that scaling current deep learning models requires innovations in both software and hardware. Through his startup Nervana (since acquired by Intel), Rao has been at the forefront of building a next-generation platform for deep learning and AI.

I wanted to get his thoughts on what the future infrastructure for machine learning would look like. At least for now, we’re seeing a variety of approaches, and many companies are using heterogeneous processors (even specialized ones) and proprietary interconnects for deep learning. Nvidia and Intel Nervana are set to release processors that excel at both training and inference, but as Rao pointed out, at large scale there are many considerations—including utilization, power consumption, and convenience—that come into play.

Here is a partial list of the items we discussed:

  • Deep learning in comparison to other machine learning algorithms
  • Key features and the current status of Intel Nervana’s Lake Crest technology
  • Deep learning frameworks and related software tools, including Nervana Graph
  • Building next-generation hardware and software components for deep learning
  • An overview of the major AI initiatives within Intel (including the establishment of a new AI Research Lab that Rao is leading)


Language understanding remains one of AI’s grand challenges

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: David Ferrucci on the evolution of AI systems for language understanding.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with David Ferrucci, founder of Elemental Cognition and senior technologist at Bridgewater Associates. Ferrucci served as principal investigator of IBM’s DeepQA project and led the Watson team that became champion of the Jeopardy! quiz show. Elemental Cognition (EC) is a research group focused on building an AI system that will be equipped with state-of-the-art natural language understanding technologies. Ferrucci envisions that EC will ship with foundational knowledge in many subject areas, but will be able to very quickly acquire knowledge in other (specialized) domains with the help of “human mentors.”

Since Ferrucci has built and deployed several prominent AI systems over the years, I also wanted to get his perspective on the evolution of AI technologies, and how enterprises can take advantage of all the exciting recent developments.

Here are some highlights from our conversation:
Continue reading “Language understanding remains one of AI’s grand challenges”

Natural language analysis using Hierarchical Temporal Memory

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Francisco Webber on building HTM-based enterprise applications.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Francisco Webber, founder of Cortical.io, a startup that is applying tools based on Hierarchical Temporal Memory (HTM) to natural language understanding. While HTM has been around for more than a decade, there aren’t many companies that have released products based on it (at least compared to other machine learning methods). Numenta, an organization developing open source machine intelligence based on the biology of the neocortex, maintains a community site featuring showcase applications. Webber’s company has been building tools based on HTM and applying them to big text data in a variety of industries; financial services has been a particularly strong vertical for Cortical.

Here are some highlights from our conversation:
Continue reading “Natural language analysis using Hierarchical Temporal Memory”

How big compute is powering the deep learning rocketship

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Greg Diamos on building computer systems for deep learning and AI.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

Specialists describe deep learning as akin to a rocketship that needs a really big engine (a model) and a lot of fuel (the data) in order to go anywhere interesting. To get a better understanding of the issues involved in building compute systems for deep learning, I spoke with one of the foremost experts on this subject: Greg Diamos, senior researcher at Baidu. Diamos has long worked to combine advances in software and hardware to make computers run faster. In recent years, he has focused on scaling deep learning to help advance the state-of-the-art in areas like speech recognition.

A big model, combined with big data, necessitates big compute—and at least at the bleeding edge of AI, researchers have gravitated toward high-performance computing (HPC) or supercomputer-like systems. Most practitioners use systems with multiple GPUs (or other accelerators, such as ASICs or FPGAs) and software libraries that make it easy to run fast deep learning models on top of them.
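To make that concrete, here is a minimal sketch of the data-parallel pattern those software libraries typically provide. It uses PyTorch purely as an illustration (the episode doesn’t name a particular framework): each batch is split across whatever GPUs are visible, and the code falls back to the CPU when none are available.

```python
import torch
import torch.nn as nn

# A tiny feed-forward model standing in for the "big engine."
model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Wrap the model so each forward pass splits the batch across all
# visible GPUs and gathers the results on the primary device.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# One training step on a synthetic batch; a real pipeline would stream
# data from a loader and scale the batch size with the device count.
inputs = torch.randn(256, 1024, device=device)
targets = torch.randint(0, 10, (256,), device=device)

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

Scaling past a single node is where the HPC-style interconnects and synchronization strategies mentioned above start to dominate the engineering effort.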

In keeping with the convenience versus performance tradeoff discussions that play out in many enterprises, there are other efforts that fall more in the big data, rather than HPC, camp. In upcoming posts, I’ll highlight groups of engineers and data scientists who are starting to use these techniques and are building software that runs them on the hardware and software infrastructure already common in the big data community.

Continue reading “How big compute is powering the deep learning rocketship”

Introducing model-based thinking into AI systems

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Vikash Mansinghka on recent developments in probabilistic programming.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Vikash Mansinghka, a research scientist at MIT, where he leads the Probabilistic Computing Project, and a co-founder of Empirical Systems. I’ve long wanted to introduce listeners to recent developments in probabilistic programming, and I found the perfect guide in Mansinghka.

Probability is the mathematical language for representing, modeling, and manipulating uncertainty, and probabilistic programming provides frameworks for representing probabilistic models as computer programs. This family of tools and techniques separates models from inference procedures, and in the process, encourages the kind of model-based thinking that may inform the design of future artificial intelligence systems and supplement current data- and compute-intensive systems that rely primarily on large-scale pattern recognition.
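As a toy illustration of that separation, here is a minimal sketch in plain Python with NumPy. It isn’t drawn from any particular probabilistic programming system (and not from the Probabilistic Computing Project’s tools); it simply shows the idea: the model is an ordinary program specifying a prior and a likelihood, while the inference routine (importance sampling here) is generic and knows nothing about the model’s subject matter.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Model: an ordinary program encoding a prior and a likelihood ---
def sample_prior():
    # Latent bias of a coin, uniform a priori (clipped away from 0 and 1).
    return rng.uniform(1e-6, 1.0 - 1e-6)

def log_likelihood(bias, flips):
    # Probability of the observed flips given a candidate bias.
    heads = flips.sum()
    tails = flips.size - heads
    return heads * np.log(bias) + tails * np.log(1.0 - bias)

# --- Inference: generic importance sampling over any such model ------
def posterior_mean(sample_prior, log_likelihood, data, n_samples=50_000):
    draws = np.array([sample_prior() for _ in range(n_samples)])
    log_w = np.array([log_likelihood(d, data) for d in draws])
    weights = np.exp(log_w - log_w.max())
    weights /= weights.sum()
    return float(np.dot(weights, draws))

# Observed data: 70 heads out of 100 flips; the posterior mean is ~0.70.
flips = np.concatenate([np.ones(70, dtype=bool), np.zeros(30, dtype=bool)])
print(f"estimated coin bias: {posterior_mean(sample_prior, log_likelihood, flips):.3f}")
```

A full probabilistic programming system automates exactly this split: the modeler writes the generative program, and the runtime supplies inference procedures (importance sampling, MCMC, variational methods, and so on).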

Below are highlights from my conversation with Mansinghka:
Continue reading “Introducing model-based thinking into AI systems”

The technology behind self-driving vehicles

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Shaoshan Liu on perception, knowledge, reasoning, and planning for autonomous cars.

Shaoshan Liu takes a deep dive into this topic in his recent post “Creating autonomous vehicle systems.”

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

Ask a random person for an example of an AI system and chances are he or she will name self-driving vehicles. In this episode of the O’Reilly Data Show, I sat down with Shaoshan Liu, co-founder of PerceptIn and previously the senior architect (autonomous driving) at Baidu USA. We talked about the technology behind self-driving vehicles, their reliance on rule-based decision engines, and deploying large-scale deep learning systems.

Here are some highlights from our conversation:
Continue reading “The technology behind self-driving vehicles”