Natural language analysis using Hierarchical Temporal Memory

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Francisco Webber on building HTM-based enterprise applications.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Francisco Webber, founder of Cortical.io, a startup that is applying tools based on Hierarchical Temporal Memory (HTM) to natural language understanding. While HTM has been around for more than a decade, there aren’t many companies that have released products based on it (at least compared to other machine learning methods). Numenta, an organization developing open source machine intelligence based on the biology of the neocortex, maintains a community site featuring showcase applications. Webber’s company has been building tools based on HTM and applying them to big text data in a variety of industries; financial services has been a particularly strong vertical for Cortical.

Here are some highlights from our conversation:
Continue reading

Deep learning that’s easy to implement and easy to scale

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Anima Anandkumar on MXNet, tensor computations and deep learning, and techniques for scaling algorithms.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Anima Anandkumar, a leading machine learning researcher, and currently a principal research scientist at Amazon. I took the opportunity to get an update on the latest developments on the use of tensors in machine learning. Most of our conversation centered around MXNet—an open source, efficient, scalable deep learning framework. I’ve been a fan of MXNet dating back to when it was a research project out of CMU and UW, and I wanted to hear Anandkumar’s perspective on its recent progress as a framework for enterprises and practicing data scientists.

Here are some highlights from our conversation:
Continue reading

Deep learning for Apache Spark

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Jason Dai on BigDL, a library for deep learning on existing data frameworks.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Jason Dai, CTO of big data technologies at Intel, and co-chair of Strata + Hadoop World Beijing. Dai and his team are prolific and longstanding contributors to the Apache Spark project. Their early contributions to Spark tended to be on the systems side and included Netty-based shuffle, a fair-scheduler, and the “yarn-client” mode. Recently, they have been contributing tools for advanced analytics. In partnership with major cloud providers in China, they’ve written implementations of algorithmic building blocks and machine learning models that let Apache Spark users scale to extremely high-dimensional models and large data sets. They achieve scalability by taking advantage of things like data sparsity and Intel’s MKL software. Along the way, they’ve gained valuable experience and insight into how companies deploy machine learning models in real-world applications.

When I predicted that 2017 would be the year when the big data and data science communities start exploring techniques like deep learning in earnest, I was relying on conversations with many members of those communities. I also knew that Dai and his team were at work on a distributed deep learning library for Apache Spark. This evolution from basic infrastructure, to machine learning applications, and now applications backed by deep learning models is to be expected.

Once you have a platform and a team that can deploy machine learning models, it’s natural to begin exploring deep learning. As I’ve highlighted in recent episodes of this podcast (here and here), companies are beginning to apply deep learning to time-series data, event data, text, and images. Many of these same companies have already invested in big data technologies (many of which are open source) and employ data scientists and data engineers who are comfortable with these tools.
Continue reading

The key to building deep learning solutions for large enterprises

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Adam Gibson on the importance of ROI, integration, and the JVM.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

As data scientists add deep learning to their arsenals, they need tools that integrate with existing platforms and frameworks. This is particularly important for those who work in large enterprises. In this episode of the Data Show, I spoke with Adam Gibson, co-founder and CTO of Skymind, and co-creator of Deeplearning4J (DL4J). Gibson has spent the last few years developing the DL4J library and community, while simultaneously building deep learning solutions and products for large enterprises.

Here are some highlights:

Continue reading

How big compute is powering the deep learning rocketship

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Greg Diamos on building computer systems for deep learning and AI.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

Specialists describe deep learning as akin to a rocketship that needs a really big engine (a model) and a lot of fuel (the data) in order to go anywhere interesting. To get a better understanding of the issues involved in building compute systems for deep learning, I spoke with one of the foremost experts on this subject: Greg Diamos, senior researcher at Baidu. Diamos has long worked to combine advances in software and hardware to make computers run faster. In recent years, he has focused on scaling deep learning to help advance the state-of-the-art in areas like speech recognition.

A big model, combined with big data, necessitates big compute—and at least at the bleeding edge of AI, researchers have gravitated toward high-performance computing (HPC) or supercomputer-like systems. Most practitioners use systems with multiple GPUs (ASICs or FPGAs) and software libraries that make it easy to run fast deep learning models on top of them.

In keeping with the convenience versus performance tradeoff discussions that play out in many enterprises, there are other efforts that fall more in the big data, rather than HPC, camp. In upcoming posts, I’ll highlight groups of engineers and data scientists who are starting to use these techniques and are creating software to run them on existing software and hardware infrastructure common in the big data community.

Continue reading

2017 will be the year the data science and big data community engage with AI technologies

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: A look at some trends we’re watching in 2017.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

This episode consists of excerpts from a recent talk I gave at a conference commemorating the end of the UC Berkeley AMPLab project. This section pertained to some recent trends in Data and AI. For a complete list of trends we’re watching in 2017, as well as regular doses of highly curated resources, subscribe to our Data and AI newsletters.

As 2016 draws to a close, I see the big data and data science community beginning to engage with AI-related technologies, particularly deep learning. By early next year, there will be new tools that specifically cater to data scientists and data engineers who aren’t necessarily experts in these techniques. While the AI research community continues to tackle fundamental problems, these new sets of tools will make some recent breakthroughs in AI much more accessible and convenient to use for the data community.

Related resources:

Introducing model-based thinking into AI systems

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Vikash Mansinghka on recent developments in probabilistic programming.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode I spoke with Vikash Mansinghka, research scientist at MIT, where he leads the Probabilistic Computing Project, and co-founder of Empirical Systems. I’ve long wanted to introduce listeners to recent developments in probabilistic programming, and I found the perfect guide in Mansinghka.

Probability is the mathematical language to represent, model, and manipulate uncertainty, and probabilistic programming provides frameworks for representing probabilistic models as computer programs. This family of tools and techniques distinguishes between models and the inference procedures, and in the process, encourages the kind of model-based thinking that may inform the design of future artificial intelligence systems and supplement current data and compute-intensive systems that rely primarily on large-scale pattern recognition.

Below are highlights from my conversation with Mansinghka:
Continue reading