Using machine learning to improve dialog flow in conversational applications

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Alan Nichol on building a suite of open source tools for chatbot developers.

In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers continue to refine and define tools for building conversational applications. We spoke about the current state of chatbots, specifically about the types of applications developers are building today and how he sees conversational applications evolving in the near future.

As I described in a recent post, workflow automation will happen in stages. With that in mind, chatbots and intelligent assistants are bound to improveas underlying algorithms, technologies, and training data get better.

Here are some highlights from our conversation:

Chatbots and state machines

The first component is what we call natural language understanding, which typically means taking a short message that a user sends and extracting some meaning from it, which means turning it into structured data. In the case we talked about regarding the SQL database, if somebody asks, for example, ‘What was my ROI on my Facebook campaigns last month?’, the first thing you want to understand is that this is a data question and you want to assign it a label identifying it as a person, and they’re not saying hello, or goodbye, or thank you, but asking a specific question. Then you want to pick out those fields to help you create a query.
Continue reading “Using machine learning to improve dialog flow in conversational applications”

Building a natural language processing library for Apache Spark

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: David Talby on a new NLP library for Spark, and why model development starts after a model gets deployed to production.

When I first discovered and started using Apache Spark, a majority of the use cases I used it for involved unstructured text. The absence of libraries meant rolling my own NLP utilities, and, in many cases, implementing a machine learning library (this was pre deep learning, and MLlib was much smaller). I’d always wondered why no one bothered to create an NLP library for Spark when many people were using Spark to process large amounts of text. The recent, early success of BigDL confirms that users like the option of having native libraries.

In this episode of the Data Show, I spoke with David Talby of Pacific.AI, a consulting company that specializes in data science, analytics, and big data. A couple of years ago I mentioned the need for an NLP library within Spark to Talby; he not only agreed, he rounded up collaborators to build such a library. They eventually carved out time to build the newly released Spark NLP library. Judging by the reception received by BigDL and the number of Spark users faced with large-scale text processing tasks, I suspect Spark NLP will be a standard tool among Spark users.

Talby and I also discussed his work helping companies build, deploy, and monitor machine learning models. Tools and best practices for model development and deployment are just beginning to emerge—I summarized some of them in a recent post, and, in this episode, I discussed these topics with a leading practitioner.

Here are some highlights from our conversation:

The state of NLP in Spark

Here are your two choices today. Either you want to leverage all of the performance and optimization that Spark gives you, which means you want to stay basically within the JVM, and you want to use a Java-based library. In which case, you have options that include OpenNLP, which is open source, or Stanford NLP, which requires licensing in order to use in a commercial product. These are older and more academically oriented libraries. So, they have limitations in performance and what they do.
Continue reading “Building a natural language processing library for Apache Spark”

Language understanding remains one of AI’s grand challenges

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: David Ferrucci on the evolution of AI systems for language understanding.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with David Ferrucci, founder of Elemental Cognition and senior technologist at Bridgewater Associates. Ferrucci served as principal investigator of IBM’s DeepQA project and led the Watson team that became champion of the Jeopardy! quiz show. Elemental Cognition (EC) is a research group focused on building an AI system that will be equipped with state-of-the-art natural language understanding technologies. Ferrucci envisions that EC will ship with foundational knowledge in many subject areas, but will be able to very quickly acquire knowledge in other (specialized) domains with the help of “human mentors.”

Having built and deployed several prominent AI systems through the years, I also wanted to get Ferrucci’s perspective on the evolution of AI technologies, and how enterprises can take advantage of all the exciting recent developments.

Here are some highlights from our conversation:
Continue reading “Language understanding remains one of AI’s grand challenges”

Natural language analysis using Hierarchical Temporal Memory

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Francisco Webber on building HTM-based enterprise applications.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode of the Data Show, I spoke with Francisco Webber, founder of Cortical.io, a startup that is applying tools based on Hierarchical Temporal Memory (HTM) to natural language understanding. While HTM has been around for more than a decade, there aren’t many companies that have released products based on it (at least compared to other machine learning methods). Numenta, an organization developing open source machine intelligence based on the biology of the neocortex, maintains a community site featuring showcase applications. Webber’s company has been building tools based on HTM and applying them to big text data in a variety of industries; financial services has been a particularly strong vertical for Cortical.

Here are some highlights from our conversation:
Continue reading “Natural language analysis using Hierarchical Temporal Memory”

Topic Models: Past, Present, Future

[A version of this post appears on the O’Reilly Radar blog.]

The O’Reilly Data Show Podcast: David Blei, co-creator of one of the most popular tools in text mining and machine learning.

I don’t remember when I first came across topic models, but I do remember being an early proponent of them in industry. I came to appreciate how useful they were for exploring and navigating large amounts of unstructured text, and was able to use them, with some success, in consulting projects. When an MCMC algorithm came out, I even cooked up a Java program that I came to rely on (up until Mallet came along).

I recently sat down with David Blei, co-author of the seminal paper on topic models, and who remains one of the leading researchers in the field. We talked about the origins of topic models, their applications, improvements to the underlying algorithms, and his new role in training data scientists at Columbia University.

Generating features for other machine learning tasks

Blei frequently interacts with companies that use ideas from his group’s research projects. He noted that people in industry frequently use topic models for “feature generation.” The added bonus is that topic models produce features that are easy to explain and interpret:

“You might analyze a bunch of New York Times articles for example, and there’ll be an article about sports and business, and you get a representation of that article that says this is an article and it’s about sports and business. Of course, the ideas of sports and business were also discovered by the algorithm, but that representation, it turns out, is also useful for prediction. My understanding when I speak to people at different startup companies and other more established companies is that a lot of technology companies are using topic modeling to generate this representation of documents in terms of the discovered topics, and then using that representation in other algorithms for things like classification or other things.”

Continue reading “Topic Models: Past, Present, Future”