Using machine learning to improve dialog flow in conversational applications

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Alan Nichol on building a suite of open source tools for chatbot developers.

In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers continue to refine and define tools for building conversational applications. We spoke about the current state of chatbots, specifically about the types of applications developers are building today and how he sees conversational applications evolving in the near future.

As I described in a recent post, workflow automation will happen in stages. With that in mind, chatbots and intelligent assistants are bound to improve as underlying algorithms, technologies, and training data get better.

Here are some highlights from our conversation:

Chatbots and state machines

The first component is what we call natural language understanding, which typically means taking a short message that a user sends and extracting some meaning from it, which means turning it into structured data. In the case we talked about regarding the SQL database, if somebody asks, for example, ‘What was my ROI on my Facebook campaigns last month?’, the first thing you want to understand is that this is a data question, and you want to assign it a label identifying it as such: they’re not saying hello, or goodbye, or thank you, but asking a specific question. Then you want to pick out those fields to help you create a query.
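To make "turning it into structured data" concrete, here is a minimal sketch in Python. It is not Rasa's API or its approach: the keyword tables, the slot names (metric, channel, period), and the parse function are all hypothetical, and a real natural language understanding component would learn these mappings from labeled examples rather than hard-code them.

```python
import re

# Hypothetical keyword tables, for illustration only. A real NLU component
# would be trained on labeled example messages.
INTENT_KEYWORDS = {
    "greet": {"hello", "hi", "hey"},
    "goodbye": {"bye", "goodbye"},
    "thank": {"thanks", "thank"},
    "data_question": {"roi", "revenue", "spend", "clicks"},
}
ENTITY_VALUES = {
    "metric": ("roi", "revenue", "spend", "clicks"),
    "channel": ("facebook", "twitter", "google"),
    "period": ("last month", "last week", "yesterday"),
}


def parse(message):
    """Turn a short user message into an intent label plus extracted fields."""
    text = message.lower()
    tokens = set(re.findall(r"[a-z]+", text))

    # Intent: the first label whose keywords appear among the message tokens.
    intent = "unknown"
    for name, keywords in INTENT_KEYWORDS.items():
        if tokens & keywords:
            intent = name
            break

    # Entities: the first known value found in the text for each slot.
    entities = {}
    for slot, values in ENTITY_VALUES.items():
        for value in values:
            if value in text:
                entities[slot] = value
                break

    return {"intent": intent, "entities": entities}


result = parse("What was my ROI on my Facebook campaigns last month?")
print(result)
```

The intent label separates a data question from a greeting or a thank-you, and the extracted fields are what a later step could use to build the query.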

… The second piece is, how do you actually know what to do next? How do you build a system that can hold a conversation that is coherent? What you realize very quickly is that it’s not enough to have one input always matched to the same output. For example, if you ask somebody a yes or no question and they say, ‘yes,’ the next thing to do, of course, depends on what the original question was.

… Real conversations aren’t stateless; they have some context and they need to pay attention to the history. So, the way developers do that is build a state machine. Which means, for example, that you have a bot that can do some different things. It can talk about flights; it can talk about hotels. Then you define different states for when the person is still searching, or for when they are comparing different things, or for when they finish a booking. And then you have to define rules for how to behave for every input, for every possible state.
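The state machine described here can be sketched as a lookup table keyed on the current state and the user's intent. The state names, intent names, and action names below are hypothetical and chosen only to mirror the flights and hotels example; this is an illustration of the general technique, not code from any particular framework.

```python
# Hypothetical states, intents, and actions for a small travel bot.
# Each rule maps (current state, user intent) to (next state, bot action).
TRANSITIONS = {
    ("searching", "search_flights"): ("comparing", "show_flight_options"),
    ("searching", "search_hotels"): ("comparing", "show_hotel_options"),
    ("comparing", "affirm"): ("booked", "confirm_booking"),
    ("comparing", "deny"): ("searching", "ask_new_search"),
    ("booked", "affirm"): ("searching", "ask_new_search"),
    ("booked", "deny"): ("booked", "say_goodbye"),
}


def step(state, intent):
    """Return (next_state, action); stay in place with a fallback if no rule matches."""
    return TRANSITIONS.get((state, intent), (state, "fallback_ask_rephrase"))


# The same user input ("affirm", i.e. "yes") leads to different actions
# depending on the state the conversation is in.
print(step("comparing", "affirm"))
print(step("booked", "affirm"))
# An input nobody wrote a rule for falls through to the fallback.
print(step("comparing", "ask_why_price"))
```

Every (state, intent) pair the bot should handle needs its own entry, so the table grows with the number of states multiplied by the number of intents.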

Beyond state machines

The problem is that [the state machine] approach works for building your first version, but it really restricts you to what we call “the happy paths,” which is where the user is compliant and cooperative and does everything you ask them to do. But in typical cases, you ask a person, “Do you like option A, or option B?” Then you probably build the path for the person saying A, and you build a path for the person saying B. But then you give it to real users, and they say, “No, I don’t like either of those.” Or they ask a question like, “Why is A so much more expensive than B?” Or, “Let me get back to you about that.”

… [State machines] don’t scale, that’s the problem. If you’re a developer and somebody has a conversation with your bot and you realize that it did the wrong thing, now you have to go look back at your (literally) thousands or tens of thousands of rules to figure out which one clashed and which one did the wrong thing. You figure out where to inject one more rule to handle one more edge case, and that just doesn’t scale at all.

… With our dialogue library Rasa Core, we give the user the ability to talk to the bot and provide feedback. So, in Rasa, the whole flow of dialogue is also controlled with machine learning. And it’s learned from real sample conversations. You talk to the system and if it does something wrong, you provide feedback and it corrects itself. So, you explore the space of possible conversations interactively yourself, and then your users do as well.
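The idea of learning dialogue flow from sample conversations and correcting it with feedback can be illustrated with a toy policy. To be clear about assumptions: this is not Rasa Core's API or its learning algorithm. The LearnedPolicy class, its frequency-counting approach, and all intent and action names are hypothetical stand-ins, kept deliberately simple to show the loop of train, predict, and correct.

```python
from collections import Counter, defaultdict


class LearnedPolicy:
    """Toy dialogue policy: picks the action seen most often after a given context.

    A frequency-count stand-in for a learned model, for illustration only.
    """

    def __init__(self, history_len=2):
        self.history_len = history_len
        self.counts = defaultdict(Counter)

    def _context(self, history):
        # The context is the last few events (bot actions and user intents).
        return tuple(history[-self.history_len:])

    def train(self, conversations):
        # Each conversation is a list of (user intent, bot action) turns.
        for turns in conversations:
            history = []
            for intent, action in turns:
                history.append(intent)
                self.counts[self._context(history)][action] += 1
                history.append(action)

    def predict(self, history):
        seen = self.counts.get(self._context(history))
        return seen.most_common(1)[0][0] if seen else "fallback"

    def correct(self, history, right_action):
        # Feedback: make the corrected action the top choice for this context.
        seen = self.counts[self._context(history)]
        seen[right_action] = max(seen.values(), default=0) + 1


policy = LearnedPolicy()
policy.train([
    [("greet", "utter_greet"), ("ask_options", "offer_a_or_b"), ("choose_a", "confirm_a")],
    [("greet", "utter_greet"), ("ask_options", "offer_a_or_b"), ("choose_b", "confirm_b")],
])

# A user rejects both options, which no sample conversation covered.
history = ["greet", "utter_greet", "ask_options", "offer_a_or_b", "reject_both"]
before = policy.predict(history)

# The developer supplies the right action as feedback; no rule table is edited.
policy.correct(history, "offer_more_options")
after = policy.predict(history)
print(before, "->", after)
```

In this sketch, handling a new situation means adding one corrected example rather than locating and patching a conflicting rule.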

Related resources: