Using machine learning to improve dialog flow in conversational applications

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Alan Nichol on building a suite of open source tools for chatbot developers.

In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers continue to refine and define tools for building conversational applications. We spoke about the current state of chatbots, specifically about the types of applications developers are building today and how he sees conversational applications evolving in the near future.

As I described in a recent post, workflow automation will happen in stages. With that in mind, chatbots and intelligent assistants are bound to improve as underlying algorithms, technologies, and training data get better.

Here are some highlights from our conversation:

Chatbots and state machines

The first component is what we call natural language understanding, which typically means taking a short message that a user sends and extracting some meaning from it, turning it into structured data. In the case we talked about regarding the SQL database, if somebody asks, for example, ‘What was my ROI on my Facebook campaigns last month?’, the first thing you want to understand is that this is a data question, so you want to assign it a label that captures what the person is asking: they’re not saying hello, or goodbye, or thank you, but asking a specific question. Then you want to pick out those fields to help you create a query.
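The step Nichol describes is essentially a mapping from free text to an intent plus a handful of entities, which application code can then turn into a query. Here is a minimal, hypothetical Python sketch of that idea; the intent name, entity fields, and table schema below are invented for illustration and are not Rasa’s API.

```python
# Hypothetical illustration: the structured output an NLU layer might produce
# for "What was my ROI on my Facebook campaigns last month?", and how an
# application could turn it into a SQL query. All names and the schema are
# made up for this sketch.

parsed = {
    "intent": "data_question",
    "entities": {
        "metric": "roi",
        "channel": "facebook",
        "time_range": "last_month",
    },
}

def to_sql(parsed):
    """Map the intent/entity structure onto a SQL query string."""
    if parsed["intent"] != "data_question":
        raise ValueError("not a data question; route to small talk instead")
    e = parsed["entities"]
    return (
        "SELECT SUM(revenue) / SUM(spend) AS roi "
        "FROM campaigns "
        f"WHERE channel = '{e['channel']}' AND period = '{e['time_range']}'"
    )

print(to_sql(parsed))
```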
Continue reading “Using machine learning to improve dialog flow in conversational applications”

Building accessible tools for large-scale computation and machine learning

[A version of this post appears on the O’Reilly Radar.]

In this episode of the Data Show, I spoke with Eric Jonas, a postdoc in the new Berkeley Center for Computational Imaging. Jonas is also affiliated with UC Berkeley’s RISE Lab. It was at a RISE Lab event that he first announced Pywren, a framework that lets data enthusiasts proficient with Python run existing code at massive scale on Amazon Web Services. Jonas and his collaborators are working on a related project, NumPyWren, a system for linear algebra built on a serverless architecture. Their hope is that by lowering the barrier to large-scale (scientific) computation, we will see many more experiments and research projects from communities that have been unable to easily marshal massive compute resources. We talked about Bayesian machine learning, scientific computation, reinforcement learning, and his stint as an entrepreneur in the enterprise software space.

Here are some highlights from our conversation:

Pywren

The real enabling technology for us was when Amazon announced the availability of AWS Lambda, their microservices framework, in 2014. Prompted by that announcement, I went home one weekend and thought, ‘I wonder how hard it is to take an arbitrary Python function and marshal it across the wire, get it running in Lambda; I wonder how many I can get at once?’ Thus, Pywren was born.
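For readers who have not tried Pywren, the basic pattern from the project’s published examples looks roughly like the sketch below: hand an ordinary Python function to an executor and let it fan the calls out across Lambda invocations. Exact names and signatures may differ across Pywren versions, so treat this as illustrative.

```python
# Sketch of the Pywren pattern described above: run an ordinary Python
# function at scale on AWS Lambda. Based on the project's early examples;
# the API may differ in later versions.
import pywren

def add_one(x):
    return x + 1

executor = pywren.default_executor()          # uses your AWS credentials/config
futures = executor.map(add_one, range(1000))  # fan the calls out to Lambda
results = [f.result() for f in futures]       # block until all results are back
print(results[:10])
```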
Continue reading “Building accessible tools for large-scale computation and machine learning”

Notes from the first Ray meetup

[A version of this post appears on the O’Reilly Radar.]

Ray is beginning to be used to power large-scale, real-time AI applications.

Machine learning adoption is accelerating due to the growing number of large labeled data sets, languages aimed at data scientists (R, Julia, Python), frameworks (scikit-learn, PyTorch, TensorFlow, etc.), and tools for building infrastructure to support end-to-end applications. While some interesting applications of unsupervised learning are beginning to emerge, many current machine learning applications rely on supervised learning. In a recent series of posts, Ben Recht makes the case for why some of the most interesting problems might actually fall under reinforcement learning (RL), specifically systems that are able to act based upon past data and do so in a way that is safe, robust, and reliable.

But first we need RL tools that are accessible to practitioners. Unlike supervised learning, RL has until recently lacked a good open source tool for easily trying it out at scale. I think things are about to change. I was fortunate enough to receive an invite to the first meetup devoted to Ray, RISE Lab’s high-performance, distributed execution engine, which targets emerging AI applications, including those that rely on reinforcement learning. This was a small, invite-only affair held at OpenAI, and most of the attendees were interested in reinforcement learning.

Here’s a brief rundown of the program:

  • Robert Nishihara and Philipp Moritz gave a brief overview and update on the Ray project, including a description of items on the near-term roadmap.
  • Eric Liang and Richard Liaw gave a quick tour of two libraries built on top of Ray: RLlib (scalable reinforcement learning) and Tune (a hyperparameter optimization framework). They also pointed to a recent ICML paper on RLlib. Both of these libraries are easily accessible to anyone familiar with Python, and both should prove popular among industrial data scientists (a short code sketch showing the two together follows this list).

RLlib and reinforcement learning. Image courtesy of RISE Lab.

  • Eugene Vinitsky showed some amazing videos of how Ray is helping his team understand and predict traffic patterns in real time, and in the process helping researchers study large transportation networks. The videos were some of the best examples of the combination of IoT, sensor networks, and reinforcement learning that I’ve seen.
  • Alex Bao of Ant Financial described three applications they’ve identified for Ray. I’m not sure I’m allowed to describe them here, but they were all very interesting and important use cases. The most important takeaway of the evening was that Ant Financial is already using Ray in production in two of the three use cases (and they are close to deploying Ray to production for the third)! Given that Ant Financial is the largest unicorn company in the world, this is amazing validation for Ray.
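To give a sense of how accessible RLlib and Tune are from Python, here is a minimal sketch that uses Tune to launch an RLlib PPO experiment on CartPole with a small learning-rate sweep. It follows the API style of early Ray releases; names and arguments vary across versions, so treat it as a sketch rather than a definitive recipe.

```python
# Minimal sketch: use Tune to run RLlib's PPO on CartPole with a small
# hyperparameter sweep. API style from early Ray releases; details vary
# by version.
import ray
from ray import tune

ray.init()

tune.run_experiments({
    "cartpole-ppo": {
        "run": "PPO",                  # RLlib algorithm to train
        "env": "CartPole-v0",          # Gym environment
        "stop": {"episode_reward_mean": 195},
        "config": {
            "num_workers": 2,          # parallel rollout workers
            "lr": tune.grid_search([0.01, 0.001]),  # Tune sweeps this value
        },
    },
})
```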

With the buzz generated by the evening’s presentations and early examples of production deployments beginning to happen, I expect meetups on Ray to start springing up in other geographic areas. We are still in the early stages of adoption of machine learning technologies. The presentations at this meetup confirm that an accessible and scalable platform like Ray opens up many possible applications of reinforcement learning and online learning.


Specialized hardware for deep learning will unleash innovation

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Andrew Feldman on why deep learning is ushering in a golden age for compute architecture.

In this episode of the Data Show, I spoke with Andrew Feldman, founder and CEO of Cerebras Systems, a startup in the blossoming area of specialized hardware for machine learning. Since the release of AlexNet in 2012, we have seen an explosion in activity in machine learning, particularly in deep learning. A lot of the work to date happened primarily on general purpose hardware (CPU, GPU). But now that we’re six years into the resurgence in interest in machine learning and AI, these new workloads have attracted technologists and entrepreneurs who are building specialized hardware for both model training and inference, in the data center or on edge devices.

In fact, companies with enough volume have already begun building specialized processors for machine learning. But you have to either use specific cloud computing platforms or work at specific companies to have access to such hardware. A new wave of startups (including Cerebras) will make specialized hardware affordable and broadly available. Over the next 12 to 24 months, architects and engineers will need to revisit their infrastructure and decide between general purpose and specialized hardware, and between cloud and on-premises gear.

In light of the training duration and cost they face using current (general purpose) hardware, some experiments might be hard to justify. Upcoming specialized hardware will enable data scientists to try out ideas that they previously would have hesitated to pursue. This will surely lead to more research papers and interesting products as data scientists are able to run many more experiments (on even bigger models) and iterate faster.

Since Feldman is the founder of one of the most anticipated hardware startups in the deep learning space, I wanted to get his views on the challenges and opportunities faced by engineers and entrepreneurs building hardware for machine learning workloads.

Here are some highlights from our conversation:
Continue reading “Specialized hardware for deep learning will unleash innovation”

Data collection and data markets in the age of privacy and machine learning

[A version of this post appears on the O’Reilly Radar.]

While models and algorithms garner most of the media coverage, this is a great time to be thinking about building data tools.

In this post I share slides and notes from a keynote I gave at the Strata Data Conference in London at the end of May. My goal was to remind the data community about the many interesting opportunities and challenges in data itself. Much of the focus of recent press coverage has been on algorithms and models, specifically the expanding utility of deep learning. Because large deep learning architectures are quite data hungry, the importance of data has grown even more. In this short talk, I describe some interesting trends in how data is valued, collected, and shared.

Economic value of data

It’s no secret that companies place a lot of value on data and the data pipelines that produce key features. In the early phases of adopting machine learning (ML), companies focus on making sure they have a sufficient amount of labeled (training) data for the applications they want to tackle. They then investigate additional data sources they can use to augment their existing data. In fact, among many practitioners, data remains more valuable than models (many talk openly about what models they use, but are reticent to discuss the features they feed into those models).

But if data is precious, how do we go about estimating its value? For those of us who build machine learning models, one way to estimate the value of data is to examine the cost of acquiring training data:
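As a back-of-the-envelope illustration of that approach (the counts and prices below are hypothetical, not figures from the talk), one crude lower bound is simply what it would cost to have the training set labeled again:

```python
# Hypothetical back-of-the-envelope estimate: replacement cost of a labeled
# data set as a crude lower bound on its value. All numbers are made up.
num_examples = 500_000        # labeled examples in the training set
cost_per_label = 0.08         # dollars per label (e.g., via crowdsourcing)
labels_per_example = 3        # redundant labels for quality control
qa_overhead = 1.25            # review, gold questions, rework

acquisition_cost = num_examples * cost_per_label * labels_per_example * qa_overhead
print(f"Estimated replacement cost: ${acquisition_cost:,.0f}")
# This is only a floor; the strategic value of proprietary data is usually higher.
```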
Continue reading “Data collection and data markets in the age of privacy and machine learning”

What machine learning means for software development

[A version of this post appears on the O’Reilly Radar.]

“Human in the loop” software development will be a big part of the future.

By Ben Lorica and Mike Loukides

Machine learning is poised to change the nature of software development in fundamental ways, perhaps for the first time since the invention of FORTRAN and LISP. It presents the first real challenge to our decades-old paradigms for programming. What will these changes mean for the millions of people who are now practicing software development? Will we see job losses and layoffs, or will we see programming evolve into something different—perhaps even something more focused on satisfying users?

We’ve built software more or less the same way since the 1970s. We’ve had high-level languages, low-level languages, scripting languages, and tools for building and testing software, but what those tools let us do hasn’t changed much. Our languages and tools are much better than they were 50 years ago, but they’re essentially the same. We still have editors. They’re fancier: they have color highlighting, name completion, and they can sometimes help with tasks like refactoring, but they’re still the descendants of emacs and vi. Object orientation represents a different programming style, rather than anything fundamentally new—and, of course, functional programming goes all the way back to the 50s (except we didn’t know it was called that). Can we do better?

We will focus on machine learning rather than artificial intelligence. Machine learning has been called “the part of AI that works,” but more important, the label “machine learning” steers clear of notions like general intelligence. We’re not discussing systems that can find a problem to be solved, design a solution, and implement that solution on their own. Such systems don’t exist, and may never exist. Humans are needed for that. Machine learning may be little more than pattern recognition, but we’ve already seen that pattern recognition can accomplish a lot. Indeed, hand-coded pattern recognition is at the heart of our current toolset: that’s really all a modern optimizing compiler is doing.
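To make the compiler analogy concrete, here is a toy, hand-coded ‘peephole optimizer’: a few fixed patterns matched and rewritten over a made-up assembly listing. It is only an illustration of hand-coded pattern recognition in our tooling, not a claim about how any production compiler is implemented.

```python
import re

# Toy peephole optimizer: hand-coded pattern recognition over a made-up
# assembly listing. Each rule matches a fixed pattern and rewrites it.
PATTERNS = [
    (re.compile(r"mul (\w+), 2\b"), r"shl \1, 1"),  # strength reduction: x * 2 -> x << 1
    (re.compile(r"add (\w+), 0\b"), "nop"),         # adding zero does nothing
]

def peephole(lines):
    out = []
    for line in lines:
        for pattern, replacement in PATTERNS:
            line = pattern.sub(replacement, line)
        out.append(line)
    return out

print(peephole(["mul r1, 2", "add r2, 0", "mov r3, r1"]))
# -> ['shl r1, 1', 'nop', 'mov r3, r1']
```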

We also need to set expectations. McKinsey estimates that “fewer than 5% of occupations can be entirely automated using current technology. However, about 60% of occupations could have 30% or more of their constituent activities automated.” Software development and data science aren’t going to be among the occupations that are completely automated. But good software developers have always sought to automate tedious, repetitive tasks; that’s what computers are for. It should be no surprise that software development itself will increasingly be automated.
Continue reading “What machine learning means for software development”

Data regulations and privacy discussions are still in the early stages

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Aurélie Pols on GDPR, ethics, and ePrivacy.

In this episode of the Data Show, I spoke with Aurélie Pols of Mind Your Privacy, one of my go-to resources when it comes to data privacy and data ethics. This interview took place at Strata Data London, a couple of days before the EU General Data Protection Regulation (GDPR) took effect. I wanted her perspective on this landmark regulation, as well as her take on trends in data privacy and growing interest in ethics among data professionals.

Here are some highlights from our conversation:

GDPR is just the starting point

GDPR is not an end point. It’s a starting point for a journey where a balance between companies, society, and users of data needs to be redefined. Because when I look at my children, I look at how they use technology, I look at how smart my house might become or my car or my fridge, I know that in the long run this idea of giving consent to my fridge to share data is not totally viable. What are we going to build for the next generations?
Continue reading “Data regulations and privacy discussions are still in the early stages”