How AI and machine learning are improving customer experience

[A version of this post appears on the O’Reilly Radar.]

From data quality to personalization, to customer acquisition and retention, and beyond, AI and ML will shape the customer experience of the future.

By Ben Lorica and Mike Loukides.

What can artificial intelligence (AI) and machine learning (ML) do to improve customer experience? AI and ML have been intimately involved in online shopping since, well, the beginning of online shopping. You can’t use Amazon or any other shopping service without getting recommendations, which are often personalized based on the vendor’s understanding of your traits: your purchase history, your browsing history, and possibly much more. Amazon and other online businesses would love to invent a digital version of the (possibly mythical) salesperson who knows you and your tastes, and can unerringly guide you to products you will enjoy.

Everything begins with better data

To make that vision a reality, we need to start with some heavy lifting on the back end. Who are your customers? Do you really know who they are? All customers leave behind a data trail, but that data trail is a series of fragments, and it’s hard to relate those fragments to each other. If one customer has multiple accounts, can you tell? If a customer has separate accounts for business and personal use, can you link them? And if an organization uses many different names (we remember a presentation in which someone talked of the hundreds of names—literally—that resolved to IBM), can you discover the single organization responsible for all of them? Customer experience starts with knowing exactly who your customers are and how they’re related. Scrubbing your customer lists to eliminate duplicates is called entity resolution; it used to be the domain of large companies that could afford substantial data teams. Entity resolution is now being democratized: there are startups providing entity resolution software and services appropriate for small to mid-sized organizations.
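
To make “entity resolution” concrete, here is a toy sketch, not any particular vendor’s product: normalize account names, then greedily cluster records whose normalized names are close string matches. The sample records and the 0.85 similarity threshold are illustrative assumptions; production systems draw on far richer signals (addresses, transaction histories, known aliases), which is how they can link names as different as “I.B.M.” and “International Business Machines”.

```python
# Toy entity resolution: cluster records whose normalized names are similar.
from difflib import SequenceMatcher

records = [
    "International Business Machines",
    "IBM Corp.",
    "I.B.M.",
    "Acme Widgets LLC",
    "ACME Widgets",
]

def normalize(name: str) -> str:
    # Lowercase, strip punctuation, and drop common corporate suffixes.
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(w for w in cleaned.split() if w not in {"corp", "inc", "llc"})

def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

# Greedy single-pass clustering: each record joins the first cluster whose
# representative it matches, otherwise it starts a new cluster.
clusters = []
for record in records:
    for cluster in clusters:
        if is_match(record, cluster[0]):
            cluster.append(record)
            break
    else:
        clusters.append([record])

print(clusters)
# Note that pure string similarity links "IBM Corp." with "I.B.M." but not
# with "International Business Machines" -- hence the richer signals above.
```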

Once you’ve found out who your customers are, you have to ask how well you know them. Getting a holistic view of a customer’s activities is central to understanding their needs. What data do you have about them, and how do you use it? ML and AI are now being used as tools in data gathering: to process the data streams that come from sensors, apps, and other sources. Gathering customer data can be intrusive and ethically questionable; as you build your understanding of your customers, make sure you have their consent and that you aren’t compromising their privacy.
Continue reading “How AI and machine learning are improving customer experience”

It’s time for data scientists to collaborate with researchers in other disciplines

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Forough Poursabzi-Sangdeh on the interdisciplinary nature of interpretable and interactive machine learning.

In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas: fairness and bias, safety and reliability, security and privacy, and Poursabzi’s area of focus—explainability and interpretability.

We had a great conversation spanning many topics, including:

  • Current best practices and state-of-the-art methods used to explain or interpret deep learning—or, more generally, machine learning—models (one such method is sketched after this list).
  • The limitations of current model interpretability methods.
  • The lack of clear, standard metrics for comparing the different approaches used for model interpretability.
  • Why Poursabzi believes it’s important for data scientists to work closely with researchers in other disciplines, given that many current AI and machine learning applications augment humans.
  • The importance of using human subjects in model interpretability studies.
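
The episode discusses methods rather than code, but to ground “interpretability method” in something runnable, here is a minimal sketch of one widely used technique, permutation feature importance: shuffle one feature at a time and measure how much the model’s held-out accuracy drops. The dataset and model below are illustrative assumptions, not anything from the episode.

```python
# Permutation feature importance: features whose shuffling hurts held-out
# accuracy the most are the ones the model relies on most heavily.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline = model.score(X_test, y_test)

rng = np.random.default_rng(0)
drops = []
for j in range(X_test.shape[1]):
    X_perm = X_test.copy()
    X_perm[:, j] = X_perm[rng.permutation(len(X_perm)), j]  # break feature j
    drops.append(baseline - model.score(X_perm, y_test))

# Report the five most important features by accuracy drop.
for j in np.argsort(drops)[::-1][:5]:
    print(f"feature {j:2d}: accuracy drop {drops[j]:+.4f}")
```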

Algorithms are shaping our lives – here’s how we wrest back control

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Kartik Hosanagar on the growing power and sophistication of algorithms.

In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, A Human’s Guide to Machine Intelligence, an interesting tour through the recent evolution of AI applications, which draws from his extensive experience at the intersection of business and technology.

We had a great conversation spanning many topics, including:

  • The types of unanticipated consequences of which algorithm designers should be aware.
  • The predictability-resilience paradox: as systems become more intelligent and dynamic, they also become more unpredictable, so there are trade-offs algorithm designers must face.
  • Managing risk in machine learning: AI application designers need to weigh considerations such as fairness, security, privacy, explainability, safety, and reliability.
  • A bill of rights for humans impacted by the growing power and sophistication of algorithms.
  • Some best practices for bringing AI into the enterprise.


You created a machine learning application. Now make sure it’s secure.

[A version of this post appears on the O’Reilly Radar.]

The software industry has demonstrated, all too clearly, what happens when you don’t pay attention to security.

By Ben Lorica and Mike Loukides.

In a recent post, we described what it would take to build a sustainable machine learning practice. By “sustainable,” we mean projects that aren’t just proofs of concept or experiments. A sustainable practice means projects that are integral to an organization’s mission: projects by which an organization lives or dies. These projects are built by a stable team of engineers and supported by a management team that understands what machine learning is, why it’s important, and what it’s capable of accomplishing. Finally, sustainable machine learning means that as many aspects of product development as possible are automated: not just building models, but cleaning data, building and managing data pipelines, testing, and much more. Machine learning will penetrate our organizations so deeply that it won’t be possible for humans to manage these systems unassisted.

Organizations throughout the world are waking up to the fact that security is essential to their software projects. Nobody wants to be the next Sony, the next Anthem, or the next Equifax. But while we know how to make traditional software more secure (even though we frequently don’t), machine learning presents a new set of problems. Any sustainable machine learning practice must address machine learning’s unique security issues. We didn’t do that for traditional software, and we’re paying the price now. Nobody wants to pay the price again. If we learn one thing from traditional software’s approach to security, it’s that we need to be ahead of the curve, not behind it. As Joanna Bryson writes, “Cyber security and AI are inseparable.”
Continue reading “You created a machine learning application. Now make sure it’s secure.”

Why your attention is like a piece of contested territory

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: P.W. Singer on how social media has changed war, politics, and business.

In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book is essential reading for anyone interested in how social media has become an important new battlefield in a diverse set of domains and settings.

We had a great conversation spanning many topics, including:

  • In light of the 10th anniversary of his earlier book Wired for War, we talked about progress in robotics over the past decade.
  • The challenge posed by the fact that social networks reward virality, not veracity.
  • How the internet has emerged as an important new battlefield.
  • How this new online battlefield changes how conflicts are fought and unfold.
  • How many of the ideas and techniques covered in LikeWar are trickling down from nation-state actors influencing global events to consulting companies offering services that companies and individuals can use.

Continue reading “Why your attention is like a piece of contested territory”

How AI can help to prevent the spread of disinformation

[This post originally appeared on Information Age.]

Our industry has a duty to discuss the dark side of technology. Yet many organisations — including some that wield enormous power and influence — are reluctant to acknowledge that their platforms are used to spread disinformation, foster hatred, facilitate bullying, and much else that makes our world a worse place in which to live.

Disinformation — what is sometimes called “fake news” — is a prime example of the unintended consequences of new technology. Its purpose is purely to create discord; it poisons public discourse and feeds festering hatreds with a litany of lies. What makes disinformation so effective is that it exploits characteristics of human nature such as confirmation bias, then seizes on the smallest seed of doubt and amplifies it with untruths and obfuscation.

Disinformation has spawned a new sub-industry within journalism, with fact checkers working around the clock to analyse politicians’ speeches, articles and news reports from other publications, and government statistics, among much else. But the sheer volume of disinformation, together with its ability to multiply and mutate like a virus across a variety of social platforms, means that thorough fact-checking is possible for only a tiny proportion of disputed articles.
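
As an entirely illustrative sketch (nothing like this appears in the post), here is the kind of machine-assisted triage that could stretch scarce fact-checking capacity: a simple text classifier that scores incoming claims so human fact-checkers can prioritise the most suspect ones first. The tiny labeled dataset below is made up purely for illustration.

```python
# Toy triage classifier: rank incoming claims by estimated risk so that
# limited human fact-checking effort goes to the highest-risk items first.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "miracle cure doctors don't want you to know about",
    "shocking secret the government is hiding from you",
    "city council approves budget for road repairs",
    "quarterly report shows revenue in line with forecasts",
]
train_labels = [1, 1, 0, 0]  # 1 = likely disinformation, 0 = likely legitimate

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

queue = ["secret miracle cure revealed", "council schedules budget meeting"]
scores = model.predict_proba(queue)[:, 1]
for text, score in sorted(zip(queue, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {text}")
```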
Continue reading “How AI can help to prevent the spread of disinformation”

The evolution and expanding utility of Ray

[A version of this post appears on the O’Reilly Radar.]

The framework has a growing number of users and contributors, as well as libraries for reinforcement learning, AutoML, and data science.

In a recent post, I listed some of the early use cases described in the first meetup dedicated to Ray—a distributed programming framework from UC Berkeley’s RISE Lab. A second meetup took place a few months later, and both events featured some of the first applications built with Ray. On the development front, the core API has stabilized and a lot of work has gone into improving Ray’s performance and stability. The project now has around 5,700 stars on GitHub and more than 100 contributors across many organizations.

At this stage of the project, how does one describe Ray to those who aren’t familiar with it? The RISE Lab team describes Ray as a “general framework for programming your cluster or cloud.” To place the project into context: on a spectrum of flexibility, Ray and cloud functions (FaaS, serverless) currently sit somewhere in the middle, between extremely flexible systems on one end and much more targeted systems that emphasize ease of use on the other. More precisely, users today can choose between extremely flexible cluster management and virtualization tools on one end (Docker, Kubernetes, Mesos, etc.) and domain-specific systems on the other end of the flexibility spectrum (Spark, Kafka, Flink, PyTorch, TensorFlow, Redshift, etc.).
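
For readers who haven’t seen Ray code, here is a minimal sketch of the core task API as of recent releases: decorate an ordinary Python function to turn it into a remote task, fan out invocations across workers, and gather the results via futures.

```python
# Minimal Ray task API: parallelize a plain Python function across workers.
import ray

ray.init()  # starts a local Ray runtime, or connects to an existing cluster

@ray.remote
def square(x):
    return x * x

# .remote() returns futures immediately; the tasks execute in parallel.
futures = [square.remote(i) for i in range(8)]
print(ray.get(futures))  # -> [0, 1, 4, 9, 16, 25, 36, 49]

ray.shutdown()
```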
Continue reading “The evolution and expanding utility of Ray”