Speech Data Processing Takes Flight

Unlocking speech and audio data with new open source tools

Interest in neural networks and deep learning can be traced back to groundbreaking results in speech recognition (2011) and computer vision (2012). The number of companies working on computer vision applications keeps growing, but far fewer companies work with audio data, even though many speech models and services are available.

A major reason is that audio data has historically been difficult to work with. Audio comes in many storage and compression formats, some lossless and some lossy, and different formats may require different codecs to read. A single recording can also contain multiple channels.
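
To make these properties concrete, here is a minimal sketch that uses only Python's standard-library wave module, which handles uncompressed WAV files. It writes a short stereo tone to a temporary file, then reads back the metadata (channel count, sample width, sample rate) that a pipeline typically has to check before processing. The helper names write_stereo_tone, describe_wav, and demo are hypothetical and exist only for this illustration. Compressed formats such as MP3 or FLAC need other libraries.

```python
import math
import os
import struct
import tempfile
import wave


def write_stereo_tone(path, n_frames=1600, rate=16000, freq=440.0):
    """Write a short 16-bit stereo sine tone to `path` as an uncompressed WAV file."""
    frames = bytearray()
    for i in range(n_frames):
        sample = int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate))
        # One frame holds one sample per channel (left, then right).
        frames += struct.pack("<hh", sample, sample)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)      # stereo
        wav.setsampwidth(2)      # bytes per sample, i.e. 16-bit audio
        wav.setframerate(rate)   # samples per second, per channel
        wav.writeframes(bytes(frames))


def describe_wav(path):
    """Return the basic metadata stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def demo():
    """Write a temporary stereo file, describe it, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        write_stereo_tone(path)
        return describe_wav(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Even for this simplest case, the reader has to know the channel count and sample width to interpret the raw bytes correctly, which hints at why mixed collections of audio files have been awkward to handle.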

Unbeknownst to many data and AI teams, things are simpler today. In a new post with researchers from Meaning, we describe an ecosystem of open source projects that vastly simplify audio data processing and pipelining. These projects allow data scientists, developers, and machine learning engineers who are comfortable with Python to start incorporating audio data into their models.

Data Exchange podcast

  • Confidential Computing for Machine Learning: Sadegh Riazi is CEO and co-founder of CipherMode Labs, a startup building tools that enable data and AI teams to build and deploy models directly on encrypted data. They recently introduced CipherCore, an exciting new open source, high-performance library that makes confidential computing accessible to data teams.
  • Applied NLP Research: I discuss the process of translating ML research into products with John Bohannon, Senior Director of Data Science and Head of Research at Primer AI. Our discussion topics included zero-shot entity recognition, MLOps, synthetic text, and automatic summarization.
  • Using SQL to Retrieve Data from APIs and Web Services: Jon Udell is community lead for Steampipe, an open source component that data scientists and ML engineers can insert into their data pipelines. It makes it easy to extract data from APIs and load it into structures like data frames, where they tend to do most of their work.

2022 NLP Summit

The NLP Summit is the world’s largest applied NLP community. As co-chair, I’m excited to announce that we have another outstanding slate of presentations, including many real-world use cases and applications, updates on major open source projects, and cutting-edge research being conducted at Google Brain, Hugging Face, and OpenAI. If you work with NLP and text, you need to attend this FREE online conference.

What does it mean to build trust into AI?

In my recent Twitter Spaces conversation with Andrew Burt (Managing Partner at BNH.ai) and Bob Friday (Chief AI Officer at Juniper), we dig into that and more.


