Holistic Evaluation of Language Models

Stanford researchers develop tools to help understand language models in their totality. As general-purpose models become more prevalent and important, there’s a growing need for tools that help developers select which models are appropriate for their use case and, more importantly, understand the limitations of these models. As someone who uses these…

Taking time series modeling and stream processing mainstream

Trends and Opportunities in Time Series: Time series is an area that has given rise to publicly traded companies, a variety of open source tools, and startups that have collectively raised over a billion dollars. The global market for time series analysis software is expected to grow at a compound annual rate of…

The Stream Processing Index

Measuring the popularity of different stream processing tools. By Jesse Anderson and Ben Lorica. Streaming data is one of the most important areas in information technology today. As a result, entrepreneurs have collectively raised more than $1.1 billion for stream processing startups. The ability to make informed decisions quickly and unlock enormous amounts…

Trends and Opportunities in Time Series

Time series tools are transforming companies. We highlight areas that need to be addressed to enhance their effectiveness. By Ira Cohen and Ben Lorica. Time series and temporal data are everywhere. Most of the data that companies collect from users, sensors, and machines comes with a date/time stamp. Time series data are used for reports…

Data and AI job markets are slowing down

Hot technology job market is losing its sizzle: If you have the right skills, you’re probably still getting inundated with messages from recruiters. But beware that the overall job market is softening. We’re seeing large year-over-year percentage declines across a variety of data and AI keyword searches. Exceptions include “data governance” (essentially flat),…

Here’s what we need to do to fix AutoML

Seven suggestions to enhance the effectiveness of AutoML solutions. By Assaf Araki and Ben Lorica. A recent McKinsey survey reported that more than 56% of respondents have implemented at least one AI function, up from 50% in 2020. As AI adoption increases, the survey examined the factors and practices that differentiate the best…

Embed Retrieve Win

The Vector Database Index: If you work with text or images, chances are embeddings are already a key part of your machine learning and analytic pipelines. Embeddings are low-dimensional spaces into which higher-dimensional vectors can be mapped. They can represent many kinds of data, whether a piece of text, an image or…

The Vector Database Index

Measuring the popularity of different vector databases. By Ben Lorica and Leo Meyerovich. Introduction: Vector databases and vector search are on the radar of a growing number of technical teams. A key driver is that advances in neural networks have made dense vector representations of data more common. Interest has also grown due to the…

Speech Data Processing Takes Flight

Unlocking speech and audio data with new open source tools: Interest in neural networks and deep learning can be traced back to groundbreaking results in computer vision (2012) and speech recognition (2011). The number of companies working on computer vision applications is increasing, but the number of companies working on audio data is…

New open source tools to unlock speech and audio data

Introducing Lhotse, a Python library for handling speech data. By Piotr Żelasko, Jan Vainer, Tomáš Nekvinda, and Ben Lorica. Introduction: Of the many voice applications for AI, speech recognition is the most widely known and is deployed as a building block of voice assistants. The voice and speech recognition market alone is expected to grow from $9.4…