Trends and Opportunities in Time Series
Time series is an area that has given rise to publicly traded companies, a variety of open source tools, and startups that have collectively raised over a billion dollars. The global market for time series analysis software is expected to grow at a compound annual rate of 11.5% from 2020 to 2027. Yet despite their ubiquity and importance, time series data lack the cachet of other data types: the most discussed advances in machine learning and AI in recent years involve text (large language models), visual data (computer vision), audio (speech technologies), or their combinations (DALL·E).
In a recent post with Ira Cohen of Anodot, we describe the landscape of time series tools and compile ideas for how time series solutions can gain more users and increase their impact. The post is an overview of tools that are already transforming companies.
Data Exchange podcast
- An open source, production-grade vector search engine: Bob van Luijt is CEO of SeMI Technologies, the company behind the popular vector search engine Weaviate. He describes Weaviate’s key features and core components, popular use cases, and its near-term roadmap. We also discuss how vector search engines compare with existing data management systems.
- A comprehensive suite of open source tools for time series modeling: Federico Garza and Max Canseco are co-founders of Nixtla, a startup building developer-friendly software that helps data scientists deploy predictive pipelines. Their libraries have been a great addition to my toolbox.
- Combining data and knowledge for AI applications: Christopher Nguyen is CEO and co-founder of Aitomatic, a startup that uses a knowledge-first approach to build and deploy machine learning solutions. We discuss the unique challenges and opportunities in combining human & machine intelligence.
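The core operation behind a vector search engine like Weaviate is nearest-neighbor retrieval over embedding vectors. The sketch below is not Weaviate’s implementation or API — production engines use approximate indexes rather than exhaustive scans — but it illustrates the retrieval primitive with a brute-force cosine-similarity search over a toy corpus (all names and vectors are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, documents, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = sorted(
        documents.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

# Toy corpus: document ids mapped to (pretend) embedding vectors.
docs = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], docs, k=2))  # ['doc-a', 'doc-b']
```

A real engine replaces the linear scan with an approximate nearest-neighbor index so queries stay fast at millions of vectors, and layers storage, filtering, and replication on top.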
Data & Machine Learning Tools and Infrastructure
Data anti-gravity with Skyplane, a tool for blazingly fast bulk data transfers in the cloud. Skyplane is an open source system for fast, low-cost transfers between cloud object stores, designed to copy a large dataset in minutes rather than hours.
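Skyplane’s speed comes from techniques such as striping a transfer across many parallel connections (its full design goes well beyond this). As a rough illustration of that one ingredient — not Skyplane’s code — here is a parallel bulk copy using Python’s standard library, with in-memory dicts standing in for object stores:

```python
from concurrent.futures import ThreadPoolExecutor

def copy_chunk(source, dest, key):
    """Copy one object from the source store to the destination store."""
    dest[key] = source[key]
    return key

def parallel_transfer(source, dest, workers=8):
    """Copy every object concurrently -- the basic idea behind striping
    a bulk transfer across many parallel connections."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda k: copy_chunk(source, dest, k), source))
    return dest

# In-memory dicts stand in for cloud buckets in this sketch.
bucket_a = {f"part-{i:04d}": bytes(1024) for i in range(100)}
bucket_b = parallel_transfer(bucket_a, {})
print(len(bucket_b))  # 100
```

In a real cross-cloud transfer the bottlenecks are network paths and egress costs, which is why a purpose-built tool matters far more than raw parallelism alone.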
What Do NLP Researchers Believe? A plurality of the nearly 500 respondents believe that the most influential advances over the next ten years will come from problem formulation and task design, rather than from hardware and data scaling.
Large Image Datasets are a Mess. A short presentation from the co-creator of fastdup, a tool for analyzing large image collections. With fastdup, you can find anomalies, duplicate and near-duplicate images, and clusters of similar images, and trace interactions between images over time.
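To see why tooling like this is needed, consider the simplest baseline: finding byte-identical duplicates by content hashing. The sketch below (not fastdup’s method — fastdup also catches near-duplicates, which hashing alone cannot) uses only the standard library, with small byte strings standing in for image files:

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """Group blobs by content hash; any group with more than one
    entry is a set of exact (byte-identical) duplicates."""
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Byte strings stand in for image files in this sketch.
images = {
    "cat_1.jpg": b"\xff\xd8cat-bytes",
    "cat_copy.jpg": b"\xff\xd8cat-bytes",  # byte-identical duplicate
    "dog.jpg": b"\xff\xd8dog-bytes",
}
print(find_duplicates(images))  # [['cat_1.jpg', 'cat_copy.jpg']]
```

Hashing misses a re-encoded or slightly cropped copy of the same photo; catching those requires similarity over image content, which is exactly the harder problem tools like fastdup address.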
K1st World is next week! Researchers and companies will explore lessons in combining data, machine learning, and knowledge for life-critical applications. Use the discount code GRADIENTFLOW60 to attend in person or online.
Stream Processing Index
The ability to make informed decisions quickly and to exploit enormous volumes of incoming data is a crucial competitive advantage. Alas, real-time analytics and decision intelligence are difficult to pull off because the requirements of streaming data differ from those of batch or event-based processing applications. In a new post with Jesse Anderson of the Big Data Institute, we compare stream processing solutions using metrics that measure popularity. We believe that companies that master streaming will gain a decisive edge over the next few years.
If you enjoyed this newsletter please support our work by encouraging your friends and colleagues to subscribe: