“It’s time for big data scientists to become social scientists, not just computer scientists.” – Justin Grimmer
FREE Report: Data Engineering Survey Results
We examine the changing landscape of data engineering challenges, tools, and opportunities. This report is based on a global online survey that drew 372 respondents.
Data Exchange podcast
- Making Large Language Models Smarter: Yoav Shoham is co-founder of AI21 Labs, creators of the largest language model available to developers. He is also a Professor Emeritus of Computer Science at Stanford University, and creator of The AI Index.
- Modernizing Data Integration: Michel Tricot is co-founder and CEO of Airbyte, the startup behind the popular open source project of the same name. Despite being a relatively new project, Airbyte has been gaining traction among data and platform engineers who are responsible for building and maintaining integrated data systems.
- The Road to AI Begins With Data Quality: Jeremy Stanley is co-founder and CTO of Anomalo, a startup building SaaS tools to help companies with data quality. Prior to Anomalo, Jeremy was VP of Data Science at Instacart.

Recommendations
- Resurgence of Conversational AI: Kenn So of Shasta Ventures and I discuss how Large Language Models ease the development of useful chatbots.
- Top Places to Work for Data Engineers: Fall has historically been a time when many people evaluate job opportunities. This is the second post in a series; the first listed great places to work for data scientists.
- Retrieval-based NLP: A class of models that “search” for information from a corpus to exhibit knowledge, while using the representational strength of language models. Researchers at Stanford recently developed retrieval-based NLP models that delivered impressive results on a variety of Q&A benchmarks.
- The Third Generation of Production ML Architectures: A good historical overview by Waleed Kadous of Anyscale.
- Databricks SQL sets a new world record in 100TB TPC-DS: A question I get asked frequently is whether an open architecture built on a lakehouse can provide the performance, speed, and cost advantages of traditional data warehouses. This result proves beyond any doubt that the lakehouse architecture is capable of achieving this.